[[[ms
.FP lucidasans
. \" no header
.ds CH "
.
.HTML "GPU Filesystem for Plan 9"
.TL
GPU Filesystem for Plan 9
.AU
Joel Fridolin Meyer
joel@sirjofri.de
.AI
.AB
Many modern computer systems have dedicated hardware for graphics and other floating-point-heavy computations.
GPU manufacturers and standards organizations try to standardize interfaces, often providing standard APIs and drivers for their specific hardware.
This work-in-progress paper describes a potential filesystem for dedicated GPU hardware on Plan 9.
.AE
]]]

# Graphics or not graphics?

GPU hardware has evolved from very specialized pipelines for 2D (and later 3D) rendering into more generic processing units.
In the last few years, GPU manufacturers have started to sell ``unified shaders'' that allow even more generic computing pipelines.
Hardware-accelerated AI computation and increasingly shader-based graphics pipelines¹ justify dropping native graphics processing support, at least in an initial implementation of a GPU filesystem for Plan 9.

[[[ms
.FS
¹ Epic Games' Nanite does rasterization within a shader.
They also plan to have Nanite compute the shading in software. [Nanite]
.FE
]]]

This generic use of GPUs allows us to ignore many graphics-specific parts of the APIs, as well as large parts of potential drivers.
Of course, this also makes our interface non-standard, but due to the nature of Plan 9 filesystems this should be fine.

# The implementation

Since most GPU drivers are very complex and take a long time to develop, this first implementation is fully CPU-based, using an interpreter engine to run the shaders.
The interfaces, however, should look as similar as possible to those of a potential GPU-backed implementation.

Due to the nature of filesystems, this should make it easy to ``upgrade'' applications to actual GPU hardware: Using a different GPU filesystem is all that's needed.
The software itself doesn't need to change.

Since real GPU drivers are a rarity on uncommon operating systems, this implementation approach also lets software that relies on the GPU filesystem run on systems without dedicated graphics hardware.
Obviously, there will be a huge speed difference in that case.

# The filesystem interface

This implementation provides a very simple interface that is based on existing APIs, mostly OpenGL and Vulkan.
Because of this simplicity, the interface is not set in stone and many details are still missing.
In fact, the proposed interface might not work at all with real hardware, but it can be a start.

The shader language is SPIR-V, which Vulkan uses as an intermediate language.
Shaders are loaded into the filesystem, which compiles them further down for the specific hardware so they can be executed. [SPIR-V]

Buffers are represented as files within the filesystem.
This gives us the flexibility to access the buffer contents by standard file IO.

Management is handled via a console-like interface provided by a control file.
Using this interface, it is possible to initialize new shaders and allocate new buffers, as well as to control binding and program execution.
For debugging purposes, an additional
[[[ms
.CW desc
]]]
file is used to display all shader and buffer bindings.

Shaders and buffers are represented by the general concept of ``objects''.
Each object has its own subdirectory within the GPU filesystem (1).
After initializing a new shader or buffer using the control file, the client can read back the ID of the new object.
With that, the application knows which object directory to access.

[[[ms
.DS B
.CW /dev/gpu "." "1. The GPU filesystem "
.B1
.CW
/ctl        (control file)
/desc       (descriptors file)
/0/buffer   (sample buffer file)
/0/ctl      (sample buffer ctl file)
/1/shader   (sample shader file)
/1/ctl      (sample shader ctl file)
.B2
.DE
]]]
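
As a rough sketch of that flow: the exact allocation command and the way the ID is reported are not specified yet, so both the command letters and the read-back below are assumptions, not part of the current interface.

[[[ms
.DS B
.B1
.CW
# allocate a new buffer object (command letters are assumed)
echo n b > /dev/gpu/ctl
# read back the ID of the new object (assuming the control
# file reports the ID of the last allocated object)
id=`{read </dev/gpu/ctl}
# the object directory can now be used
ls /dev/gpu/$id
.B2
.DE
]]]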

Shaders and buffers can be loaded by writing to their files; the application can also read their contents (2).

[[[ms
.DS B
2. Loading shaders and buffers.
.B1
.CW
cat myfile.spv > /dev/gpu/1/shader
cat mydata.bin > /dev/gpu/0/buffer
…
# bindings, see (4)
# compile shader and run, see (3)
…
cp /dev/gpu/0/buffer result.bin
.B2
.DE
]]]

The filesystem can't know when a shader has been loaded completely.
Because of that, it is necessary to explicitly tell it to compile the shader.
This is done by issuing the ``compile'' command on the shader control file (3).

Since a single SPIR-V program can contain multiple entry points (``OpEntryPoint''), it is necessary to specify which shader function to run (3).

[[[ms
.DS B
3. Compiling and running a shader.
.B1
.CW
echo c       > /dev/gpu/1/ctl
echo r main  > /dev/gpu/1/ctl
.B2
.DE
]]]
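
Because the run command names the entry point, one compiled module can be executed through several of its entry points without reloading it.
A small sketch follows; the second entry point name is purely hypothetical.

[[[ms
.DS B
.B1
.CW
# compile once, then run two entry points of the same module
echo c        > /dev/gpu/1/ctl
echo r main   > /dev/gpu/1/ctl
echo r reduce > /dev/gpu/1/ctl
.B2
.DE
]]]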

When executing a shader, it is essential to specify the number of workgroups.
The workgroup size, i.e. the number of shader invocations per workgroup, is defined within the shader program itself.
Vulkan uses 3D vectors to specify both the number and the size of workgroups; a buffer of 1024 elements processed with a workgroup size of (64,1,1), for example, needs (16,1,1) workgroups.
[wkgrp]

# Binding shaders and buffers

Shaders and buffers need to be bound in some way that enables shaders to access the right buffers.
It's hard to understand all the requirements of the actual hardware without diving deep into GPU architecture and existing APIs. [desc]

Our implementation provides a very simple abstraction that is based on Vulkan and its concept of ``descriptor pools'' and ``descriptor sets'' with their bindings.
Ideally, the same abstraction can be used for real GPU hardware.

Each shader is bound to a descriptor pool.
A descriptor pool can describe many descriptor sets, which in turn point to the buffers.

While shaders are bound to a full descriptor pool, buffers are bound to a single binding slot within a descriptor set.
Shaders have everything needed to access a specific buffer compiled into their code:
they know the set and binding of the buffer they want to access.

Even though this information is compiled into the shader, it is still possible to switch buffers by changing what the binding slot points to.

(4) shows how to create a new descriptor pool and set up the bindings.
In this example, buffer 0 is bound to the second (index 1) binding of the first (index 0) descriptor set of descriptor pool 0, and the shader (object 1) is bound to pool 0.

[[[ms
.DS B
4. Binding shaders and buffers.
.B1
.CW
# set up a descriptor pool with 2 descriptor sets
echo n p 2      > /dev/gpu/ctl
# allocate 4 bindings in set 0 of pool 0
echo s 0 0 4    > /dev/gpu/ctl
# bind buffer 0 to pool 0, set 0, binding 1
echo b 0 0 0 1  > /dev/gpu/ctl
# bind the shader (object 1) to pool 0
echo b 0        > /dev/gpu/1/ctl
.B2
.DE
]]]
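
Rebinding therefore lets a compiled shader work on different data without touching the shader itself.
Assuming a second buffer with object ID 2 exists (hypothetical here), pointing the same slot at it would look like this:

[[[ms
.DS B
.B1
.CW
# rebind: buffer 2 now occupies pool 0, set 0, binding 1
echo b 2 0 0 1  > /dev/gpu/ctl
.B2
.DE
]]]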

Reading the file
[[[ms
.CW desc
]]]
shows us the layout of this structure (5).
We can see that only one binding slot is in use (showing the number of the bound buffer), while the other slots are unset (showing -1).

[[[ms
.DS B
5. Example descriptor table.
.B1
.CW
DescPool 0
    Set 0
        0  -1
        1   0
        2  -1
        3  -1
    Set 1
.B2
.DE
]]]

While the
[[[ms
.CW desc
]]]
file can be parsed and interpreted, it should be noted that it's only meant for debugging and reviewing.
Applications should use the interface provided by the control files.
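
To put the pieces together, a complete run over one buffer could look like the following rc script.
It merely combines the commands from (2), (3) and (4); the object directories 0 (buffer) and 1 (shader) are assumed to exist already, as in the sample layout in (1).

[[[ms
.DS B
.B1
.CW
#!/bin/rc
# load shader code and input data
cat myshader.spv > /dev/gpu/1/shader
cat mydata.bin   > /dev/gpu/0/buffer
# descriptor pool with 2 sets, 4 bindings in set 0
echo n p 2       > /dev/gpu/ctl
echo s 0 0 4     > /dev/gpu/ctl
# bind buffer 0 and the shader
echo b 0 0 0 1   > /dev/gpu/ctl
echo b 0         > /dev/gpu/1/ctl
# compile and run the entry point
echo c           > /dev/gpu/1/ctl
echo r main      > /dev/gpu/1/ctl
# read back the result
cp /dev/gpu/0/buffer result.bin
.B2
.DE
]]]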


# State of code and future work

The code currently covers the described filesystem interface completely; however, not all functionality is implemented yet.
Furthermore, bugs are to be expected. [gpufs]

There is a rudimentary SPIR-V assembler as well as a SPIR-V disassembler.
Both are far from feature-complete with respect to the SPIR-V specification, but missing instructions can be added easily.
[[[ms
.\"I plan to build the embedded SPIR-V compiler as soon as possible, as well as the runtime engine, so we can finally run shaders and use the filesystem as intended.
]]]

To properly test the interface, it makes sense to implement the filesystem as a kernel device in Drawterm.
That way, we can use the existing GPU drivers and APIs of other operating systems.

Due to the lack of actual GPU hardware support, I don't expect much performance gain compared to other implementations of the same logic.
However, the interface is generic enough to allow applications to use different backends: GPU hardware, CPU execution (single- or multi-threaded), or GPUs accessed over the network.

It makes sense to think about future integration with devdraw: the GPU filesystem could operate on actual devdraw images and enable faster draw times for graphics rendering.

Since SPIR-V is very low-level, it also makes sense to develop shader compilers for higher-level languages like GLSL or HLSL.
Applications are developed by different people and for different reasons, so those compilers should not be part of the specific filesystem implementations.


# References

[[[ms
.nr PS -1
.nr VS -2
.IP "[Nanite]" 10
.ad l
Epic Games, ``Unreal Engine Public Roadmap: Nanite - Optimized Shading'',
.CW https://portal.productboard.com/epicgames/1-unreal-engine-
.CW public-roadmap/c/1250-nanite-optimized-shading ,
2024.
.IP "[SPIR-V]" 10
.ad l
The Khronos Group Inc., ``SPIR-V Specification'',
.CW https://registry.khronos
.CW .org/SPIR-V/specs/unified1/SPIRV.html ,
2024.
.IP "[gpufs]" 10
.ad l
Meyer, Joel, ``gpufs'' and ``spirva'',
.CW https://shithub.us/sirjofri/gpufs/HEAD/info.html
and
.CW https://shithub.us/sirjofri/spirva/HEAD/info.html ,
2024.
.IP "[desc]" 10
.ad l
Vulkan Tutorial, ``Descriptor pool and sets'',
.CW https://vulkan-tutorial.com/Uniform_buffers/
.CW Descriptor_pool_and_sets ,
2024.
.IP "[wkgrp]" 10
.ad l
OpenGL Wiki, ``Compute Shader'',
.CW https://www.khronos.org/opengl/wiki/Compute_Shader ,
2024.
]]]