268 lines
9.9 KiB
Markdown
268 lines
9.9 KiB
Markdown
# SPIR-V translation of shader input and output variables
|
|
|
|
WGSL [MR 1315](https://github.com/gpuweb/gpuweb/issues/1315) changed WGSL so
|
|
that pipeline inputs and outputs are handled similar to HLSL:
|
|
|
|
- Shader pipeline inputs are the WGSL entry point function arguments.
|
|
- Shader pipeline outputs are the WGSL entry point return value.
|
|
|
|
Note: In both cases, a struct may be used to pack multiple values together.
|
|
In that case, I/O specific attributes appear on struct members at the struct declaration.
|
|
|
|
Resource variables, e.g. buffers, samplers, and textures, are still declared
|
|
as variables at module scope.
|
|
|
|
## Vulkan SPIR-V today
|
|
|
|
SPIR-V for Vulkan models inputs and outputs as module-scope variables in
|
|
the Input and Output address spaces, respectively.
|
|
|
|
The `OpEntryPoint` instruction has a list of module-scope variables that must
|
|
be a superset of all the input and output variables that are statically
|
|
accessed in the shader call tree.
|
|
From SPIR-V 1.4 onward, all interface variables that might be statically accessed
|
|
must appear on that list.
|
|
So that includes all resource variables that might be statically accessed
|
|
by the shader call tree.
|
|
|
|
## Translation scheme for SPIR-V to WGSL
|
|
|
|
A translation scheme from SPIR-V to WGSL is as follows:
|
|
|
|
Each SPIR-V entry point maps to a set of Private variables proxying the
|
|
inputs and outputs, and two functions:
|
|
|
|
- An inner function with no arguments or return values, and whose body
|
|
is the same as the original SPIR-V entry point.
|
|
- Original input variables are mapped to pseudo-in Private variables
|
|
with the same store types, but no other attributes or properties copied.
|
|
In Vulkan, Input variables don't have initalizers.
|
|
- Original output variables are mapped to pseudo-out Private variables
|
|
with the same store types and optional initializer, but no other attributes
|
|
or properties are copied.
|
|
- A wrapper entry point function whose arguments correspond in type, location
|
|
and builtin attributes the original input variables, and whose return type is
|
|
a structure containing members correspond in type, location, and builtin
|
|
attributes to the original output variables.
|
|
The body of the wrapper function the following phases:
|
|
- Copy formal parameter values into pseudo-in variables.
|
|
- Insert a bitcast if the WGSL builtin variable has different signedness
|
|
from the SPIR-V declared type.
|
|
- Execute the inner function.
|
|
- Copy pseudo-out variables into the return structure.
|
|
- Insert a bitcast if the WGSL builtin variable has different signedness
|
|
from the SPIR-V declared type.
|
|
- Return the return structure.
|
|
|
|
- Replace uses of the the original input/output variables to the pseudo-in and
|
|
pseudo-out variables, respectively.
|
|
- Remap pointer-to-Input with pointer-to-Private
|
|
- Remap pointer-to-Output with pointer-to-Private
|
|
|
|
We are not concerned with the cost of extra copying input/output values.
|
|
First, the pipeline inputs/outputs tend to be small.
|
|
Second, we expect the backend compiler in the driver will be able to see
|
|
through the copying and optimize the result.
|
|
|
|
### Example
|
|
|
|
|
|
```glsl
|
|
#version 450
|
|
|
|
layout(location = 0) out vec4 frag_colour;
|
|
layout(location = 0) in vec4 the_colour;
|
|
|
|
void bar() {
|
|
frag_colour = the_colour;
|
|
}
|
|
|
|
void main() {
|
|
bar();
|
|
}
|
|
```
|
|
|
|
Current translation, through SPIR-V, SPIR-V reader, WGSL writer:
|
|
|
|
```groovy
|
|
@location(0) var<out> frag_colour : vec4<f32>;
|
|
@location(0) var<in> the_colour : vec4<f32>;
|
|
|
|
fn bar_() -> void {
|
|
const x_14 : vec4<f32> = the_colour;
|
|
frag_colour = x_14;
|
|
return;
|
|
}
|
|
|
|
@fragment
|
|
fn main() -> void {
|
|
bar_();
|
|
return;
|
|
}
|
|
```
|
|
|
|
Proposed translation, through SPIR-V, SPIR-V reader, WGSL writer:
|
|
|
|
```groovy
|
|
// 'in' variables are now 'private'.
|
|
var<private> frag_colour : vec4<f32>;
|
|
var<private> the_colour : vec4<f32>;
|
|
|
|
fn bar_() -> void {
|
|
// Accesses to the module-scope variables do not change.
|
|
// This is a big simplifying advantage.
|
|
const x_14 : vec4<f32> = the_colour;
|
|
frag_colour = x_14;
|
|
return;
|
|
}
|
|
|
|
fn main_inner() -> void {
|
|
bar_();
|
|
return;
|
|
}
|
|
|
|
// Declare a structure type to collect the return values.
|
|
struct main_result_type {
|
|
@location(0) frag_color : vec4<f32>;
|
|
};
|
|
|
|
@fragment
|
|
fn main(
|
|
|
|
// 'in' variables are entry point parameters
|
|
@location(0) the_color_arg : vec4<f32>
|
|
|
|
) -> main_result_type {
|
|
|
|
// Save 'in' arguments to 'private' variables.
|
|
the_color = the_color_arg;
|
|
|
|
// Initialize 'out' variables.
|
|
// Use the zero value, since no initializer was specified.
|
|
frag_color = vec4<f32>();
|
|
|
|
// Invoke the original entry point.
|
|
main_inner();
|
|
|
|
// Collect outputs into a structure and return it.
|
|
var result : main_outer_result_type;
|
|
result.frag_color = frag_color;
|
|
return result;
|
|
}
|
|
```
|
|
|
|
Alternately, we could emit the body of the original entry point at
|
|
the point of invocation.
|
|
However that is more complex because the original entry point function
|
|
may return from multiple locations, and we would like to have only
|
|
a single exit path to construct and return the result value.
|
|
|
|
### Handling fragment discard
|
|
|
|
In SPIR-V `OpKill` causes immediate termination of the shader.
|
|
Is the shader obligated to write its outputs when `OpKill` is executed?
|
|
|
|
The Vulkan fragment operations are as follows:
|
|
(see [6. Fragment operations](https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#fragops)).
|
|
|
|
* Scissor test
|
|
* Sample mask test
|
|
* Fragment shading
|
|
* Multisample coverage
|
|
* Depth bounds test
|
|
* Stencil test
|
|
* Depth test
|
|
* Sample counting
|
|
* Coverage reduction
|
|
|
|
After that, the fragment results are used to update output attachments, including
|
|
colour, depth, and stencil attachments.
|
|
|
|
Vulkan says:
|
|
|
|
> If a fragment operation results in all bits of the coverage mask being 0,
|
|
> the fragment is discarded, and no further operations are performed.
|
|
> Fragments can also be programmatically discarded in a fragment shader by executing one of
|
|
>
|
|
> OpKill.
|
|
|
|
I interpret this to mean that the outputs of a discarded fragment are ignored.
|
|
|
|
Therefore, `OpKill` does not require us to modify the basic scheme from the previous
|
|
section.
|
|
|
|
The `OpDemoteToHelperInvocationEXT`
|
|
instruction is an alternative way to throw away a fragment, but which
|
|
does not immediately terminate execution of the invocation.
|
|
It is introduced in the [`SPV_EXT_demote_to_helper_invocation](http://htmlpreview.github.io/?https://github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/EXT/SPV_EXT_demote_to_helper_invocation.html)
|
|
extension. WGSL does not have this feature, but we expect it will be introduced by a
|
|
future WGSL extension. The same analysis applies to demote-to-helper. When introduced,
|
|
it will not affect translation of pipeline outputs.
|
|
|
|
### Handling depth-replacing mode
|
|
|
|
A Vulkan fragment shader must write to the fragment depth builtin if and only if it
|
|
has a `DepthReplacing` execution mode. Otherwise behaviour is undefined.
|
|
|
|
We will ignore the case where the SPIR-V shader writes to the `FragDepth` builtin
|
|
and then discards the fragment.
|
|
This is justified because "no further operations" are performed by the pipeline
|
|
after the fragment is discarded, and that includes writing to depth output attachments.
|
|
|
|
Assuming the shader is valid, no special translation is required.
|
|
|
|
### Handling output sample mask
|
|
|
|
By the same reasoning as for depth-replacing, it is ok to incidentally not write
|
|
to the sample-mask builtin variable when the fragment is discarded.
|
|
|
|
### Handling clip distance and cull distance
|
|
|
|
Most builtin variables are scalars or vectors.
|
|
However, the `ClipDistance` and `CullDistance` builtin variables are arrays of 32-bit float values.
|
|
Each entry defines a clip half-plane (respectively cull half-plane)
|
|
A Vulkan implementation must support array sizes of up to 8 elements.
|
|
|
|
How prevalent are shaders that use these features?
|
|
These variables are supported when Vulkan features `shaderClipDistance` and `shaderCullDistance`
|
|
are supported.
|
|
According to gpuinfo.org as of this writing, those
|
|
Vulkan features appear to be nearly universally supported on Windows devices (>99%),
|
|
but by only 70% on Android.
|
|
It appears that Qualcomm devices support them, but Mali devices do not (e.g. Mali-G77).
|
|
|
|
The proposed translation scheme forces a copy of each array from private
|
|
variables into the return value of a vertex shader, or into a private
|
|
variable of a fragment shader.
|
|
In addition to the register pressure, there may be a performance degradation
|
|
due to the bulk copying of data.
|
|
|
|
We think this is an acceptable tradeoff for the gain in usability and
|
|
consistency with other pipeline inputs and outputs.
|
|
|
|
## Translation scheme for WGSL AST to SPIR-V
|
|
|
|
To translate from the WGSL AST to SPIR-V, do the following:
|
|
|
|
- Each entry point formal parameter is mapped to a SPIR-V `Input` variable.
|
|
- Struct and array inputs may have to be broken down into individual variables.
|
|
- The return of the entry point is broken down into fields, with one
|
|
`Output` variable per field.
|
|
- In the above, builtins must be separated from user attributes.
|
|
- Builtin attributes are moved to the corresponding variable.
|
|
- Location and interpolation attributes are moved to the corresponding
|
|
variables.
|
|
- This translation relies on the fact that pipeline inputs and pipeline
|
|
outputs are IO-shareable types. IO-shareable types are always storable,
|
|
and can be the store type of input/output variables.
|
|
- Input function parameters will be automatically initialized by the system
|
|
as part of setting up the pipeline inputs to the entry point.
|
|
- Replace each return statement in the entry point with a code sequence
|
|
which writes the return value components to the synthesized output variables,
|
|
and then executes an `OpReturn` (without value).
|
|
|
|
This translation is sufficient even for fragment shaders with discard.
|
|
In that case, outputs will be ignored because downstream pipeline
|
|
operations will not be performed.
|
|
This is the same rationale as for translation from SPIR-V to WGSL AST.
|