To preserve padding correctly for MSL, we need to use its packed_vec
types for all vec3 types in storage buffers, not just those that are
struct members. This commit includes a complete rewrite of the
PackedVec3 transform to achieve this. The key details are as follows
(a rough sketch of the rewrite appears after the list):
* An internal `__packed_vec3<>` type was added, which corresponds to a
`type::Vector` with an additional flag to indicate that it will be
emitted as a packed vector.
* The `PackedVec3` transform replaces all vec3 types used in
host-shareable address spaces with the internal `__packed_vec3`
type. This includes vec3 types that appear as the store type of a
pointer.
* When used as an array element, these `__packed_vec3` types are
wrapped in a struct that contains a single `__packed_vec3`
member. This allows us to add an `@align()` attribute that ensures
that `array<vec3<T>>` still has the correct array element stride.
* When a `vec3<T>` appears as a struct member in the input program,
we apply the `@align()` attribute to that member to ensure that we do
not change its offset.
* Matrix types with three rows that are used in memory are replaced
with an array of columns, where each column uses a `__packed_vec3`
inside an aligned wrapper structure as above.
* Accesses to host-shareable memory that involve any of these types
invoke a "pack" or "unpack" helper function to convert them to the
equivalent type that uses `__packed_vec3` or a regular `vec3` as
required.
* The `chromium_internal_relaxed_uniform_layout` extension is used to
avoid issues where modifying a type in the uniform address space
triggers stricter layout validation rules.
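For illustration, here is a rough sketch of the rewrite for a storage
buffer holding a runtime-sized array of `vec3<f32>`. The struct and
helper names below are made up for this example, and `__packed_vec3` is
a Tint-internal type that cannot be written in user WGSL, so the actual
transform output differs in its details:

  // Input program:
  @group(0) @binding(0) var<storage, read_write> data : array<vec3<f32>>;

  fn negate_first() {
    data[0] = -data[0];
  }

  // Conceptual output of the transform (illustrative names):
  //
  //   struct tint_packed_vec3_element {
  //     // The wrapper keeps the 16-byte array element stride.
  //     @align(16) el : __packed_vec3<f32>,
  //   };
  //
  //   @group(0) @binding(0) var<storage, read_write>
  //       data : array<tint_packed_vec3_element>;
  //
  //   fn negate_first() {
  //     // Loads go through an "unpack" helper that returns a regular
  //     // vec3<f32>; stores go through a "pack" helper that converts back.
  //     data[0].el = tint_pack(-tint_unpack(data[0].el));
  //   }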
Bug: tint:1571
Fixed: tint:1837
Change-Id: Idaf2da2f5bcb2be00c85ec657edfb614186476bb
Reviewed-on: https://dawn-review.googlesource.com/c/dawn/+/121200
Reviewed-by: Ben Clayton <bclayton@google.com>
Commit-Queue: James Price <jrprice@google.com>
Kokoro: Kokoro <noreply+kokoro@google.com>
The PreservePadding transform now decomposes writes to matrices with
three rows into separate column vector writes, to avoid modifying
padding between columns.
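For example (illustrative only; the transform also routes these stores
through helper functions, which is omitted here):

  struct S {
    // Each vec3<f32> column is followed by 4 bytes of padding.
    m : mat2x3<f32>,
  };
  @group(0) @binding(0) var<storage, read_write> sb : S;

  fn write_matrix(value : mat2x3<f32>) {
    sb.m = value;
  }

  // Conceptually, the whole-matrix store above becomes one store per
  // column vector, so the padding bytes after each column are never
  // written:
  //
  //   fn write_matrix(value : mat2x3<f32>) {
  //     sb.m[0] = value[0];
  //     sb.m[1] = value[1];
  //   }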
Bug: tint:1571
Change-Id: If575f79bb87f52810783fd3338e2f3ce3228ab2e
Reviewed-on: https://dawn-review.googlesource.com/c/dawn/+/121600
Auto-Submit: James Price <jrprice@google.com>
Kokoro: Kokoro <noreply+kokoro@google.com>
Reviewed-by: Ben Clayton <bclayton@google.com>
Commit-Queue: James Price <jrprice@google.com>
Change the DecomposeMemoryAccess transform to behave more like the DirectVariableAccess transform, in that it now inlines the access to the buffer variable into the load/store helper functions instead of passing the array down.
This avoids large array copies observed with FXC, which can have *severe* performance costs.
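As a loose WGSL-level sketch of the idea (the real transform emits
internal intrinsics that map to the backend's buffer accesses, and the
buffer and helper names here are invented):

  @group(0) @binding(0) var<uniform> ubo : array<vec4<u32>, 1024>;

  // Before (conceptually): the helper received the buffer contents as a
  // parameter, which FXC could realize as a copy of the entire array at
  // each call site.
  //
  //   fn load_u32(buffer : array<vec4<u32>, 1024>, offset : u32) -> u32
  //   let x = load_u32(ubo, 16u);
  //
  // After (conceptually): the helper is generated per buffer variable
  // and accesses it directly, so no array is passed or copied.
  fn ubo_load_u32(offset : u32) -> u32 {
    let v = ubo[offset / 16u];
    return v[(offset % 16u) / 4u];
  }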
Fixed: tint:1819
Change-Id: I52eb3f908813f72ab9da446743e24a2637158309
Reviewed-on: https://dawn-review.googlesource.com/c/dawn/+/121460
Kokoro: Kokoro <noreply+kokoro@google.com>
Auto-Submit: Ben Clayton <bclayton@google.com>
Reviewed-by: James Price <jrprice@google.com>
Commit-Queue: James Price <jrprice@google.com>
Shuffle the transform order to ensure that these types are embedded in a structure before running the Std140 transform.
Add more end-to-end tests for these cases.
As pointed out in tint:1673, arrays of matrices are not correctly decomposed by the Std140 transform.
This will be addressed by a later change.
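For reference, the pattern from tint:1673 (an array of matrices in a
uniform buffer) looks something like the following; the element type
here is only illustrative:

  struct S {
    // 8-byte column stride in WGSL, but 16 bytes under std140.
    arr : array<mat4x2<f32>, 4>,
  };
  @group(0) @binding(0) var<uniform> ubo : S;

  @compute @workgroup_size(1)
  fn main() {
    // Load one column of one matrix element.
    let col = ubo.arr[2][1];
    _ = col;
  }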
Bug: tint:1673
Change-Id: I47c93e458ff48578922d576819792e8ed3a5723c
Reviewed-on: https://dawn-review.googlesource.com/c/dawn/+/102541
Kokoro: Kokoro <noreply+kokoro@google.com>
Reviewed-by: Zhaoming Jiang <zhaoming.jiang@intel.com>
Commit-Queue: Ben Clayton <bclayton@google.com>
Reviewed-by: David Neto <dneto@google.com>