ir3: Fix vectorizer condition for SSBOs
author Connor Abbott <cwabbott0@gmail.com>
Tue, 14 Jun 2022 18:47:29 +0000 (20:47 +0200)
committer Marge Bot <emma+marge@anholt.net>
Thu, 23 Jun 2022 10:46:31 +0000 (10:46 +0000)
SSBO access works very differently from UBO access. Straddling
loads/stores isn't an issue; instead, loads/stores must be aligned to the
element size and can have up to 4 components.
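
As a rough restatement of that constraint (a paraphrase of the new check in
the diff below, not the literal committed code; the helper name
ssbo_access_ok is made up for illustration), the non-UBO side of the
callback boils down to:

  /* Sketch: a merged SSBO access is acceptable if it is at most 32 bits per
   * component, its offset is aligned to the component (element) size, and it
   * has no more than 4 components.  Parameter names follow the NIR
   * load/store-vectorizer callback. */
  static bool
  ssbo_access_ok(unsigned align_mul, unsigned align_offset,
                 unsigned bit_size, unsigned num_components)
  {
     unsigned byte_size = bit_size / 8;

     return bit_size <= 32 &&                /* no 64-bit merged accesses */
            align_mul >= byte_size &&        /* alignment covers the element */
            align_offset % byte_size == 0 && /* element-aligned offset */
            num_components <= 4;             /* at most a vec4 */
  }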

We support 16-bit SSBO access on a650+, and the vectorizer sometimes tried
to combine a 32-bit and a 16-bit access into a misaligned 32-bit access.
The UBO-focused logic didn't reject this; it now does, which fixes a number
of VK-CTS regressions on a650+.
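
A concrete, hypothetical instance of the failure mode, written against the
ssbo_access_ok() sketch above (the offsets and component count are invented
for illustration):

  /* Hypothetical merge of a 16-bit load at byte offset 2 with a 32-bit load
   * at byte offset 4: the combined range starts at offset 2, so the proposed
   * 32-bit access is only 2-byte aligned. */
  bool ok = ssbo_access_ok(/*align_mul=*/4, /*align_offset=*/2,
                           /*bit_size=*/32, /*num_components=*/2);
  /* ok == false: align_offset % (32 / 8) != 0, so the misaligned merge is
   * rejected instead of being emitted. */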

Fixes: bf49d4a084b ("freedreno/ir3: Enable load/store vectorization for SSBO access, too.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17040>

src/freedreno/ir3/ir3_nir.c

index 519a02c..009d5a8 100644
@@ -37,10 +37,17 @@ ir3_nir_should_vectorize_mem(unsigned align_mul, unsigned align_offset,
                              nir_intrinsic_instr *low,
                              nir_intrinsic_instr *high, void *data)
 {
+   unsigned byte_size = bit_size / 8;
+
+   if (low->intrinsic != nir_intrinsic_load_ubo) {
+      return bit_size <= 32 && align_mul >= byte_size &&
+         align_offset % byte_size == 0 &&
+         num_components <= 4;
+   }
+
    assert(bit_size >= 8);
    if (bit_size != 32)
       return false;
-   unsigned byte_size = bit_size / 8;
 
    int size = num_components * byte_size;
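
For context, ir3_nir_should_vectorize_mem() is the callback ir3 registers
with NIR's load/store vectorizer. A minimal sketch of that hook-up, assuming
only the modes and callback fields (the exact options ir3 passes in
ir3_nir.c, such as robustness modes and callback data, are not shown):

  /* "shader" is the nir_shader being compiled. */
  const nir_load_store_vectorize_options vectorize_opts = {
     .modes = nir_var_mem_ubo | nir_var_mem_ssbo,
     .callback = ir3_nir_should_vectorize_mem,
  };
  nir_opt_load_store_vectorize(shader, &vectorize_opts);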