freedreno/ir3: disable conversion folding on a4xx
author Ilia Mirkin <imirkin@alum.mit.edu>
Sat, 4 Dec 2021 00:06:12 +0000 (19:06 -0500)
committer Marge Bot <emma+marge@anholt.net>
Tue, 8 Mar 2022 01:23:05 +0000 (01:23 +0000)
Experiments suggest that e.g.

add.u r0.y, hr0.x, hr0.y

will result in the summed value in both the high and low words of r0.y.
This only happens with odd registers, not even ones (r0.x works fine).
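
To make the symptom concrete, here is a minimal C sketch of the intended
result versus one possible reading of the observed one. The function names
are hypothetical, and the model assumes the sum is truncated to 16 bits and
copied into both halves of the destination; the experiments only establish
that the summed value shows up in both words.

```c
#include <stdint.h>

/* Intended semantics of the folded add: widen both 16-bit sources to
 * 32 bits, then add. */
static uint32_t add_u_expected(uint16_t a, uint16_t b)
{
   return (uint32_t)a + (uint32_t)b;
}

/* Hypothetical model of the a4xx result for an odd destination such as
 * r0.y. ASSUMPTION: the sum is truncated to 16 bits and replicated into
 * the high and low words; the exact hardware behavior is not confirmed. */
static uint32_t add_u_observed_model(uint16_t a, uint16_t b)
{
   uint32_t sum16 = (uint16_t)(a + b);
   return (sum16 << 16) | sum16;
}
```

Under this model, sources 1 and 2 would give 0x00030003 instead of 3, so
any consumer that reads the full 32-bit register sees a wrong value.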

Seen in the bit_count lowering. That lowering turns out to be unnecessary,
but the folding problem is more general than that one case.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>

src/freedreno/ir3/ir3_compiler_nir.c

index 228cfbd..0995480 100644
@@ -4545,7 +4545,9 @@ ir3_compile_shader_nir(struct ir3_compiler *compiler,
    do {
       progress = false;
 
-      progress |= IR3_PASS(ir, ir3_cf);
+      /* the folding doesn't seem to work reliably on a4xx */
+      if (ctx->compiler->gen != 4)
+         progress |= IR3_PASS(ir, ir3_cf);
       progress |= IR3_PASS(ir, ir3_cp, so);
       progress |= IR3_PASS(ir, ir3_cse);
       progress |= IR3_PASS(ir, ir3_dce, so);