ac/nir/tess: Remove jump from tess factor writes.
author Timur Kristóf <timur.kristof@gmail.com>
Mon, 15 Aug 2022 11:22:15 +0000 (13:22 +0200)
committer Marge Bot <emma+marge@anholt.net>
Tue, 11 Oct 2022 15:42:54 +0000 (15:42 +0000)
When the output patch size is <= 32, we can be sure that, regardless
of wave size, at least one invocation in each wave will take this
branch. Therefore the jump can be removed.

Fossil DB stats on Navi 21:

Totals from 1385 (1.03% of 134906) affected shaders:
CodeSize: 2664436 -> 2658896 (-0.21%)
Instrs: 488618 -> 487233 (-0.28%)
Latency: 2290157 -> 2289199 (-0.04%)
InvThroughput: 898658 -> 898364 (-0.03%)
Branches: 6554 -> 5169 (-21.13%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17921>

src/amd/common/ac_nir_lower_tess_io_to_mem.c

index e14ddc3..d3a88fa 100644
@@ -564,6 +564,13 @@ hs_emit_write_tess_factors(nir_shader *shader,
    /* Only the 1st invocation of each patch needs to do this. */
    nir_if *invocation_id_zero = nir_push_if(b, nir_ieq_imm(b, invocation_id, 0));
 
+   /* When the output patch size is <= 32 then we can flatten the branch here
+    * because we know for sure that at least 1 invocation in all waves will
+    * take the branch.
+    */
+   if (shader->info.tess.tcs_vertices_out <= 32)
+      invocation_id_zero->control = nir_selection_control_divergent_always_taken;
+
    /* The descriptor where tess factors have to be stored by the shader. */
    nir_ssa_def *tessfactor_ring = nir_load_ring_tess_factors_amd(b);
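
The reasoning behind the `tcs_vertices_out <= 32` check can be illustrated with a small standalone sketch. It is not Mesa code: the helper name is hypothetical, and it assumes that invocations are packed linearly, patch after patch, so that a lane's invocation ID is its global lane index modulo the patch size, and that every wave considered is fully occupied. Under those assumptions, any wave that is at least as wide as the patch must contain a lane with invocation ID 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of Mesa.
 *
 * Assumption: lanes are packed linearly, so
 *    invocation_id = global_lane % patch_size
 * and all num_waves waves are fully occupied.
 *
 * Returns true if every wave contains at least one lane
 * whose invocation ID is 0.
 */
static bool
every_full_wave_has_invocation_zero(unsigned patch_size,
                                    unsigned wave_size,
                                    unsigned num_waves)
{
   assert(patch_size > 0);

   for (unsigned wave = 0; wave < num_waves; wave++) {
      bool found = false;

      for (unsigned lane = 0; lane < wave_size; lane++) {
         unsigned global_lane = wave * wave_size + lane;

         if (global_lane % patch_size == 0) {
            found = true;
            break;
         }
      }

      if (!found)
         return false;
   }

   return true;
}
```

With a patch size of 24 and a wave size of 32 or 64, the helper returns true for any number of full waves. With a patch size of 64 and wave32, the second wave covers lanes 32..63 and holds no invocation 0, so the helper returns false. That second case is why the flattening is restricted to patch sizes no larger than the smallest wave size.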