ac/nir: micro-optimize boolean expression
authorRhys Perry <pendingchaos02@gmail.com>
Tue, 18 Oct 2022 18:52:15 +0000 (19:52 +0100)
committerMarge Bot <emma+marge@anholt.net>
Thu, 27 Oct 2022 13:31:40 +0000 (13:31 +0000)
Ignoring SCC spilling, the old version is probably faster because this
mixes uniform and divergent booleans.

fossil-db (navi21):
Totals from 61167 (45.10% of 135636) affected shaders:
Instrs: 29961899 -> 29932551 (-0.10%)
CodeSize: 157407028 -> 157289636 (-0.07%)
Latency: 139671953 -> 139625186 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 21221097 -> 21220756 (-0.00%)
SClause: 750438 -> 750439 (+0.00%)
Copies: 2672846 -> 2582332 (-3.39%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19340>

src/amd/common/ac_nir_lower_ngg.c

index 030295c..523a951 100644 (file)
@@ -993,7 +993,8 @@ compact_vertices_after_culling(nir_builder *b,
    nir_pop_if(b, if_gs_accepted);
 
    nir_store_var(b, es_accepted_var, es_survived, 0x1u);
-   nir_store_var(b, gs_accepted_var, nir_bcsel(b, fully_culled, nir_imm_false(b), nir_has_input_primitive_amd(b)), 0x1u);
+   nir_store_var(b, gs_accepted_var,
+                 nir_iand(b, nir_inot(b, fully_culled), nir_has_input_primitive_amd(b)), 0x1u);
 }
 
 static void