aco: Don't use s_and_saveexec with branches when exec is constant.
author Timur Kristóf <timur.kristof@gmail.com>
Fri, 7 May 2021 13:03:36 +0000 (15:03 +0200)
committer Marge Bot <eric+marge@anholt.net>
Tue, 18 May 2021 11:48:22 +0000 (11:48 +0000)
When exec is known to be constant (all lanes enabled) on entry to a
branch, we can remember that constant as the old exec and simply copy
the condition into exec. There is no need to save the constant with
s_and_saveexec.

The copy is emitted as p_parallelcopy, which is lowered to s_mov_b64
(or s_mov_b32), so many exec restores now become copies. This explains
the increase in the Copies stat below.
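
The reasoning above can be illustrated with a small standalone model.
This is only a sketch and not ACO code: LaneMask, BranchEntry,
enter_with_saveexec, enter_with_copy and enter_branch are hypothetical
names, and exec is modelled as a plain 64-bit integer. It shows that
when exec is all ones, ANDing it with the condition yields the condition
itself, so a plain copy gives the same new exec and no register is
needed to hold the old value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a wave64 exec mask (one bit per lane); not ACO code.
using LaneMask = std::uint64_t;
constexpr LaneMask kAllLanes = ~LaneMask{0};

struct BranchEntry {
    LaneMask new_exec;   // exec while the branch body runs
    LaneMask saved_exec; // value exec must be restored to afterwards
    bool needs_save;     // true if saved_exec has to live in a register
};

// Models s_and_saveexec: save the old exec, then exec = cond & exec.
inline BranchEntry enter_with_saveexec(LaneMask exec, LaneMask cond) {
    return BranchEntry{cond & exec, exec, true};
}

// Models the new path: exec is the constant all-ones mask, so copying
// cond into exec is enough and the restore value is that same constant.
inline BranchEntry enter_with_copy(LaneMask cond) {
    return BranchEntry{cond, kAllLanes, false};
}

// Picks the cheaper form when the caller knows exec is constant.
inline BranchEntry enter_branch(LaneMask exec, bool exec_is_constant_all_ones,
                                LaneMask cond) {
    if (exec_is_constant_all_ones) {
        assert(exec == kAllLanes); // the flag must match the actual mask
        return enter_with_copy(cond);
    }
    return enter_with_saveexec(exec, cond);
}
```

In this model both paths produce the same new_exec whenever exec is
all ones. The only difference is that the copy path restores exec from
a constant instead of from a saved register, which matches the drop in
PreSGPRs and the rise in Copies reported below.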

Fossil DB changes on Sienna Cichlid:

Totals from 73969 (49.37% of 149839) affected shaders:
SpillSGPRs: 1768 -> 1610 (-8.94%)
CodeSize: 99053892 -> 99047884 (-0.01%); split: -0.02%, +0.01%
Instrs: 19372852 -> 19370398 (-0.01%); split: -0.02%, +0.01%
VClause: 515154 -> 515142 (-0.00%); split: -0.00%, +0.00%
SClause: 719236 -> 718395 (-0.12%); split: -0.14%, +0.02%
Copies: 1109770 -> 1254634 (+13.05%); split: -0.07%, +13.12%
Branches: 374338 -> 374348 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1776481 -> 1653761 (-6.91%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>

src/amd/compiler/aco_insert_exec_mask.cpp

index e39b5a4..11bec77 100644
@@ -966,11 +966,14 @@ void add_branch_code(exec_ctx& ctx, Block* block)
          transition_to_Exact(ctx, bld, idx);
 
       uint8_t mask_type = ctx.info[idx].exec.back().second & (mask_type_wqm | mask_type_exact);
+      if (ctx.info[idx].exec.back().first.constantEquals(-1u)) {
+         bld.pseudo(aco_opcode::p_parallelcopy, Definition(exec, bld.lm), cond);
+      } else {
+         Temp old_exec = bld.sop1(Builder::s_and_saveexec, bld.def(bld.lm), bld.def(s1, scc),
+                                 Definition(exec, bld.lm), cond, Operand(exec, bld.lm));
 
-      Temp old_exec = bld.sop1(Builder::s_and_saveexec, bld.def(bld.lm), bld.def(s1, scc),
-                               Definition(exec, bld.lm), cond, Operand(exec, bld.lm));
-
-      ctx.info[idx].exec.back().first = Operand(old_exec);
+         ctx.info[idx].exec.back().first = Operand(old_exec);
+      }
 
       /* add next current exec to the stack */
       ctx.info[idx].exec.emplace_back(Operand(bld.lm), mask_type);