aco/gfx11: optimize LS/HS load_local_invocation_index
authorRhys Perry <pendingchaos02@gmail.com>
Thu, 20 Oct 2022 14:47:02 +0000 (15:47 +0100)
committerMarge Bot <emma+marge@anholt.net>
Fri, 21 Oct 2022 20:01:30 +0000 (20:01 +0000)
fossil-db (gfx1100):
Totals from 1361 (1.01% of 135032) affected shaders:
Instrs: 501227 -> 500469 (-0.15%); split: -0.16%, +0.01%
CodeSize: 2730012 -> 2724820 (-0.19%); split: -0.20%, +0.00%
VGPRs: 63716 -> 63688 (-0.04%)
Latency: 2228848 -> 2228858 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 878418 -> 878275 (-0.02%); split: -0.02%, +0.00%
VClause: 14866 -> 14868 (+0.01%); split: -0.03%, +0.04%
SClause: 16674 -> 16645 (-0.17%); split: -0.22%, +0.05%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19196>

src/amd/compiler/aco_instruction_selection.cpp

index f2afe92..aa4ab7b 100644 (file)
@@ -8448,9 +8448,7 @@ visit_intrinsic(isel_context* ctx, nir_intrinsic_instr* instr)
 
             Temp temp = bld.sop2(aco_opcode::s_mul_i32, bld.def(s1), wave_id,
                                  Operand::c32(ctx->program->wave_size));
-            Temp thread_id = emit_mbcnt(ctx, bld.tmp(v1));
-
-            bld.vadd32(Definition(get_ssa_temp(ctx, &instr->dest.ssa)), temp, thread_id);
+            emit_mbcnt(ctx, get_ssa_temp(ctx, &instr->dest.ssa), Operand(), Operand(temp));
          } else {
             bld.copy(Definition(get_ssa_temp(ctx, &instr->dest.ssa)),
                      get_arg(ctx, ctx->args->ac.vs_rel_patch_id));