intel/fs: avoid reusing the VGRF for uniform load_ubo
authorLionel Landwerlin <lionel.g.landwerlin@intel.com>
Tue, 13 Jun 2023 17:37:22 +0000 (20:37 +0300)
committerMarge Bot <emma+marge@anholt.net>
Wed, 14 Jun 2023 12:04:05 +0000 (12:04 +0000)
Only found 3 shaders affected in Red Dead Redemption :

Totals from 3 (0.05% of 5969) affected shaders:
Instrs: 2246 -> 2230 (-0.71%)
Cycles: 156506 -> 148402 (-5.18%); split: -5.23%, +0.05%

This will have a larger effect when we add the
load_ubo_uniform_block_intel intrinsic where we will have larger
blocks (vec8/vec16 vs vec4 only now).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>

src/intel/compiler/brw_fs_nir.cpp

index 12590a0..0846104 100644 (file)
@@ -4896,7 +4896,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
 
          const unsigned block_sz = 64; /* Fetch one cacheline at a time. */
          const fs_builder ubld = bld.exec_all().group(block_sz / 4, 0);
-         const fs_reg packed_consts = ubld.vgrf(BRW_REGISTER_TYPE_UD);
 
          for (unsigned c = 0; c < instr->num_components;) {
             const unsigned base = load_offset + c * type_size;
@@ -4904,6 +4903,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
             const unsigned count = MIN2(instr->num_components - c,
                                         (block_sz - base % block_sz) / type_size);
 
+            const fs_reg packed_consts = ubld.vgrf(BRW_REGISTER_TYPE_UD);
             fs_reg srcs[PULL_UNIFORM_CONSTANT_SRCS];
             srcs[PULL_UNIFORM_CONSTANT_SRC_SURFACE]        = surface;
             srcs[PULL_UNIFORM_CONSTANT_SRC_SURFACE_HANDLE] = surface_handle;