intel/compiler: balanced tileY/linear friendly LID order for CS
author Felix DeGrood <felix.j.degrood@intel.com>
Wed, 19 May 2021 16:50:45 +0000 (09:50 -0700)
committer Marge Bot <eric+marge@anholt.net>
Sat, 22 May 2021 00:15:25 +0000 (00:15 +0000)
commit 380fa050f23870ff1823b1c4b2a9b89cf0835f27
tree 54035db69829c6cad53fa2bf1be0e964be64e092
parent c23e2a662a84f7f2704c99f393bf65fc4e93a5ef

Fixes a perf regression introduced by the tileY LID order for CS
shaders that access both textures and buffers. Walks LIDs in
X-major fashion, but in blocks of height 4. For SIMD8/16/32, the
LIDs of each HW thread then cover a 2x4/4x4/8x4 block, which is
always good for tileY resources and usually good for linear resources.
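
The walk described above can be sketched in plain C. This is only an
illustration of the mapping, not the NIR lowering code from
brw_nir_lower_cs_intrinsics.c: the names lid_from_linear, struct lid and
BLOCK_HEIGHT are hypothetical, and the handling of a last band shorter
than 4 rows is an assumption about how a workgroup whose height is not a
multiple of 4 could be covered.

```c
#include <assert.h>

/* Hypothetical name: height of each band walked in X-major order. */
#define BLOCK_HEIGHT 4u

/* Hypothetical helper type holding a 2D local invocation ID. */
struct lid { unsigned x, y; };

/* Map a linear invocation index to (x, y) for a size_x by size_y
 * workgroup. Consecutive indices fill a column of up to BLOCK_HEIGHT
 * rows, then step one column in X; after size_x columns the walk moves
 * down to the next band of rows.
 */
static struct lid
lid_from_linear(unsigned linear, unsigned size_x, unsigned size_y)
{
   /* Out-of-range indices would make rows_left zero below. */
   assert(size_x > 0 && linear < size_x * size_y);

   unsigned band_size = size_x * BLOCK_HEIGHT; /* invocations per full band */
   unsigned band      = linear / band_size;    /* which band of rows */
   unsigned in_band   = linear % band_size;    /* index inside that band */
   unsigned band_y0   = band * BLOCK_HEIGHT;   /* first row of the band */

   /* Assumption: the last band may be shorter than BLOCK_HEIGHT. */
   unsigned rows_left = size_y - band_y0;
   unsigned h = rows_left < BLOCK_HEIGHT ? rows_left : BLOCK_HEIGHT;

   struct lid id;
   id.x = in_band / h;           /* column inside the band */
   id.y = band_y0 + in_band % h; /* row inside the workgroup */
   return id;
}
```

With an 8x8 workgroup, indices 0..7 (one SIMD8 thread) land on x in
{0, 1} and y in {0..3}, i.e. a 2x4 block; indices 0..15 cover 4x4 and
indices 0..31 cover 8x4, matching the shapes listed in the message.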

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10733>
src/intel/compiler/brw_nir_lower_cs_intrinsics.c