aco: insert a single p_end_wqm after the last derivative calculation
author Daniel Schürmann <daniel@schuermann.dev>
Sat, 2 Sep 2023 09:14:33 +0000 (11:14 +0200)
committer Marge Bot <emma+marge@anholt.net>
Thu, 14 Sep 2023 09:25:23 +0000 (09:25 +0000)
commit 45f6d38a766875616ffc480f27560389d7d585ef
tree c5ef5132c4f95005d2ee5ba6ed9b41e3e25fa4b5
parent 28904839dadb2a1576edbcc4a6dd77637da173f1

This new instruction replaces p_wqm.
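
The idea in the subject line can be sketched on a toy instruction list. This is only an illustration under stated assumptions, not ACO's actual code: `ToyInstr`, `insert_end_wqm` and the `needs_wqm` flag are hypothetical names, and the real pass works on ACO's IR and control flow rather than on a flat vector. Only the `p_end_wqm` opcode name comes from this commit.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction; this is NOT aco::Instruction.
struct ToyInstr {
    std::string name;
    bool needs_wqm; // assumed flag: true if the instruction computes a derivative
};

// Inserts exactly one "p_end_wqm" marker directly after the last instruction
// flagged needs_wqm. Returns the marker's index, or -1 if nothing was inserted
// because no instruction needs whole quad mode.
int insert_end_wqm(std::vector<ToyInstr>& prog) {
    int last = -1;
    for (std::size_t i = 0; i < prog.size(); ++i) {
        if (prog[i].needs_wqm)
            last = static_cast<int>(i);
    }
    if (last < 0)
        return -1;
    prog.insert(prog.begin() + last + 1, ToyInstr{"p_end_wqm", false});
    return last + 1;
}
```

For a toy program such as `v_interp, ddx, v_mul, ddy, v_add`, where only `ddx` and `ddy` are flagged, the marker lands between `ddy` and `v_add`. The instruction names are illustrative strings.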

Totals from 28065 (36.65% of 76572) affected shaders: (GFX11)
MaxWaves: 823922 -> 823952 (+0.00%); split: +0.01%, -0.01%
Instrs: 22221375 -> 22180465 (-0.18%); split: -0.26%, +0.08%
CodeSize: 117310676 -> 117040684 (-0.23%); split: -0.30%, +0.07%
VGPRs: 1183476 -> 1186656 (+0.27%); split: -0.19%, +0.46%
SpillSGPRs: 2305 -> 2302 (-0.13%)
Latency: 176559310 -> 176427793 (-0.07%); split: -0.21%, +0.14%
InvThroughput: 26245204 -> 26195550 (-0.19%); split: -0.26%, +0.07%
VClause: 368058 -> 369460 (+0.38%); split: -0.21%, +0.59%
SClause: 857077 -> 842588 (-1.69%); split: -2.06%, +0.37%
Copies: 1245650 -> 1249434 (+0.30%); split: -0.33%, +0.63%
Branches: 394837 -> 396070 (+0.31%); split: -0.01%, +0.32%
PreSGPRs: 1019139 -> 1019567 (+0.04%); split: -0.02%, +0.06%
PreVGPRs: 925739 -> 931860 (+0.66%); split: -0.00%, +0.66%

Changes are due to scheduling and re-enabling cross-lane optimizations.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
src/amd/compiler/aco_insert_exec_mask.cpp
src/amd/compiler/aco_instruction_selection.cpp
src/amd/compiler/aco_instruction_selection.h
src/amd/compiler/aco_ir.cpp
src/amd/compiler/aco_opcodes.py
src/amd/compiler/aco_opt_value_numbering.cpp
src/amd/compiler/tests/test_d3d11_derivs.cpp