aco: use v_add_f{16,32} with clamp for fsat
author Georg Lehmann <dadschoorse@gmail.com>
Wed, 22 Feb 2023 16:31:06 +0000 (17:31 +0100)
committer Marge Bot <emma+marge@anholt.net>
Wed, 7 Jun 2023 12:30:11 +0000 (12:30 +0000)
commit 177dba62a1fae4e441ae587b70876b991cf8bcd8
tree 028016a65b11f9adbfe3ad16edf288cbada8779b
parent 3a0bc8f0076c61591070185310903c14b0f2da4f

v_add can be dual issued on gfx11, v_med3 cannot.
Don't emit v_add directly, so that omod(fsat(x)) can still be optimized.

Foz-DB GFX1100:
Totals from 32702 (24.24% of 134913) affected shaders:
Latency: 475008203 -> 474928037 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 59226198 -> 59140787 (-0.14%); split: -0.14%, +0.00%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402>
src/amd/compiler/aco_optimizer.cpp
src/amd/compiler/tests/test_optimizer.cpp
src/amd/compiler/tests/test_sdwa.cpp