radv,aco: don't lower some ffma instructions
author Rhys Perry <pendingchaos02@gmail.com>
Tue, 16 Jun 2020 13:34:05 +0000 (14:34 +0100)
committer Marge Bot <emma+marge@anholt.net>
Mon, 13 Dec 2021 11:22:33 +0000 (11:22 +0000)
GFX10.3 has no v_mad_f32, and an exact ffma that has been split into a
multiply and an add can't be recombined into a v_fma_f32. GFX9+ only has
v_fma_f16, and no generation has a 64-bit MAD.

fossil-db (GFX10.3):
Totals from 84040 (57.46% of 146267) affected shaders:
VGPRs: 3717256 -> 3688064 (-0.79%); split: -0.87%, +0.08%
SpillSGPRs: 10419 -> 10403 (-0.15%)
CodeSize: 263064884 -> 262442820 (-0.24%); split: -0.31%, +0.07%
MaxWaves: 2036908 -> 2038374 (+0.07%); split: +0.10%, -0.03%
Instrs: 49849448 -> 49572182 (-0.56%); split: -0.60%, +0.04%
Latency: 908130602 -> 907764246 (-0.04%); split: -0.18%, +0.14%
InvThroughput: 207051300 -> 206762704 (-0.14%); split: -0.24%, +0.10%

fossil-db (GFX10):
Totals from 2 (0.00% of 146267) affected shaders:
Latency: 8123 -> 8107 (-0.20%)

fossil-db (GFX9):
Totals from 2 (0.00% of 146401) affected shaders:
(no statistics affected)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>

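diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 1aa3dca..e6b9f21 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c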
@@ -78,9 +78,9 @@ radv_get_nir_options(struct radv_physical_device *device)
       .lower_unpack_unorm_2x16 = true,
       .lower_unpack_unorm_4x8 = true,
       .lower_unpack_half_2x16 = true,
-      .lower_ffma16 = true,
-      .lower_ffma32 = true,
-      .lower_ffma64 = true,
+      .lower_ffma16 = device->rad_info.chip_class < GFX9,
+      .lower_ffma32 = device->rad_info.chip_class < GFX10_3,
+      .lower_ffma64 = false,
       .lower_fpow = true,
       .lower_mul_2x32_64 = true,
       .lower_rotate = true,
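
The per-generation outcome of the three changed options can be tabulated with a small standalone sketch. Everything prefixed `sketch_` below is a hypothetical stand-in introduced only for illustration; the real code reads `device->rad_info.chip_class` and fills Mesa's NIR compiler options, and the real chip class enum has more entries than the four shown here.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the driver's chip class enum.
 * Only the ordering GFX8 < GFX9 < GFX10 < GFX10.3 matters here. */
enum sketch_chip_class {
    SKETCH_GFX8,
    SKETCH_GFX9,
    SKETCH_GFX10,
    SKETCH_GFX10_3,
};

/* Hypothetical stand-in for the three ffma lowering options. */
struct sketch_ffma_lowering {
    bool lower_ffma16;
    bool lower_ffma32;
    bool lower_ffma64;
};

/* Mirrors the three comparisons from the diff above. */
struct sketch_ffma_lowering
sketch_ffma_lowering_for(enum sketch_chip_class chip_class)
{
    struct sketch_ffma_lowering opts = {
        .lower_ffma16 = chip_class < SKETCH_GFX9,    /* kept fused on GFX9+ */
        .lower_ffma32 = chip_class < SKETCH_GFX10_3, /* kept fused on GFX10.3 */
        .lower_ffma64 = false,                       /* never lowered */
    };
    return opts;
}
```

Under these assumptions, GFX10 still lowers 32-bit ffma but keeps 16-bit and 64-bit ffma fused, which is consistent with only two GFX10 shaders being affected in the statistics above.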