[AMDGPU] Prefer v_fmac over v_fma only when no source modifiers are used
author Jay Foad <jay.foad@amd.com>
Mon, 20 Sep 2021 13:20:28 +0000 (14:20 +0100)
committer Jay Foad <jay.foad@amd.com>
Tue, 21 Sep 2021 10:57:45 +0000 (11:57 +0100)
commit 86dcb592069f2d18a183fa1daa611029ae80ef4c
tree 9460ff1a28669e1c8d7142bf675747f5d468c370
parent e83629280f32102cd93a216490188922843af06c

When v_fmac needs source modifiers it is forced into the VOP3 encoding.
In that case it is strictly better to select the VOP3-only v_fma
instead: its $dst and $src2 are not tied, which gives the register
allocator more freedom and avoids a copy in some cases.
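
The preference described above can be sketched as a small toy model. This is only an illustration, not the TableGen patterns changed in SIInstructions.td: the names Operand, select_fma and copies_needed are hypothetical, and the sketch assumes that "source modifiers" means the neg and abs modifiers on any of the three inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    """Hypothetical model of one FMA input operand."""
    reg: str
    neg: bool = False   # models the "neg" source modifier
    abs_: bool = False  # models the "abs" source modifier

    @property
    def has_modifier(self) -> bool:
        return self.neg or self.abs_


def select_fma(src0: Operand, src1: Operand, src2: Operand) -> str:
    """Toy version of the preference: v_fmac only without source modifiers."""
    if any(op.has_modifier for op in (src0, src1, src2)):
        # Modifiers force VOP3 anyway, so pick the untied v_fma.
        return "v_fma"
    return "v_fmac"


def copies_needed(opcode: str, src2_live_after: bool) -> int:
    """v_fmac ties $dst to $src2, so it overwrites the accumulator input.

    If the old $src2 value is still needed afterwards, it has to be
    copied to another register first. v_fma writes a separate $dst.
    """
    if opcode == "v_fmac" and src2_live_after:
        return 1
    return 0


plain = select_fma(Operand("v0"), Operand("v1"), Operand("v2"))
negated = select_fma(Operand("v0", neg=True), Operand("v1"), Operand("v2"))
print(plain, copies_needed(plain, src2_live_after=True))
print(negated, copies_needed(negated, src2_live_after=True))
```

In this model the unmodified case still selects v_fmac and may pay for a copy when the accumulator stays live, while any modified input selects v_fma, which never needs that copy.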

This is the same strategy we already use for v_mad vs v_mac and
v_fma_legacy vs v_fmac_legacy.

Differential Revision: https://reviews.llvm.org/D110070
14 files changed:
llvm/lib/Target/AMDGPU/SIInstructions.td
llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f32.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/fma.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fma.s32.mir
llvm/test/CodeGen/AMDGPU/dagcombine-fma-fmad.ll
llvm/test/CodeGen/AMDGPU/fdiv.ll
llvm/test/CodeGen/AMDGPU/fma.f64.ll
llvm/test/CodeGen/AMDGPU/fmad-formation-fmul-distribute-denormal-mode.ll
llvm/test/CodeGen/AMDGPU/fmuladd.f16.ll
llvm/test/CodeGen/AMDGPU/frem.ll
llvm/test/CodeGen/AMDGPU/mad-mix.ll
llvm/test/CodeGen/AMDGPU/strict_fma.f16.ll
llvm/test/CodeGen/AMDGPU/strict_fma.f32.ll
llvm/test/CodeGen/AMDGPU/udiv.ll