[AMDGPU] Use new opcode for indexed vgpr reads
author Jay Foad <jay.foad@amd.com>
Fri, 19 Nov 2021 10:32:35 +0000 (10:32 +0000)
committer Jay Foad <jay.foad@amd.com>
Fri, 19 Nov 2021 13:08:11 +0000 (13:08 +0000)
commit 30b27ecfc2516c019209d2ea4b05903548635647
tree 55b6702950668a2b2c9dcb112fec910566509e6f
parent 049799c311515c8c8b5daf91b4a731870ed54afe

Introduce V_MOV_B32_indirect_read for indexed vgpr reads
(and rename the old V_MOV_B32_indirect to
V_MOV_B32_indirect_write) so they can be unambiguously
distinguished from regular V_MOV_B32_e32. Previously indexed
moves were distinguished by looking for extra implicit
operands, but this is fragile because regular moves sometimes
have extra implicit operands too:
- either by accident, when instructions end up with
  duplicate implicit operands (see e.g. D100939)
- or by design, when SIInstrInfo::copyPhysReg breaks a
  multi-dword copy into individual subreg mov instructions
  and adds implicit operands for the super-register.

As a result, SIInstrInfo::isFoldableCopy can be simplified
and now identifies more foldable copies. The test diffs show
that more immediate 0 values have been folded as inline
operands.
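
To illustrate why an opcode check is more robust than counting
implicit operands, here is a minimal toy model. It is not the
LLVM code: ToyInstr, ExtraImplicitOperands, oldIsFoldableCopy
and newIsFoldableCopy are hypothetical names invented for this
sketch, and only the three opcode names come from this patch.
It assumes the old heuristic treated any V_MOV_B32_e32 with
operands beyond those in its description as a possible indexed
access.

```cpp
#include <cassert>

// Toy model only: these types are hypothetical, not LLVM classes.
enum class Opcode {
  V_MOV_B32_e32,            // regular move
  V_MOV_B32_indirect_read,  // indexed vgpr read (new opcode)
  V_MOV_B32_indirect_write  // indexed vgpr write (renamed opcode)
};

struct ToyInstr {
  Opcode Opc;
  // Implicit operands beyond those in the instruction description.
  unsigned ExtraImplicitOperands;
};

// Assumed shape of the old heuristic: a regular move is foldable
// only if it carries no extra implicit operands.
bool oldIsFoldableCopy(const ToyInstr &MI) {
  return MI.Opc == Opcode::V_MOV_B32_e32 &&
         MI.ExtraImplicitOperands == 0;
}

// With dedicated opcodes, the opcode alone decides.
bool newIsFoldableCopy(const ToyInstr &MI) {
  return MI.Opc == Opcode::V_MOV_B32_e32;
}

// Small self-check covering the cases described above.
void checkExamples() {
  // Subreg mov with an implicit super-register operand, as
  // produced when a multi-dword copy is split.
  ToyInstr SubRegMov{Opcode::V_MOV_B32_e32, 1};
  // Indexed read expressed with the dedicated opcode.
  ToyInstr IndexedRead{Opcode::V_MOV_B32_indirect_read, 2};

  assert(!oldIsFoldableCopy(SubRegMov)); // missed by the heuristic
  assert(newIsFoldableCopy(SubRegMov));  // recognised by opcode
  assert(!newIsFoldableCopy(IndexedRead));
}
```

In this model the split-copy mov is rejected by the operand
count heuristic but accepted by the opcode check, which mirrors
the additional folds visible in the test diffs.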

SIInstrInfo::isReallyTriviallyReMaterializable could
probably be simplified too but that is not part of this
patch.

Differential Revision: https://reviews.llvm.org/D114230
13 files changed:
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
llvm/lib/Target/AMDGPU/VOP1Instructions.td
llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
llvm/test/CodeGen/AMDGPU/bypass-div.ll
llvm/test/CodeGen/AMDGPU/llvm.mulo.ll
llvm/test/CodeGen/AMDGPU/mul_uint24-amdgcn.ll
llvm/test/CodeGen/AMDGPU/sdiv64.ll
llvm/test/CodeGen/AMDGPU/set-gpr-idx-peephole.mir
llvm/test/CodeGen/AMDGPU/srem64.ll
llvm/test/CodeGen/AMDGPU/udiv.ll
llvm/test/CodeGen/AMDGPU/udiv64.ll
llvm/test/CodeGen/AMDGPU/urem64.ll