Fix 64-bit copy to SCC
author Piotr Sobczak <Piotr.Sobczak@amd.com>
Thu, 30 Jul 2020 11:56:06 +0000 (13:56 +0200)
committer Piotr Sobczak <Piotr.Sobczak@amd.com>
Sun, 9 Aug 2020 18:50:30 +0000 (20:50 +0200)
commit 62d8b8a2253c4615723e4fdd92505f25d78c75ee
tree d3188081e83c266aed988eb62f741a2b046f2042
parent 5a0d6cdbd16c80d39e3a237582f6bb47aadacb67

Fix 64-bit copy to SCC by restricting the pattern resulting
in such a copy to subtargets supporting 64-bit scalar compare,
and mapping the copy to S_CMP_LG_U64.

Before introducing the S_CSELECT pattern with explicit SCC
(0045786f146e78afee49eee053dc29ebc842fee1), there was no need
for handling 64-bit copy to SCC ($scc = COPY sreg_64).

The proposed handling of reading only the low bits was, however,
based on the false premise that only one bit matters. In fact,
the copy source might be a vector of booleans, so all bits need
to be considered.

The practical problem with mapping the 64-bit copy to SCC is that
the natural instruction to use (S_CMP_LG_U64) is not available
on old hardware. Fix it by restricting the problematic pattern
to subtargets supporting the instruction (hasScalarCompareEq64).
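
To illustrate why all bits of the source matter, here is a minimal
standalone sketch in plain C++, not code from this patch. The function
names are hypothetical. Modelling "the low bits" as the low 32-bit half,
and modelling the 64-bit scalar compare as a "not equal to zero" test,
are both assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the earlier handling: only the low half of the
// 64-bit source is inspected (assumption: "low bits" = low 32 bits).
bool sccFromLowHalf(std::uint64_t Src) {
  return static_cast<std::uint32_t>(Src) != 0;
}

// Hypothetical model of a 64-bit "not equal" compare against zero
// (assumption: this is how the copy is expressed with S_CMP_LG_U64).
bool sccFromAllBits(std::uint64_t Src) {
  return Src != 0;
}

// A vector of booleans in which only the upper 32 entries are set.
void demo() {
  const std::uint64_t OnlyHighEntries = 0xFFFFFFFF00000000ULL;
  assert(!sccFromLowHalf(OnlyHighEntries)); // the set entries are missed
  assert(sccFromAllBits(OnlyHighEntries));  // every entry is considered
}
```

The two models agree whenever the low half is non-zero, which is why a
scheme that looks only at the low bits can appear to work until a source
with bits set only in the upper half shows up.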

Differential Revision: https://reviews.llvm.org/D85207
27 files changed:
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
llvm/lib/Target/AMDGPU/SOPInstructions.td
llvm/test/CodeGen/AMDGPU/32-bit-local-address-space.ll
llvm/test/CodeGen/AMDGPU/addrspacecast.ll
llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
llvm/test/CodeGen/AMDGPU/ctlz.ll
llvm/test/CodeGen/AMDGPU/ctlz_zero_undef.ll
llvm/test/CodeGen/AMDGPU/extractelt-to-trunc.ll
llvm/test/CodeGen/AMDGPU/fceil64.ll
llvm/test/CodeGen/AMDGPU/fshl.ll
llvm/test/CodeGen/AMDGPU/fshr.ll
llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll
llvm/test/CodeGen/AMDGPU/mad_uint24.ll
llvm/test/CodeGen/AMDGPU/sad.ll
llvm/test/CodeGen/AMDGPU/sdiv.ll
llvm/test/CodeGen/AMDGPU/sdiv64.ll
llvm/test/CodeGen/AMDGPU/select-opt.ll
llvm/test/CodeGen/AMDGPU/select-vectors.ll
llvm/test/CodeGen/AMDGPU/select64.ll
llvm/test/CodeGen/AMDGPU/sint_to_fp.f64.ll
llvm/test/CodeGen/AMDGPU/srem64.ll
llvm/test/CodeGen/AMDGPU/trunc.ll
llvm/test/CodeGen/AMDGPU/udiv64.ll
llvm/test/CodeGen/AMDGPU/udivrem.ll
llvm/test/CodeGen/AMDGPU/uint_to_fp.f64.ll
llvm/test/CodeGen/AMDGPU/urem64.ll
llvm/test/CodeGen/AMDGPU/vselect.ll