[LLVM][AMDGPU] Specialize 32-bit atomic fadd instruction for generic address space
author Shilei Tian <i@tianshilei.me>
Fri, 4 Nov 2022 18:10:54 +0000 (14:10 -0400)
committer Shilei Tian <i@tianshilei.me>
Fri, 4 Nov 2022 18:11:05 +0000 (14:11 -0400)
commit 1186e9d59fea662292cdf62fdd1544b5b27d7d37
tree 3161569ff1813ff3b92e9568e920dca7c857dcb0
parent 93c7a9bf6cc142a5a37f22e7dc9fe2c4e20befe1

The 32-bit floating-point atomic add instructions on AMDGPUs do not support a
"flat" or "generic" address space. So, if the address space cannot be determined
statically, the AMDGPU backend will fall back to a CAS loop (which does support
"flat" addressing). Instead, this patch emits runtime address-space checks to
allow native FP atomic add instructions for global and LDS memory (and non-atomic
FP add instructions for private/scratch memory).
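The runtime dispatch described above can be modeled on the host as a minimal
sketch. This is not the code the patch emits: `MemKind`, `FAddPath`,
`FlatF32Ptr`, `FAddResult` and `flatAtomicFAdd` are hypothetical names, the
order of the checks is an assumption, and the model is single-threaded, so it
only shows which path is selected and that every path returns the old value.

```cpp
#include <cassert>

// Hypothetical tag for the memory a flat pointer turns out to address at run time.
enum class MemKind { Global, LDS, Private };

// Which branch of the specialized expansion was taken.
enum class FAddPath { NativeLDSAtomic, NonAtomicPrivate, NativeGlobalAtomic };

// Hypothetical stand-in for a flat pointer: an address plus its runtime kind.
struct FlatF32Ptr {
  float *addr;
  MemKind kind;
};

struct FAddResult {
  float old;      // value stored before the add, as an atomic RMW returns
  FAddPath path;  // branch selected by the runtime check
};

// Host-side model of a flat 32-bit FP add specialized by address space.
FAddResult flatAtomicFAdd(FlatF32Ptr p, float v) {
  assert(p.addr != nullptr && "model requires a valid address");

  FAddPath path;
  if (p.kind == MemKind::LDS) {
    // On the device: native FP atomic add on LDS memory.
    path = FAddPath::NativeLDSAtomic;
  } else if (p.kind == MemKind::Private) {
    // On the device: plain load, FP add, store (scratch is per-thread).
    path = FAddPath::NonAtomicPrivate;
  } else {
    // On the device: native FP atomic add on global memory.
    path = FAddPath::NativeGlobalAtomic;
  }

  // Single-threaded model: all three paths perform the same arithmetic.
  float old = *p.addr;
  *p.addr = old + v;
  return {old, path};
}
```

Calling `flatAtomicFAdd` with a pointer tagged `MemKind::Private` selects the
non-atomic path, while the other two tags select a native atomic path; no path
needs a CAS loop.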

In order to do that, this patch introduces a new interface function
`emitExpandAtomicRMW`. It is expected to be called when the common atomic
expansion does not work for a specific target, as in the case described above.
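The hook pattern can be sketched with a small self-contained mock. Only the
name `emitExpandAtomicRMW` comes from this commit message; the types
`ExpansionKindModel`, `RMWModel`, `TargetHooksModel`, `FlatFAddTargetModel`
and the driver `runAtomicExpandModel` are hypothetical stand-ins that do not
reproduce the real LLVM classes or signatures.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical set of strategies a target can request from the generic pass.
enum class ExpansionKindModel { None, CmpXChg, Expand };

// Hypothetical description of an atomic read-modify-write instruction.
struct RMWModel {
  std::string op;    // e.g. "fadd"
  bool flatAddress;  // true if the address space is not known statically
};

// Mock of the target interface: the generic pass only talks to this class.
class TargetHooksModel {
public:
  virtual ~TargetHooksModel() = default;

  // By default a target asks for no expansion at all.
  virtual ExpansionKindModel shouldExpand(const RMWModel &) const {
    return ExpansionKindModel::None;
  }

  // Target-specific expansion; a target that requests Expand must override it.
  virtual std::string emitExpandAtomicRMW(const RMWModel &) const {
    throw std::logic_error("Expand requested but hook not implemented");
  }
};

// Mock target that specializes flat fadd and uses a CAS loop for other flat ops.
class FlatFAddTargetModel : public TargetHooksModel {
public:
  ExpansionKindModel shouldExpand(const RMWModel &I) const override {
    if (!I.flatAddress)
      return ExpansionKindModel::None;
    return I.op == "fadd" ? ExpansionKindModel::Expand
                          : ExpansionKindModel::CmpXChg;
  }

  std::string emitExpandAtomicRMW(const RMWModel &) const override {
    return "target-specific expansion";
  }
};

// Mock of the generic pass: it delegates to the hook only for Expand.
std::string runAtomicExpandModel(const TargetHooksModel &T, const RMWModel &I) {
  switch (T.shouldExpand(I)) {
  case ExpansionKindModel::None:
    return "unchanged";
  case ExpansionKindModel::CmpXChg:
    return "cas loop";
  case ExpansionKindModel::Expand:
    return T.emitExpandAtomicRMW(I);
  }
  return "unchanged";
}
```

The point of the design is that the generic pass keeps its common strategies
and hands control to the target only when the target asks for it, so other
targets are unaffected.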

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D129690
llvm/include/llvm/CodeGen/TargetLowering.h
llvm/lib/CodeGen/AtomicExpandPass.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.h
llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll [new file with mode: 0644]
llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd-flat-specialization.ll [new file with mode: 0644]
llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll