review.tizen.org Git - platform/upstream/llvm.git/commit

author	Shilei Tian <i@tianshilei.me>
	Fri, 4 Nov 2022 18:10:54 +0000 (14:10 -0400)
committer	Shilei Tian <i@tianshilei.me>
	Fri, 4 Nov 2022 18:11:05 +0000 (14:11 -0400)
commit	1186e9d59fea662292cdf62fdd1544b5b27d7d37
tree	3161569ff1813ff3b92e9568e920dca7c857dcb0
parent	93c7a9bf6cc142a5a37f22e7dc9fe2c4e20befe1

[LLVM][AMDGPU] Specialize 32-bit atomic fadd instruction for generic address space

The 32-bit floating-point atomic add instructions on AMDGPUs do not support a
"flat" or "generic" address space. So, if the address space cannot be determined
statically, the AMDGPU backend falls back to a CAS loop (which does support
"flat" addressing). Instead, this patch emits runtime address-space checks so that
native FP atomic add instructions are used for global and LDS memory (and
non-atomic FP add instructions for private/scratch memory).
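
The control flow described above can be modeled on the host roughly as follows.
This is only a sketch under stated assumptions, not the code the patch emits:
`AddrSpace`, `Path`, `Cell`, `atomicFAddStandIn` and `flatAtomicFAdd` are
hypothetical names, the address space is stored next to the value instead of
being derived from the pointer, the order of the checks is assumed, and the
"native" atomic add is itself imitated with a compare-exchange loop because
portable pre-C++20 code has no floating-point `fetch_add`.

```cpp
#include <atomic>

// Hypothetical tags for the address spaces a flat pointer may resolve to.
enum class AddrSpace { Global, Shared, Private };

// Which code path the runtime check selected (for illustration only).
enum class Path { SharedAtomic, PrivateNonAtomic, GlobalAtomic };

// Host-side stand-in for a memory location. On a GPU the address space
// would be derived from the pointer value, not stored alongside the data.
struct Cell {
  std::atomic<float> value;
  AddrSpace space;
  Cell(float v, AddrSpace s) : value(v), space(s) {}
};

// Stand-in for a native FP atomic add instruction (assumption: modeled
// with a compare-exchange loop so the sketch builds before C++20).
inline float atomicFAddStandIn(std::atomic<float> &target, float operand) {
  float old = target.load(std::memory_order_relaxed);
  while (!target.compare_exchange_weak(old, old + operand,
                                       std::memory_order_relaxed)) {
    // On failure, old is refreshed with the current value; retry.
  }
  return old;
}

// Model of the specialized expansion: test the address space at run time,
// then use the operation suited to it. Returns the value before the add.
inline float flatAtomicFAdd(Cell &cell, float operand, Path &taken) {
  if (cell.space == AddrSpace::Shared) {
    taken = Path::SharedAtomic;
    return atomicFAddStandIn(cell.value, operand);
  }
  if (cell.space == AddrSpace::Private) {
    // Private/scratch memory belongs to a single thread, so a plain
    // load, add, store sequence is sufficient.
    taken = Path::PrivateNonAtomic;
    float old = cell.value.load(std::memory_order_relaxed);
    cell.value.store(old + operand, std::memory_order_relaxed);
    return old;
  }
  taken = Path::GlobalAtomic;
  return atomicFAddStandIn(cell.value, operand);
}
```

All three paths return the value held before the addition, which mirrors how
an atomic read-modify-write yields the old value regardless of which branch
handled it.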

To do that, this patch introduces a new interface function
`emitExpandAtomicRMW`. It is expected to be called when the common atomic
expansion does not work for a specific target, as in the case described above.
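
The shape of such a hook can be illustrated with a small standalone model.
This is a simplified sketch and does not reproduce LLVM's actual classes or
signatures: apart from the name `emitExpandAtomicRMW`, everything here
(`ExpansionKind`, `AtomicOp`, `TargetHooks`, `GpuLikeTarget`, `expandAtomic`)
is a hypothetical stand-in chosen for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical, simplified set of expansion strategies.
enum class ExpansionKind { None, CmpXChg, Expand };

// Stand-in for an atomic instruction; records how it was lowered.
struct AtomicOp {
  std::string lowering = "unexpanded";
};

// Model of the target interface consulted by the generic pass.
class TargetHooks {
public:
  virtual ~TargetHooks() = default;

  // Which strategy the target wants for this operation.
  virtual ExpansionKind shouldExpand(const AtomicOp &) const {
    return ExpansionKind::CmpXChg;
  }

  // Target-specific expansion. Targets that request Expand must override
  // this; the default only reports the missing override.
  virtual void emitExpandAtomicRMW(AtomicOp &) const {
    throw std::logic_error("target requested Expand without overriding it");
  }
};

// A target for which the common expansion is unsuitable.
class GpuLikeTarget : public TargetHooks {
public:
  ExpansionKind shouldExpand(const AtomicOp &) const override {
    return ExpansionKind::Expand;
  }
  void emitExpandAtomicRMW(AtomicOp &op) const override {
    op.lowering = "target-specific";
  }
};

// Model of the generic pass: handle the common strategies itself and
// defer to the target hook for Expand. Returns true if the op changed.
inline bool expandAtomic(const TargetHooks &target, AtomicOp &op) {
  switch (target.shouldExpand(op)) {
  case ExpansionKind::None:
    return false;
  case ExpansionKind::CmpXChg:
    op.lowering = "cmpxchg-loop";
    return true;
  case ExpansionKind::Expand:
    target.emitExpandAtomicRMW(op);
    return true;
  }
  return false;
}
```

Keeping the decision (`shouldExpand`) separate from the action
(`emitExpandAtomicRMW`) lets the generic pass stay target-independent while a
single target opts in to its own lowering.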

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D129690

llvm/include/llvm/CodeGen/TargetLowering.h
llvm/lib/CodeGen/AtomicExpandPass.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.h
llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll	[new file with mode: 0644]
llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd-flat-specialization.ll	[new file with mode: 0644]
llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll