review.tizen.org Git - platform/upstream/llvm.git/commit

author	Jay Foad <jay.foad@amd.com>
	Fri, 19 Mar 2021 12:34:37 +0000 (12:34 +0000)
committer	Jay Foad <jay.foad@amd.com>
	Fri, 26 Mar 2021 15:38:14 +0000 (15:38 +0000)
commit	9d08f276d79b59e3d1ad3db3db19077284524ca3
tree	f1b04394c6379818561d61823d0024577bc8aea1
parent	69d01e0e4001573612b0de234a05d3d2580fc3b8

[AMDGPU] Use reductions instead of scans in the atomic optimizer

If the result of an atomic operation is not used then it can be more
efficient to build a reduction across all lanes instead of a scan. Do
this for GFX10, where the permlanex16 instruction makes it viable. For
wave64 this saves a couple of dpp operations. For wave32 it saves one
readlane (readlanes are generally bad for performance) and one dpp
operation.
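
To illustrate the distinction the message draws, here is a conceptual
sketch in Python (not the DPP/permlanex16 sequence the pass actually
emits). It assumes an add operation and a power-of-two lane count; the
names lane_scan and lane_reduce are hypothetical. A scan gives every
lane its own running prefix, which is only needed when each lane must
reconstruct its own return value. A reduction only produces the total,
which is all that is needed when the result is unused.

```python
from itertools import accumulate
import operator


def lane_scan(values, op=operator.add):
    """Inclusive scan: lane i ends up with op applied over lanes 0..i."""
    return list(accumulate(values, op))


def lane_reduce(values, op=operator.add):
    """Butterfly reduction: each step combines lane i with lane i ^ stride.

    After log2(n) steps every lane holds the combination of all lanes.
    Assumes op is commutative and associative and n is a power of two.
    """
    lanes = list(values)
    n = len(lanes)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("lane count must be a power of two")
    stride = 1
    while stride < n:
        lanes = [op(lanes[i], lanes[i ^ stride]) for i in range(n)]
        stride *= 2
    return lanes


# A wave32-sized example: lane i contributes the value i + 1.
values = list(range(1, 33))
prefixes = lane_scan(values)   # per-lane running sums
totals = lane_reduce(values)   # the same total in every lane
```

In both cases the last scan element and every reduction element equal
the wave-wide total, so a single lane can issue one atomic with that
value. Only the scan keeps the per-lane prefixes.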

Differential Revision: https://reviews.llvm.org/D98953

llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
llvm/lib/Target/AMDGPU/GCNSubtarget.h
llvm/lib/Target/AMDGPU/SIDefines.h
llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll

