review.tizen.org Git - platform/upstream/llvm.git/commit

author	Ruiling Song <ruiling.song@amd.com>
	Thu, 5 Jan 2023 01:29:48 +0000 (09:29 +0800)
committer	Ruiling Song <ruiling.song@amd.com>
	Wed, 11 Jan 2023 01:59:35 +0000 (09:59 +0800)
commit	9119d9bfcef47b245d15fc9d2e5044bc67724bfc
tree	fb1344b7c4fa17fa3355b75c476f896d52169d95
parent	cce24b6af0999c658fd3e4931eb9bc58252478b8

AMDGPU/SIInsertWait: Skip dummy tied source

For D16 memory load instructions, the hardware usually writes only half
of the 32-bit register, but the MachineIR instruction defines its
destination as the full 32-bit register. With a 32-bit destination, LLVM
assumes the instruction always overwrites the whole register, so without
the extra tied source it would treat an earlier write to the other half
of the register as dead. The tied source makes LLVM think the
instruction reads the register, which keeps that earlier write alive.
However, this dummy tied source also introduces an unnecessary
read-after-write dependency. This change bypasses the tied sources that
can be skipped, thus avoiding an unnecessary s_waitcnt.
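
The idea can be illustrated with a small standalone model. This is only a
sketch and not the code from this change: Operand, Inst, TiedSourceNotRead,
makeD16Load and needsWait are hypothetical names that do not use LLVM's
MachineInstr API, and the model covers only read-after-write checks on
source operands.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model only. None of these types come from LLVM.
struct Operand {
  unsigned Reg; // register number
  bool IsDef;   // true for a destination, false for a source
  bool IsTied;  // source operand tied to the destination
};

struct Inst {
  // Hypothetical per-instruction flag: the tied source exists only to keep
  // the other register half alive and is never read by the hardware.
  bool TiedSourceNotRead;
  std::vector<Operand> Ops;
};

// Builds a D16-style load: destination, address source, and a dummy source
// tied to the destination register.
Inst makeD16Load(unsigned Dst, unsigned Addr, bool TiedSourceNotRead) {
  return Inst{TiedSourceNotRead,
              {{Dst, true, false}, {Addr, false, false}, {Dst, false, true}}};
}

// Returns true if I reads a register that still has an outstanding write,
// i.e. a wait would be needed before issuing I.
bool needsWait(const Inst &I, const std::set<unsigned> &PendingWrites) {
  for (const Operand &Op : I.Ops) {
    if (Op.IsDef)
      continue; // this sketch models read-after-write only
    if (Op.IsTied && I.TiedSourceNotRead)
      continue; // dummy tied source: no real read, so no dependency
    if (PendingWrites.count(Op.Reg))
      return true;
  }
  return false;
}
```

In this model, a load into register 5 issued while an earlier write to
register 5 is still pending needs no wait when the flag is set, but does
need one when the flag is clear. A pending write to the address register
forces a wait in both cases.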

Reviewed by: foad

Differential Revision: https://reviews.llvm.org/D140537

llvm/lib/Target/AMDGPU/BUFInstructions.td
llvm/lib/Target/AMDGPU/DSInstructions.td
llvm/lib/Target/AMDGPU/FLATInstructions.td
llvm/lib/Target/AMDGPU/SIDefines.h
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
llvm/lib/Target/AMDGPU/SIInstrFormats.td
llvm/lib/Target/AMDGPU/SIInstrInfo.h
llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
llvm/test/CodeGen/AMDGPU/load-hi16.ll
llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll