review.tizen.org Git - platform/upstream/llvm.git/commit

[AMDGPU] Collapse adjacent SI_END_CF

Add a pass that removes redundant S_OR_B64 instructions which re-enable
lanes in exec. If two SI_END_CF (lowered as S_OR_B64) occur back to back
with no vector instructions between them, we can keep only the outer
SI_END_CF. This is valid because the CFG is structured, so the exec bits
restored by the outer end statement are always a superset of those
restored by the inner one.
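
The following is a minimal, self-contained sketch of the collapsing rule
described above. It is a toy model and not the code in
SIOptimizeExecMaskingPreRA.cpp: the names Kind, Inst, collapseEndCf and
finalExec are hypothetical, it works on a single straight-line block, and
it assumes that scalar instructions never depend on exec.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy instruction kinds (hypothetical; not LLVM's MachineInstr).
enum class Kind { EndCf, Scalar, Vector };

struct Inst {
  Kind kind;
  std::uint64_t savedExec; // mask OR-ed into exec; only meaningful for EndCf
};

// Drop every EndCf that is followed by another EndCf with only scalar
// instructions in between. The later (outer) EndCf is kept.
std::vector<Inst> collapseEndCf(const std::vector<Inst> &block) {
  std::vector<Inst> out;
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (block[i].kind == Kind::EndCf) {
      std::size_t j = i + 1;
      while (j < block.size() && block[j].kind == Kind::Scalar)
        ++j; // scalar instructions do not block the collapse in this model
      if (j < block.size() && block[j].kind == Kind::EndCf)
        continue; // inner EndCf is redundant, skip it
    }
    out.push_back(block[i]);
  }
  return out;
}

// Apply every EndCf in order (exec |= savedExec) and return the final mask.
std::uint64_t finalExec(std::uint64_t exec, const std::vector<Inst> &block) {
  for (const Inst &inst : block)
    if (inst.kind == Kind::EndCf)
      exec |= inst.savedExec;
  return exec;
}
```

When the inner saved mask is a subset of the outer one, as in a
structured CFG, finalExec gives the same result before and after
collapseEndCf, which is the property the commit message relies on.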

This needs to run before register allocation, so that the registers
holding the saved exec bits are eliminated, but after the register
coalescer, so that no vector register copies remain between the
end cf statements.

Differential Revision: https://reviews.llvm.org/D35967

llvm-svn: 309762

author	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
	Tue, 1 Aug 2017 23:14:32 +0000 (23:14 +0000)
committer	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
	Tue, 1 Aug 2017 23:14:32 +0000 (23:14 +0000)
commit	37e7f959c0a381a0eddaed465426bb3605a1ef44
tree	67fe541ea3cbbb75607b98a02e4b81bf4b5fac8e
parent	81ea122121ed82bb229e39014505b3b0146c0532

llvm/lib/Target/AMDGPU/AMDGPU.h
llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
llvm/lib/Target/AMDGPU/CMakeLists.txt
llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp	[new file with mode: 0644]
llvm/test/CodeGen/AMDGPU/collapse-endcf.ll	[new file with mode: 0644]