[AMDGPU] Optimize SI_IF lowering for simple if regions
authorStanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
Wed, 26 Jul 2017 21:29:15 +0000 (21:29 +0000)
committerStanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
Wed, 26 Jul 2017 21:29:15 +0000 (21:29 +0000)
commit3197eb69812f9ff1c0ef4b2a7b894397dec3de24
tree9eb5e70f569cbb3569bcf3c65be5a9bc02fa5265
parentb3ed4bcb8f1b0a9343d47628ae98127ca33575d1
[AMDGPU] Optimize SI_IF lowering for simple if regions

Currently SI_IF results in a s_and_saveexec_b64 followed by s_xor_b64.
The xor is used to extract only the changed bits. In case of a simple
if region where the only use of that value is in the SI_END_CF to
restore the old exec mask, we can omit the xor and perform an or of
the exec mask with the original exec value saved by the
s_and_saveexec_b64.

Differential Revision: https://reviews.llvm.org/D35861

llvm-svn: 309185
14 files changed:
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
llvm/test/CodeGen/AMDGPU/branch-condition-and.ll
llvm/test/CodeGen/AMDGPU/branch-relaxation.ll
llvm/test/CodeGen/AMDGPU/control-flow-fastregalloc.ll
llvm/test/CodeGen/AMDGPU/i1-copy-phi.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.div.fmas.ll
llvm/test/CodeGen/AMDGPU/multi-divergent-exit-region.ll
llvm/test/CodeGen/AMDGPU/ret_jump.ll
llvm/test/CodeGen/AMDGPU/si-lower-control-flow-unreachable-block.ll
llvm/test/CodeGen/AMDGPU/skip-if-dead.ll
llvm/test/CodeGen/AMDGPU/subreg-coalescer-undef-use.ll
llvm/test/CodeGen/AMDGPU/uniform-cfg.ll
llvm/test/CodeGen/AMDGPU/uniform-loop-inside-nonuniform.ll
llvm/test/CodeGen/AMDGPU/valu-i1.ll