AMDGPU: Fix copying i1 value out of loop with non-uniform exit
author Nicolai Haehnle <nhaehnle@gmail.com>
Wed, 4 Apr 2018 10:57:58 +0000 (10:57 +0000)
committer Nicolai Haehnle <nhaehnle@gmail.com>
Wed, 4 Apr 2018 10:57:58 +0000 (10:57 +0000)
commit 3ffd383a15349392247302866777425096aedcf2
tree c68b934ab2f57bc13a3f3a59540191ddfc129d91
parent 21d9b33d62772c58267cc0aa725e35ac9a4661db

Summary:
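When an i1-value is defined inside a loop and used outside of it, we
cannot simply use the SGPR bitmask from the loop's last iteration: with a
non-uniform exit, lanes that left the loop in an earlier iteration are no
longer active when that mask is written, so their values are lost.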
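
To illustrate the problem, here is a minimal host-side sketch that models a
wave as a 64-bit lane mask. It is not the code in SILowerI1Copies.cpp; the
names simulateLoop, LoopResult and tripCount are hypothetical, and the
per-lane i1 value ("this is my final iteration") is chosen only to make the
difference visible. It assumes a compare writes 0 for inactive lanes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: the two candidate masks seen after the loop.
struct LoopResult {
  uint64_t lastIterationMask; // mask written by the final iteration only
  uint64_t mergedMask;        // each lane's bit from that lane's own last iteration
};

// tripCount[lane] is the number of iterations the lane executes
// (assumed >= 1, at most 64 lanes). In each iteration an active lane
// computes the i1 value "this is my final iteration".
LoopResult simulateLoop(const std::vector<unsigned> &tripCount) {
  const std::size_t numLanes = tripCount.size();

  // All lanes enter the loop active.
  uint64_t exec = 0;
  for (std::size_t lane = 0; lane < numLanes; ++lane)
    exec |= uint64_t(1) << lane;

  uint64_t lastMask = 0;
  uint64_t merged = 0;
  for (unsigned iter = 0; exec != 0; ++iter) {
    // Model of a compare into an SGPR pair: inactive lanes contribute 0.
    uint64_t cmp = 0;
    for (std::size_t lane = 0; lane < numLanes; ++lane) {
      const bool active = (exec >> lane) & 1;
      if (active && iter + 1 == tripCount[lane])
        cmp |= uint64_t(1) << lane;
    }

    // Naive copy: keep whatever the most recent iteration wrote.
    lastMask = cmp;
    // Per-lane merge: only overwrite the bits of lanes that are active now.
    merged = (merged & ~exec) | (cmp & exec);

    // Non-uniform exit: lanes that reached their trip count leave the loop.
    for (std::size_t lane = 0; lane < numLanes; ++lane)
      if (iter + 1 >= tripCount[lane])
        exec &= ~(uint64_t(1) << lane);
  }
  return {lastMask, merged};
}
```

With trip counts {1, 3, 2, 3}, every lane computes "true" in its own final
iteration, yet the mask from the loop's last iteration only has bits for
lanes 1 and 3; lanes 0 and 2 exited earlier and their values are missing.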

There are also useful and correct cases of an i1-value being copied between
basic blocks, e.g. when a condition is computed outside of a loop and used
inside it. The concept of dominators is not sufficient to capture what is
going on, so I propose the notion of "lane-dominators".

Fixes a bug encountered in Nier: Automata.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103743
Change-Id: If37b969ddc71d823ab3004aeafb9ea050e45bd9a

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D40547

llvm-svn: 329164
llvm/lib/Target/AMDGPU/SILowerI1Copies.cpp
llvm/lib/Target/AMDGPU/Utils/AMDGPULaneDominator.cpp [new file with mode: 0644]
llvm/lib/Target/AMDGPU/Utils/AMDGPULaneDominator.h [new file with mode: 0644]
llvm/lib/Target/AMDGPU/Utils/CMakeLists.txt
llvm/test/CodeGen/AMDGPU/i1-copy-from-loop.ll [new file with mode: 0644]