[InstCombine] remove identity shuffle simplification for mask with undefs
author Sanjay Patel <spatel@rotateright.com>
Sun, 24 Nov 2019 15:06:26 +0000 (10:06 -0500)
committer Sanjay Patel <spatel@rotateright.com>
Sun, 24 Nov 2019 15:06:26 +0000 (10:06 -0500)
commit f575f12c646544a3200852cf72212045fdf2e0b4
tree 3fa3ccc309cfe279115e940d33e844994d62116e
parent f04a3e981d3175d7f3d0f5008b842823034f47ed

At the same time, enhance SimplifyDemandedVectorElts() to recognize that
pattern. That preserves some of the old optimizations in IR.

Given a shuffle that includes undef elements in an otherwise identity mask like:

define <4 x float> @shuffle(<4 x float> %arg) {
  %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 undef, i32 1, i32 2, i32 3>
  ret <4 x float> %shuf
}

We were simplifying that to the input operand.

But as discussed in PR43958:
https://bugs.llvm.org/show_bug.cgi?id=43958
...that means per-vector-element poison that the shuffle would have stopped can now
leak to the result.
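
The leak can be illustrated with a small standalone C++ model. This is only a sketch of
the reasoning above, not LLVM code: the Lane enum, shuffleLanes() and hasPoison() are
hypothetical names invented for the illustration, and using -1 for an undef mask element
is a convention of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lane states used only by this model (not an LLVM type).
enum class Lane { Defined, Undef, Poison };

// Model of a unary shuffle: a mask value of -1 stands for an undef mask
// element, any other value selects that lane of the input.
std::vector<Lane> shuffleLanes(const std::vector<Lane> &input,
                               const std::vector<int> &mask) {
  std::vector<Lane> result;
  result.reserve(mask.size());
  for (int m : mask) {
    if (m < 0) {
      // Undef mask element: the result lane is undef no matter what the
      // input lane holds, so poison in the input does not pass through.
      result.push_back(Lane::Undef);
    } else {
      assert(static_cast<std::size_t>(m) < input.size());
      result.push_back(input[static_cast<std::size_t>(m)]);
    }
  }
  return result;
}

// True if any lane of the vector is poison.
bool hasPoison(const std::vector<Lane> &v) {
  for (Lane l : v)
    if (l == Lane::Poison)
      return true;
  return false;
}
```

With an input whose lane 0 is poison and the mask <undef, 1, 2, 3>, shuffleLanes()
returns a vector with no poison lanes, while the raw input still has one. Replacing the
shuffle with its input operand therefore changes which lanes can carry poison.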

Also note that we still have (and there are tests for) the same transform with no undef
elements in the mask (a fully-defined identity mask). I don't think there's any
controversy about that case - it's a valid transform under any interpretation of
shufflevector/undef/poison.
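
The difference between the two cases, and the demanded-elements variant mentioned at the
top of this message, can be sketched as two mask predicates. This is a hedged model under
the assumption that -1 marks an undef mask element; isFullIdentityMask() and
isIdentityOnDemandedLanes() are hypothetical helper names, not the functions touched by
this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True only for a fully-defined identity mask: lane i selects input lane i.
// An undef mask element (-1 in this sketch) makes the check fail.
bool isFullIdentityMask(const std::vector<int> &mask) {
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i] != static_cast<int>(i))
      return false;
  return true;
}

// Demanded-lanes variant: only lanes that a user actually reads must be
// identity. A demanded lane with an undef mask element still fails the
// check, because that lane is where the shuffle would stop poison.
bool isIdentityOnDemandedLanes(const std::vector<int> &mask,
                               const std::vector<bool> &demanded) {
  assert(mask.size() == demanded.size());
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (demanded[i] && mask[i] != static_cast<int>(i))
      return false;
  return true;
}
```

For the mask <undef, 1, 2, 3>, the first predicate is false, so the shuffle is kept. The
second predicate is true only when lane 0 is not demanded, which is the situation in
which dropping the shuffle cannot expose poison to any user of the result.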

Tracing a few of the diffs through codegen, I don't see any difference in the final asm. So
depending on your perspective, that's good (no real loss of optimization power) or bad
(poison exists in the DAG, so we only partially fixed the bug).

Differential Revision: https://reviews.llvm.org/D70246
12 files changed:
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
llvm/test/Transforms/InstCombine/X86/x86-avx2.ll
llvm/test/Transforms/InstCombine/X86/x86-avx512.ll
llvm/test/Transforms/InstCombine/X86/x86-fma.ll
llvm/test/Transforms/InstCombine/X86/x86-pack.ll
llvm/test/Transforms/InstCombine/X86/x86-pshufb.ll
llvm/test/Transforms/InstCombine/X86/x86-sse4a.ll
llvm/test/Transforms/InstCombine/X86/x86-vpermil.ll
llvm/test/Transforms/InstCombine/shuffle_select.ll
llvm/test/Transforms/InstCombine/vec_demanded_elts.ll
llvm/test/Transforms/InstCombine/vec_shuffle.ll