[X86][AVX] Fold PACK(LOSUBVECTOR(SHUFFLE(X)),HISUBVECTOR(SHUFFLE(X))) -> SHUFFLE...
authorSimon Pilgrim <llvm-dev@redking.me.uk>
Sat, 4 Jul 2020 12:54:30 +0000 (13:54 +0100)
committerSimon Pilgrim <llvm-dev@redking.me.uk>
Sat, 4 Jul 2020 12:54:30 +0000 (13:54 +0100)
commit71f342d6c3d82fa45c180f1981710bb6092d39fc
treea688cb0d930ee6c98d9216db876c184eb6accfed
parent07d4d84676a94741a226e254a1d536db99c7dac0
[X86][AVX] Fold PACK(LOSUBVECTOR(SHUFFLE(X)),HISUBVECTOR(SHUFFLE(X))) -> SHUFFLE(PACK(LOSUBVECTOR(X),HISUBVECTOR(X)))

Using PACK for truncations leaves us with intermediate shuffles that can be tricky to remove while the truncation tree is being formed.

This fold helps pull out the PERMQ case which is one of the most common, avoiding some costly lane-crossing shuffles.

A future patch will begin adding more general shuffle folding, which we should be able to use for HADD/HSUB as well.
18 files changed:
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/bitcast-setcc-512.ll
llvm/test/CodeGen/X86/bitcast-vector-bool.ll
llvm/test/CodeGen/X86/masked_expandload.ll
llvm/test/CodeGen/X86/masked_store_trunc.ll
llvm/test/CodeGen/X86/masked_store_trunc_ssat.ll
llvm/test/CodeGen/X86/masked_store_trunc_usat.ll
llvm/test/CodeGen/X86/movmsk-cmp.ll
llvm/test/CodeGen/X86/psubus.ll
llvm/test/CodeGen/X86/vector-compare-results.ll
llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
llvm/test/CodeGen/X86/vector-reduce-or-bool.ll
llvm/test/CodeGen/X86/vector-reduce-xor-bool.ll
llvm/test/CodeGen/X86/vector-trunc-packus.ll
llvm/test/CodeGen/X86/vector-trunc-ssat.ll
llvm/test/CodeGen/X86/vector-trunc-usat.ll
llvm/test/CodeGen/X86/vector-trunc.ll
llvm/test/CodeGen/X86/vselect-packss.ll