[X86][SSE] lowerShuffleWithPACK - extend to use chained PACKs for larger truncations
author Simon Pilgrim <llvm-dev@redking.me.uk>
Tue, 31 Mar 2020 13:37:48 +0000 (14:37 +0100)
committer Simon Pilgrim <llvm-dev@redking.me.uk>
Tue, 31 Mar 2020 13:48:48 +0000 (14:48 +0100)
commit efe59d6717dcdf7777acb9b7a734e1a520bdf22a
tree 7fc5dfec03b99d91317d4691e1bad146abc781f3
parent b632fe5a363488a377b22e1708c8b57d986d3fd8

If canLowerByDroppingEvenElements indicates that the shuffle is an N:1 compaction pattern and the inputs are suitably sign/zero-extended, then we can use a chain of PACKSS/PACKUS instructions to compact them.
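
As a rough illustration of the idea (not the code in X86ISelLowering.cpp), the sketch below models a 4:1 compaction of 32-bit lanes to bytes as PACKUSDW followed by PACKUSWB. The helper names `packusdw`, `packuswb` and `compact4to1` are hypothetical scalar stand-ins for the instructions, and the example assumes 128-bit vectors whose lanes are already zero-extended from 8 bits, so neither saturating step changes a value.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

using V4i32 = std::array<int32_t, 4>;
using V8i16 = std::array<int16_t, 8>;
using V16u8 = std::array<uint8_t, 16>;

// Scalar model of PACKUSDW: each signed 32-bit lane is saturated to
// [0, 65535]; lanes of `a` fill the low half, lanes of `b` the high half.
// The 16-bit results are stored as int16_t bit patterns because the next
// PACK stage reads its inputs as signed words.
V8i16 packusdw(const V4i32 &a, const V4i32 &b) {
  V8i16 r{};
  for (int i = 0; i < 4; ++i) {
    int32_t lo = std::min<int32_t>(std::max<int32_t>(a[i], 0), 65535);
    int32_t hi = std::min<int32_t>(std::max<int32_t>(b[i], 0), 65535);
    r[i] = static_cast<int16_t>(static_cast<uint16_t>(lo));
    r[i + 4] = static_cast<int16_t>(static_cast<uint16_t>(hi));
  }
  return r;
}

// Scalar model of PACKUSWB: each signed 16-bit lane is saturated to [0, 255].
V16u8 packuswb(const V8i16 &a, const V8i16 &b) {
  V16u8 r{};
  for (int i = 0; i < 8; ++i) {
    r[i] = static_cast<uint8_t>(std::min<int>(std::max<int>(a[i], 0), 255));
    r[i + 8] = static_cast<uint8_t>(std::min<int>(std::max<int>(b[i], 0), 255));
  }
  return r;
}

// Hypothetical 4:1 compaction: keeps the low byte of every 32-bit lane of
// `a` and `b` in result bytes 0..7. Bytes 8..15 repeat the low half and are
// treated as don't-care in this sketch.
V16u8 compact4to1(const V4i32 &a, const V4i32 &b) {
  // Precondition assumed by this sketch: lanes are zero-extended from 8 bits.
  for (int i = 0; i < 4; ++i)
    assert((a[i] & ~0xFF) == 0 && (b[i] & ~0xFF) == 0);
  V8i16 words = packusdw(a, b);  // 2:1 step, i32 -> i16
  return packuswb(words, words); // 2:1 step, i16 -> i8
}
```

The precondition is what makes the chain equivalent to a byte shuffle: a lane such as 70000 would saturate to 0xFFFF in the first stage, be read as -1 by the second stage, and come out as 0 rather than its low byte.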

This helps avoid PSHUFB (and its mask load) for short shuffle chains. Shuffle combining will still replace the chain with a PSHUFB if we have enough shuffles, as getFauxShuffleMask can recognise PACKSS/PACKUS chains.
12 files changed:
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/avx-fp2int.ll
llvm/test/CodeGen/X86/bitcast-and-setcc-512.ll
llvm/test/CodeGen/X86/masked_store_trunc_ssat.ll
llvm/test/CodeGen/X86/masked_store_trunc_usat.ll
llvm/test/CodeGen/X86/psubus.ll
llvm/test/CodeGen/X86/vec-strict-fptoint-128.ll
llvm/test/CodeGen/X86/vec-strict-fptoint-256.ll
llvm/test/CodeGen/X86/vec_cast2.ll
llvm/test/CodeGen/X86/vector-trunc-packus.ll
llvm/test/CodeGen/X86/vector-trunc-ssat.ll
llvm/test/CodeGen/X86/vector-trunc-usat.ll