[X86] lowerShuffleWithVPMOV - support direct lowering to VPMOV on VLX targets
author Simon Pilgrim <llvm-dev@redking.me.uk>
Thu, 11 Aug 2022 16:35:44 +0000 (17:35 +0100)
committer Simon Pilgrim <llvm-dev@redking.me.uk>
Thu, 11 Aug 2022 16:40:07 +0000 (17:40 +0100)
commit 6ba5fc2deedd1a29126ad784bd000974ef139438
tree c03f953780e9fb59bc56466b03e9c826189528a7
parent dd4c838da30ad4b6d5dc0f700df0a6629469f719

lowerShuffleWithVPMOV currently only matches shuffle(truncate(x)) patterns, but on VLX targets the truncate usually isn't necessary to make the VPMOV node worthwhile (as we're only targeting v16i8/v8i16 shuffles, we almost always end up with a PSHUFB node instead). PACKSS/PACKUS are still preferred over VPMOV due to their lower uop count.
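
For illustration only, the kind of mask this lowering looks for can be sketched in standalone C++. This is not the code in X86ISelLowering.cpp: `isTruncationMask` is a hypothetical helper, the mask is a plain `std::vector<int>` with -1 standing for an undef lane, and the sketch requires the upper lanes to be undef, whereas the real lowering may also accept lanes known to be zero. The idea is that a v16i8 shuffle selecting bytes 0, 2, 4, ... of its source is the same operation as truncating a v8i16 to bytes, which is what a VPMOVWB-style truncate does.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, NOT the LLVM implementation.
// Mask holds one source index per result lane; -1 means "undef" (any value).
// Returns true if the low NumElts/Scale lanes select every Scale-th source
// element (0, Scale, 2*Scale, ...) and all remaining lanes are undef.
bool isTruncationMask(const std::vector<int> &Mask, int Scale) {
  assert(Scale > 1 && "a truncation must drop at least every other element");
  int NumElts = static_cast<int>(Mask.size());
  if (NumElts % Scale != 0)
    return false;
  int NumSrcElts = NumElts / Scale; // lanes that carry truncated data

  for (int i = 0; i < NumElts; ++i) {
    int M = Mask[i];
    if (M < 0)
      continue; // undef lane matches anything
    if (i >= NumSrcElts)
      return false; // assumption: upper lanes must be undef in this sketch
    if (M != i * Scale)
      return false; // low lanes must step through the source by Scale
  }
  return true;
}
```

With this sketch, the v16i8 mask {0,2,4,6,8,10,12,14,-1,...} matches with Scale = 2 (a word-to-byte truncate) and {0,4,8,12,-1,...} matches with Scale = 4 (a dword-to-byte truncate), while an identity mask does not match.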

Fixes the remaining regression introduced by the fixes in rG293899c64b75.
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/avx512-trunc.ll
llvm/test/CodeGen/X86/avx512fp16-cvt-ph-w-vl-intrinsics.ll
llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-2.ll
llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-4.ll
llvm/test/CodeGen/X86/vector-rotate-128.ll