[X86][SSE] Combine v16i8 SHL by constants to multiplies
author Simon Pilgrim <llvm-dev@redking.me.uk>
Sun, 8 Jul 2018 12:47:50 +0000 (12:47 +0000)
committer Simon Pilgrim <llvm-dev@redking.me.uk>
Sun, 8 Jul 2018 12:47:50 +0000 (12:47 +0000)
commit 2eced71ecf8d1bf69e219973f1bd5ac19375bb32
tree 1fef90d258cafdced6f23dab75aba2ed7614d547
parent 1795870bb8628c5858dde7a1a792d7062618d0f7

On targets without AVX512 (which can perform a quick extend/shift/truncate), extending to two v8i16 vectors, multiplying with PMULLW and then truncating is faster than relying on the generic PBLENDVB vXi8 shift path, and it uses a similar amount of mask constant pool data.
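
The identity behind this combine is that a left shift by a constant c equals a multiply by 1 << c once the result is truncated to the lane width. The following is a minimal scalar sketch of that idea, not the code in X86ISelLowering.cpp: the names V16i8, shlDirect and shlViaMultiply are hypothetical, and the 16-bit multiply only models the low-half result that PMULLW produces per lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a v16i8 vector: 16 unsigned 8-bit lanes.
using V16i8 = std::array<std::uint8_t, 16>;

// Reference semantics: per-lane left shift, result truncated to 8 bits.
V16i8 shlDirect(const V16i8 &X, const V16i8 &Amt) {
  V16i8 R{};
  for (std::size_t I = 0; I != R.size(); ++I) {
    assert(Amt[I] < 8 && "shift amount must be below the lane width");
    R[I] = static_cast<std::uint8_t>(X[I] << Amt[I]);
  }
  return R;
}

// Model of the multiply path: widen each lane to 16 bits, multiply by the
// power-of-two constant (1 << Amt), keep the low 16 bits of the product
// (as PMULLW does), then truncate back to 8 bits.
V16i8 shlViaMultiply(const V16i8 &X, const V16i8 &Amt) {
  V16i8 R{};
  for (std::size_t I = 0; I != R.size(); ++I) {
    assert(Amt[I] < 8 && "shift amount must be below the lane width");
    const std::uint16_t Wide = X[I];                    // extend i8 -> i16
    const std::uint16_t Scale =
        static_cast<std::uint16_t>(1u << Amt[I]);       // constant multiplier
    const std::uint16_t Prod =
        static_cast<std::uint16_t>(Wide * Scale);       // low 16 bits only
    R[I] = static_cast<std::uint8_t>(Prod);             // truncate i16 -> i8
  }
  return R;
}
```

Because the shift amounts are constants, the per-lane multipliers can be materialised as a constant vector, which is why this path needs constant pool data comparable to the blend masks of the generic path.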

Differential Revision: https://reviews.llvm.org/D48963

llvm-svn: 336513
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/vector-mul.ll
llvm/test/CodeGen/X86/vector-shift-shl-128.ll
llvm/test/CodeGen/X86/vector-shift-shl-256.ll