author Simon Pilgrim <llvm-dev@redking.me.uk>
Mon, 6 Apr 2015 18:39:00 +0000 (18:39 +0000)
committer Simon Pilgrim <llvm-dev@redking.me.uk>
Mon, 6 Apr 2015 18:39:00 +0000 (18:39 +0000)
commit 49ba9b8e2ffcbe78de5971074a76a60191d59d86
tree 3992764d1f169dcd21c9df44e36188936bafb049
parent 7431fb0257a7c7921f3928a8699adb97b4e22936
[X86][SSE] Use (V)PINSRB for direct byte insertion in 16i8 buildvector on SSE4.1 targets

This patch allows SSE4.1 targets to use (V)PINSRB to create 16i8 vectors by inserting i8 scalars directly into an XMM register, instead of merging pairs of i8 scalars into an i16 and inserting that with the SSE2 PINSRW instruction.

Direct insertion also allows byte loads to be folded into the insert instruction and reduces scalar register usage.
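
The difference between the two strategies can be sketched with a small scalar model. This is an illustration only, not the lowering code from X86ISelLowering.cpp: the XMM register is modelled as a list of 16 bytes, and the helper names (`pinsrw`, `pinsrb`, `build_v16i8_sse2`, `build_v16i8_sse41`) are hypothetical. It assumes the SSE2 path merges each byte pair with a shift and an OR before inserting the resulting word, as the description above suggests.

```python
# Illustrative model only; helper names are hypothetical, not LLVM APIs.
# An XMM register is modelled as a list of 16 byte values (little-endian lanes).

def pinsrw(reg, value16, word_index):
    """Model of PINSRW: write a 16-bit value into word lane 0..7."""
    out = list(reg)
    out[2 * word_index] = value16 & 0xFF             # low byte of the word
    out[2 * word_index + 1] = (value16 >> 8) & 0xFF  # high byte of the word
    return out

def pinsrb(reg, value8, byte_index):
    """Model of PINSRB: write an 8-bit value into byte lane 0..15."""
    out = list(reg)
    out[byte_index] = value8 & 0xFF
    return out

def build_v16i8_sse2(scalars):
    """Assumed SSE2 strategy: merge byte pairs into an i16, then PINSRW."""
    reg = [0] * 16
    for w in range(8):
        lo, hi = scalars[2 * w], scalars[2 * w + 1]
        # Extra scalar work per pair: a shift and an OR in general registers.
        merged = (lo & 0xFF) | ((hi & 0xFF) << 8)
        reg = pinsrw(reg, merged, w)
    return reg

def build_v16i8_sse41(scalars):
    """SSE4.1 strategy: insert each byte directly with PINSRB."""
    reg = [0] * 16
    for i, s in enumerate(scalars):
        reg = pinsrb(reg, s, i)  # no scalar merge step is needed
    return reg
```

Both builders produce the same 16 bytes. In this model the SSE2 path needs a scalar merge for every pair, while the SSE4.1 path passes each byte straight to the insert, which is what makes it possible to take the byte from memory instead of a scalar register.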

Differential Revision: http://reviews.llvm.org/D8839

llvm-svn: 234193
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/vec_cast2.ll
llvm/test/CodeGen/X86/vector-shuffle-128-v16.ll