[X86][SSE] Vectorized v4i32 non-uniform shifts.
author Simon Pilgrim <llvm-dev@redking.me.uk>
Sun, 12 Jul 2015 11:15:19 +0000 (11:15 +0000)
committer Simon Pilgrim <llvm-dev@redking.me.uk>
Sun, 12 Jul 2015 11:15:19 +0000 (11:15 +0000)
commit 64cc4ad0a273fec56debf406c8524d1122d249b9
tree 35e69efafa6d2c3ac1d74bdfa9898949f5fcba34
parent d08eca0181f0d1d21fd7f35fde62eccb509cf5c5

While the v4i32 shl operation is already vectorized using a cvttps2dq/pmulld pattern, the lshr/ashr operations are still scalarized.
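
For context, one way a cvttps2dq/pmulld style shl can work is sketched below in portable scalar C++. This is an illustration under assumptions, not the code in X86ISelLowering.cpp: the name shlViaMul is hypothetical, the shift amounts are assumed to be below 32, and the float-to-integer conversion is done as unsigned so that 2^31 stays well-defined in C++.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using V4 = std::array<uint32_t, 4>;

// Hypothetical helper: per-lane left shift expressed as a multiply by 2^amt.
// Assumes every amount is < 32.
V4 shlViaMul(const V4 &v, const V4 &amts) {
  V4 out{};
  for (int i = 0; i < 4; ++i) {
    // Place the amount in the float exponent field and add the bit
    // pattern of 1.0f, which yields the bit pattern of the float 2^amt.
    uint32_t bits = (amts[i] << 23) + 0x3f800000u;
    float pow2;
    std::memcpy(&pow2, &bits, sizeof pow2);
    // Convert the float back to an integer (the role played by cvttps2dq).
    uint32_t scale = static_cast<uint32_t>(pow2);
    // Multiply, wrapping modulo 2^32 (the role played by pmulld).
    out[i] = v[i] * scale;
  }
  return out;
}
```

The same trick has no direct counterpart for right shifts, since a right shift is not a wrapping multiply.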

This patch adds vectorization support for non-uniform v4i32 shift operations. Constant shift amounts are splatted so that they can use the immediate SSE shift instructions; non-constant shift amounts are extracted and zero-extended. The individual results are then blended together.
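
The shift-then-blend idea described above can be modelled in portable scalar C++ as follows. This is a minimal sketch, not the lowering code itself: the names lshrUniform, ashrUniform and shiftNonUniform are hypothetical, the shift amounts are assumed to be below 32, and the arithmetic shift is built from unsigned operations so the result does not depend on how the compiler shifts signed values.

```cpp
#include <array>
#include <cstdint>

using V4 = std::array<uint32_t, 4>;

// Uniform logical right shift: every lane uses the same amount.
V4 lshrUniform(const V4 &v, uint32_t amt) {
  V4 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = v[i] >> amt;
  return r;
}

// Uniform arithmetic right shift: vacated high bits copy the sign bit.
V4 ashrUniform(const V4 &v, uint32_t amt) {
  V4 r{};
  for (int i = 0; i < 4; ++i) {
    uint32_t fill = (v[i] & 0x80000000u) ? ~(0xFFFFFFFFu >> amt) : 0u;
    r[i] = (v[i] >> amt) | fill;
  }
  return r;
}

// Non-uniform shift: shift the whole vector once per lane amount,
// then blend by keeping only the lane that amount belongs to.
template <typename UniformShift>
V4 shiftNonUniform(const V4 &v, const V4 &amts, UniformShift shift) {
  V4 out{};
  for (int lane = 0; lane < 4; ++lane) {
    V4 whole = shift(v, amts[lane]); // all lanes shifted by amts[lane]
    out[lane] = whole[lane];         // blend step: take the matching lane
  }
  return out;
}
```

For example, shiftNonUniform(v, amts, lshrUniform) performs four uniform shifts and selects one lane from each, which is the shape of work the patch expresses with vector shifts and blends instead of scalarizing each element.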

Differential Revision: http://reviews.llvm.org/D11063

llvm-svn: 241989
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/X86/testshiftashr.ll
llvm/test/Analysis/CostModel/X86/testshiftlshr.ll
llvm/test/CodeGen/X86/vector-shift-ashr-128.ll
llvm/test/CodeGen/X86/vector-shift-ashr-256.ll
llvm/test/CodeGen/X86/vector-shift-lshr-128.ll
llvm/test/CodeGen/X86/vector-shift-lshr-256.ll
llvm/test/CodeGen/X86/widen_load-2.ll