[X86][SSE] Support v16i8/v32i8 vector rotations
author Simon Pilgrim <llvm-dev@redking.me.uk>
Fri, 29 Jun 2018 09:36:39 +0000 (09:36 +0000)
committer Simon Pilgrim <llvm-dev@redking.me.uk>
Fri, 29 Jun 2018 09:36:39 +0000 (09:36 +0000)
commit aab8660e232d932c9adf9190ed9c2f5e0ace1c86
tree a97757c434cc5cadfdf75785c19044cf7ded1fce
parent 564a33a6e85e74e1cb883da75e776de1d8039a69

This uses the same technique as the shift lowering: split the rotation into 4-, 2- and 1-bit partial rotations and select each partial based on the corresponding bit of the rotation amount, making use of PBLENDVB if available. Compared to expanding the rotation into shifts, this halves the number of PBLENDVB instructions, and PBLENDVB can be a slow op.
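
As an illustration only, the per-lane logic can be modelled in scalar C++. This is a sketch, not the code in X86ISelLowering.cpp: the names `rotl8_const` and `rotl8_by_steps` are hypothetical, and moving the amount bits into the sign bit (shift left by 5, then double after each step) is an assumption chosen to mimic how PBLENDVB selects on the top bit of each mask byte.

```cpp
#include <cstdint>

// Hypothetical helper: rotate an 8-bit value left by a fixed amount k (1..7).
static inline uint8_t rotl8_const(uint8_t x, unsigned k) {
  return static_cast<uint8_t>((x << k) | (x >> (8 - k)));
}

// Hypothetical scalar model of one byte lane of a variable rotate-left.
// Assumption: the select mirrors PBLENDVB, which picks per byte on the
// sign bit of the mask, so the amount is pre-shifted to put bit 2 on top.
uint8_t rotl8_by_steps(uint8_t x, uint8_t amt) {
  // Move amount bit 2 into the sign bit; bits 3 and above fall off,
  // which gives the rotation amount modulo 8.
  uint8_t mask = static_cast<uint8_t>(amt << 5);

  // Step 1: rotate by 4 if amount bit 2 is set.
  uint8_t r4 = rotl8_const(x, 4);
  x = (mask & 0x80) ? r4 : x;
  mask = static_cast<uint8_t>(mask + mask); // amount bit 1 -> sign bit

  // Step 2: rotate by 2 if amount bit 1 is set.
  uint8_t r2 = rotl8_const(x, 2);
  x = (mask & 0x80) ? r2 : x;
  mask = static_cast<uint8_t>(mask + mask); // amount bit 0 -> sign bit

  // Step 3: rotate by 1 if amount bit 0 is set.
  uint8_t r1 = rotl8_const(x, 1);
  x = (mask & 0x80) ? r1 : x;

  return x;
}
```

In this model each of the three steps needs one select. Expanding the rotation into a left shift and a right shift that are then ORed together would run the same three-step sequence once per shift, which is where the factor of two in select count comes from.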

Unfortunately I haven't found a decent way to share much of this code with the shift equivalent.

Differential Revision: https://reviews.llvm.org/D48655

llvm-svn: 335957
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/vector-rotate-128.ll
llvm/test/CodeGen/X86/vector-rotate-256.ll
llvm/test/CodeGen/X86/vector-rotate-512.ll