From: Andrea Di Biagio
Date: Thu, 6 Nov 2014 14:36:45 +0000 (+0000)
Subject: [X86] When commuting SSE immediate blend, make sure that the new blend mask is a...
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=7ecd22ca4a205788ce8228d60bf5e67bb06fdfb6;p=platform%2Fupstream%2Fllvm.git

[X86] When commuting SSE immediate blend, make sure that the new blend mask is a valid imm8.

Example:
define <4 x i32> @test(<4 x i32> %a, <4 x i32> %b) {
  %shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32>
  ret <4 x i32> %shuffle
}

Before this change, llc (-mattr=+sse4.1) produced the following assembly instruction:
  pblendw $4294967103, %xmm1, %xmm0

After this change:
  pblendw $63, %xmm1, %xmm0

llvm-svn: 221455
---

diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index dd463f1..a49dcc7 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -2449,7 +2449,8 @@ X86InstrInfo::commuteInstruction(MachineInstr *MI, bool NewMI) const {
     case X86::VPBLENDDYrri: Mask = 0xFF; break;
     case X86::VPBLENDWYrri: Mask = 0xFF; break;
     }
-    unsigned Imm = MI->getOperand(3).getImm();
+    // Only the least significant bits of Imm are used.
+    unsigned Imm = MI->getOperand(3).getImm() & Mask;
     if (NewMI) {
       MachineFunction &MF = *MI->getParent()->getParent();
       MI = MF.CloneMachineInstr(MI);
diff --git a/llvm/test/CodeGen/X86/commuted-blend-mask.ll b/llvm/test/CodeGen/X86/commuted-blend-mask.ll
new file mode 100644
index 0000000..e6322cb
--- /dev/null
+++ b/llvm/test/CodeGen/X86/commuted-blend-mask.ll
@@ -0,0 +1,13 @@
+; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+sse4.1 < %s | FileCheck %s
+
+; When commuting the operands of a SSE blend, make sure that the resulting blend
+; mask can be encoded as a imm8.
+; Before, when commuting the operands to the shuffle in function @test, the backend
+; produced the following assembly:
+;   pblendw $4294967103, %xmm1, %xmm0
+
+define <4 x i32> @test(<4 x i32> %a, <4 x i32> %b) {
+  ;CHECK: pblendw $63, %xmm1, %xmm0
+  %shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32>
+  ret <4 x i32> %shuffle
+}
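
Reviewer note: the two immediates quoted in the commit message can be reproduced with a few lines of standalone arithmetic. The sketch below is not LLVM code. It assumes that the commuted mask is computed as `Mask ^ Imm`, that `Mask` is 0xFF for `pblendw`, and that the original immediate 0xC0 reached this point sign-extended as -64; these assumptions are inferred from the numbers in the message, not stated by the patch. The helper names `commuteMaskOld` and `commuteMaskNew` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical stand-in for the pre-patch behaviour: the immediate is
// truncated to 32 bits but its upper bits are not cleared before the XOR.
constexpr std::uint32_t commuteMaskOld(std::int64_t RawImm, std::uint32_t Mask) {
  return Mask ^ static_cast<std::uint32_t>(RawImm);
}

// Hypothetical stand-in for the patched behaviour: only the bits covered
// by Mask are kept, so the commuted value always fits the mask width.
constexpr std::uint32_t commuteMaskNew(std::int64_t RawImm, std::uint32_t Mask) {
  return Mask ^ (static_cast<std::uint32_t>(RawImm) & Mask);
}

// Assumed input: 0xC0 sign-extended from 8 bits, i.e. -64.
static_assert(commuteMaskOld(-64, 0xFF) == 4294967103u,
              "matches the immediate printed before the patch");
static_assert(commuteMaskNew(-64, 0xFF) == 63u,
              "matches the immediate printed after the patch");
```

Under these assumptions, -64 truncates to 0xFFFFFFC0, and XOR with 0xFF gives 0xFFFFFF3F (4294967103). Masking first leaves 0xC0, and XOR with 0xFF gives 0x3F (63), which is a valid imm8.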