review.tizen.org Git - platform/upstream/llvm.git/commit


author	Simon Pilgrim <llvm-dev@redking.me.uk>
	Sun, 24 Mar 2019 19:06:35 +0000 (19:06 +0000)
committer	Simon Pilgrim <llvm-dev@redking.me.uk>
	Sun, 24 Mar 2019 19:06:35 +0000 (19:06 +0000)
commit	87d4ab8b92e17db517499403eaa2e0b19992fae2
tree	0c973e1dd13a30f2acd11e595f2201c6560f7fd0
parent	6af0363857f5815fb69268198dd55f29c7a3539b

[X86][SSE41] Start shuffle combining from ZERO_EXTEND_VECTOR_INREG (PR40685)

Enable SSE41 ZERO_EXTEND_VECTOR_INREG shuffle combines. For the PMOVZX(PSHUFD(V)) -> UNPCKH(V,0) pattern we reduce the number of shuffles (a port5 bottleneck on Intel) at the expense of creating a zero (pxor v,v) and an extra register move. This is a good trade-off, as these are pretty cheap and in most cases it doesn't increase register pressure.
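To illustrate why the two instruction sequences are interchangeable, here is a minimal scalar sketch of the lane semantics. It is not code from this commit. It assumes the 32-bit to 64-bit variant (PMOVZXDQ / PUNPCKHDQ) and a PSHUFD immediate of 0xEE that moves the upper two lanes into the lower half. The type aliases and helper names (`V4x32`, `V2x64`, `pshufd`, `pmovzxdq`, `punpckhdq`, `bitcast_to_v2i64`, `patterns_agree`) are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar models of 128-bit vectors (lane 0 is the lowest lane).
using V4x32 = std::array<std::uint32_t, 4>;
using V2x64 = std::array<std::uint64_t, 2>;

// PSHUFD model: result lane i takes source lane ((imm >> 2*i) & 3).
V4x32 pshufd(const V4x32& v, unsigned imm) {
    V4x32 r{};
    for (unsigned i = 0; i < 4; ++i) {
        r[i] = v[(imm >> (2 * i)) & 3u];
    }
    return r;
}

// PMOVZXDQ model: zero-extend the two low 32-bit lanes to 64 bits each.
V2x64 pmovzxdq(const V4x32& v) {
    return {v[0], v[1]};
}

// PUNPCKHDQ model: interleave the high halves of a and b.
V4x32 punpckhdq(const V4x32& a, const V4x32& b) {
    return {a[2], b[2], a[3], b[3]};
}

// Reinterpret four 32-bit lanes as two 64-bit lanes (little-endian layout).
V2x64 bitcast_to_v2i64(const V4x32& v) {
    return {
        static_cast<std::uint64_t>(v[0]) | (static_cast<std::uint64_t>(v[1]) << 32),
        static_cast<std::uint64_t>(v[2]) | (static_cast<std::uint64_t>(v[3]) << 32)
    };
}

// Compare the "before" sequence (two shuffles) with the "after" sequence
// (one shuffle plus a zero vector) for a given input.
bool patterns_agree(const V4x32& v) {
    const V4x32 zero{};  // models pxor v,v
    // 0xEE selects source lanes [2, 3, 2, 3].
    const V2x64 before = pmovzxdq(pshufd(v, 0xEE));
    const V2x64 after = bitcast_to_v2i64(punpckhdq(v, zero));
    return before == after;
}
```

Calling `patterns_agree` on any input should return true: interleaving the upper lanes with zeros produces the same bits as shuffling them down and zero-extending, which is what lets the combine replace two shuffles with one.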

This also exposed a missed opportunity to combine to ZERO_EXTEND_VECTOR_INREG with folded loads, even if we're in the float domain.

llvm-svn: 356864

14 files changed:

llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/cast-vsel.ll
llvm/test/CodeGen/X86/combine-pmuldq.ll
llvm/test/CodeGen/X86/combine-shl.ll
llvm/test/CodeGen/X86/pmul.ll
llvm/test/CodeGen/X86/psubus.ll
llvm/test/CodeGen/X86/slow-pmulld.ll
llvm/test/CodeGen/X86/vec_int_to_fp.ll
llvm/test/CodeGen/X86/vector-idiv-udiv-128.ll
llvm/test/CodeGen/X86/vector-pcmp.ll
llvm/test/CodeGen/X86/vector-reduce-umax.ll
llvm/test/CodeGen/X86/vector-reduce-umin.ll
llvm/test/CodeGen/X86/vector-shift-shl-sub128.ll
llvm/test/CodeGen/X86/vector-zext.ll

Domain: System / Toolchain;
