review.tizen.org Git - platform/upstream/llvm.git/commit

author	Simon Pilgrim <llvm-dev@redking.me.uk>
	Sat, 13 Feb 2016 21:54:04 +0000 (21:54 +0000)
committer	Simon Pilgrim <llvm-dev@redking.me.uk>
	Sat, 13 Feb 2016 21:54:04 +0000 (21:54 +0000)
commit	08ba012973c80b098b7441a8ce0ee40194596ed4
tree	60f4ea0c742e0d71c98934c95026554b45724e12
parent	e91793c3a34233ebe347d171621f6e7feaa2962c

[X86][AVX] Lower shuffles as repeated lane shuffles then lane-crossing shuffles

This patch attempts to represent a shuffle as a repeating shuffle (recognisable by is128BitLaneRepeatedShuffleMask) with the source input(s) in their original lanes, followed by a single permutation of the 128-bit lanes to their final destinations.
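The decomposition described above can be sketched in standalone C++. This is a simplified illustration and not the code from X86ISelLowering.cpp: it assumes a unary shuffle (a single input), uses -1 for undef mask elements, and the names `LaneDecomposition` and `decomposeUnaryShuffle` are hypothetical. It looks for an in-lane mask that is identical for every 128-bit lane, plus a record of which source lane each destination lane must receive.

```cpp
#include <optional>
#include <vector>

// Hypothetical result type: a per-lane mask repeated in every lane, plus the
// source lane that each destination lane is taken from (-1 means undef).
struct LaneDecomposition {
  std::vector<int> RepeatMask;
  std::vector<int> LanePerm;
};

// Try to split a unary shuffle mask into (1) a repeated in-lane shuffle with
// the source kept in its original lanes and (2) a whole-lane permutation.
// Returns std::nullopt if the mask does not have that shape.
std::optional<LaneDecomposition>
decomposeUnaryShuffle(const std::vector<int> &Mask, int NumLanes) {
  int NumElts = static_cast<int>(Mask.size());
  if (NumLanes <= 0 || NumElts == 0 || NumElts % NumLanes != 0)
    return std::nullopt;
  int LaneSize = NumElts / NumLanes;

  LaneDecomposition D{std::vector<int>(LaneSize, -1),
                      std::vector<int>(NumLanes, -1)};
  for (int i = 0; i < NumElts; ++i) {
    int M = Mask[i];
    if (M < 0)
      continue; // undef element, places no constraint
    if (M >= NumElts)
      return std::nullopt; // binary shuffles are out of scope for this sketch

    int DstLane = i / LaneSize, DstSlot = i % LaneSize;
    int SrcLane = M / LaneSize, SrcSlot = M % LaneSize;

    // Every element of a destination lane must come from one source lane.
    if (D.LanePerm[DstLane] >= 0 && D.LanePerm[DstLane] != SrcLane)
      return std::nullopt;
    D.LanePerm[DstLane] = SrcLane;

    // The in-lane pattern must be the same in every lane.
    if (D.RepeatMask[DstSlot] >= 0 && D.RepeatMask[DstSlot] != SrcSlot)
      return std::nullopt;
    D.RepeatMask[DstSlot] = SrcSlot;
  }
  return D;
}
```

For example, the v8 mask {5,4,7,6,1,0,3,2} with two lanes splits into the repeated in-lane mask {1,0,3,2} followed by a swap of the two 128-bit lanes, whereas {0,5,2,3,4,1,6,7} is rejected because its first destination lane mixes elements from both source lanes.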

On AVX2 we can additionally attempt to match using 64-bit sub-lane permutation. AVX2 can also now match a similar 'broadcasted' repeating shuffle.
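The 64-bit sub-lane variant can be sketched in the same spirit. Again this is an illustrative assumption rather than the patch's implementation: it handles a unary 256-bit shuffle only, treats -1 as undef, uses the hypothetical names `SubLaneDecomposition` and `decomposeWithSubLanes`, and simply brute-forces which half of the repeated 128-bit pattern feeds each destination sub-lane.

```cpp
#include <array>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical result type: a mask repeated in both 128-bit lanes, then the
// source 64-bit sub-lane (0..3) for each destination sub-lane (-1 = undef).
struct SubLaneDecomposition {
  std::vector<int> RepeatMask;
  std::array<int, 4> SubLanePerm;
};

std::optional<SubLaneDecomposition>
decomposeWithSubLanes(const std::vector<int> &Mask) {
  int NumElts = static_cast<int>(Mask.size());
  if (NumElts < 4 || NumElts % 4 != 0)
    return std::nullopt;
  int SubSize = NumElts / 4;  // elements per 64-bit sub-lane
  int LaneSize = 2 * SubSize; // elements per 128-bit lane

  // Each destination sub-lane may only read from one source 128-bit lane.
  std::array<int, 4> SrcLane{-1, -1, -1, -1};
  for (int i = 0; i < NumElts; ++i) {
    int M = Mask[i];
    if (M < 0)
      continue;
    if (M >= NumElts)
      return std::nullopt; // unary shuffles only in this sketch
    int S = i / SubSize, L = M / LaneSize;
    if (SrcLane[S] >= 0 && SrcLane[S] != L)
      return std::nullopt;
    SrcLane[S] = L;
  }

  // Bit S of Choice selects the low (0) or high (1) half of the repeated
  // lane pattern as the staging area for destination sub-lane S.
  for (unsigned Choice = 0; Choice < 16; ++Choice) {
    std::vector<int> Repeat(LaneSize, -1);
    bool OK = true;
    for (int i = 0; i < NumElts && OK; ++i) {
      if (Mask[i] < 0)
        continue;
      int S = i / SubSize, K = i % SubSize;
      int Half = static_cast<int>((Choice >> S) & 1u);
      int Slot = Half * SubSize + K;
      int SrcSlot = Mask[i] % LaneSize;
      if (Repeat[Slot] >= 0 && Repeat[Slot] != SrcSlot)
        OK = false; // two sub-lanes want different data in the same slot
      else
        Repeat[Slot] = SrcSlot;
    }
    if (!OK)
      continue;
    std::array<int, 4> Perm{-1, -1, -1, -1};
    for (int S = 0; S < 4; ++S)
      if (SrcLane[S] >= 0)
        Perm[S] = SrcLane[S] * 2 + static_cast<int>((Choice >> S) & 1u);
    return SubLaneDecomposition{std::move(Repeat), Perm};
  }
  return std::nullopt;
}
```

With the v8 mask {0,1,4,5,2,3,6,7}, this search returns one valid decomposition: the repeated mask {2,3,0,1} followed by the sub-lane permutation {1,3,0,2}. Other equally valid answers exist (for instance an identity repeated mask with sub-lane permutation {0,2,1,3}); a real lowering would prefer whichever is cheapest, which this sketch does not attempt.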

This patch has several benefits:

* Avoids prematurely matching with lowerVectorShuffleByMerging128BitLanes, which can require both inputs to have their input lanes permuted before shuffling.
* Can replace PERMPS/PERMD instructions. Although these are useful for cross-lane unary shuffling, they require their shuffle mask to be pre-loaded, which increases register pressure.
* Matching the repeating shuffle makes use of a lot of existing shuffle lowering.

There is an outstanding minor AVX1 regression (combine_unneeded_subvector1 in vector-shuffle-combining.ll): what was previously a 128-bit shuffle + subvector splat is now converted to a subvector splat + (2 instruction) 256-bit shuffle. I intend to fix this in a follow-up patch for review.

Differential Revision: http://reviews.llvm.org/D16537

llvm-svn: 260834

llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/avx-splat.ll
llvm/test/CodeGen/X86/avx2-conversions.ll
llvm/test/CodeGen/X86/vector-shuffle-256-v16.ll
llvm/test/CodeGen/X86/vector-shuffle-256-v32.ll
llvm/test/CodeGen/X86/vector-shuffle-256-v4.ll
llvm/test/CodeGen/X86/vector-shuffle-256-v8.ll
llvm/test/CodeGen/X86/vector-shuffle-combining.ll
llvm/test/CodeGen/X86/vector-trunc.ll