AMDGPU: Custom lower vector_shuffle for v4i16/v4f16
author Matt Arsenault <Matthew.Arsenault@amd.com>
Tue, 2 Jul 2019 19:15:45 +0000 (19:15 +0000)
committer Matt Arsenault <Matthew.Arsenault@amd.com>
Tue, 2 Jul 2019 19:15:45 +0000 (19:15 +0000)
commit 5fe851b6cd90ceaa4cf468d9403b3920e1d0ae15
tree d5be01bc00f8101862c7d350c6e7c22028de38b9
parent e6768d613adb3313722d02aa5740f33ab11b60fe

Ordinarily it is lowered as a build_vector of each extract_vector_elt,
which in turn gets lowered to bitcasts and bit shifts. Very little
understands the lowered extract pattern, resulting in much worse
code. We treat concat_vectors of v2i16 as legal, so prefer that.
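
A minimal sketch of the idea, written as a standalone C++ model rather than
the code in SIISelLowering.cpp. The function names (describeHalf,
describeShuffle) and the printed node spellings are hypothetical, and the
rule for when a half can be taken as a whole v2i16 subvector (an even first
index followed by its neighbor) is an assumption about how such a lowering
could work. Mask indices 0-3 select from the first source "a" and 4-7 from
the second source "b"; undef (negative) indices are not modeled.

```cpp
#include <array>
#include <string>

// Hypothetical model: describe how one 2-element half of the result
// could be produced from the two 4-element sources.
std::string describeHalf(int Idx0, int Idx1) {
  auto Src = [](int Idx) { return std::string(Idx < 4 ? "a" : "b"); };
  auto Lane = [](int Idx) { return std::to_string(Idx % 4); };

  // An aligned, contiguous pair can be taken as one v2i16 subvector.
  if (Idx0 % 2 == 0 && Idx1 == Idx0 + 1)
    return "extract_subvector(" + Src(Idx0) + ", " + Lane(Idx0) + ")";

  // Otherwise assemble the half from two individual elements.
  return "build_vector(extract_elt(" + Src(Idx0) + ", " + Lane(Idx0) +
         "), extract_elt(" + Src(Idx1) + ", " + Lane(Idx1) + "))";
}

// Join the two v2i16 halves with concat_vectors, which the commit
// message says is treated as legal.
std::string describeShuffle(const std::array<int, 4> &Mask) {
  return "concat_vectors(" + describeHalf(Mask[0], Mask[1]) + ", " +
         describeHalf(Mask[2], Mask[3]) + ")";
}
```

Under these assumptions, a mask of {2, 3, 4, 5} needs no per-element
extracts at all, while a mask such as {1, 4, 2, 3} falls back to element
extracts only for its first half.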

llvm-svn: 364959
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.h
llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll