[AArch64] Turn truncating buildvectors into truncates
author David Green <david.green@arm.com>
Mon, 7 Mar 2022 09:42:54 +0000 (09:42 +0000)
committer David Green <david.green@arm.com>
Mon, 7 Mar 2022 09:42:54 +0000 (09:42 +0000)
commit d9633d149022054bdac90bd3d03a240dbdb46f7e
tree b1d2eb584d8107d43c4a759db12e4e8f6dcd7191
parent c74c344263034edda867845c2421f4f0cef03107

When lowering a large v16f32->v16i8 fp_to_si_sat, the fp_to_si_sat node is
split several times, creating an illegal v4i8 concat that gets expanded
into a BUILD_VECTOR. After some combining and other legalisation, it
ends up as a buildvector that extracts from 4 vectors, looking like
BUILDVECTOR(a0,a1,a2,a3,b0,b1,b2,b3,c0,c1,c2,c3,d0,d1,d2,d3). That is
really a v16i32->v16i8 truncate in disguise.
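
As an illustration only (not code from this patch), the equivalence can be
modelled on plain arrays. The type aliases and the function names
buildVectorOfExtracts and truncOfConcat are hypothetical, and unsigned
lanes are used so that narrowing is plain modular truncation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4i32 = std::array<std::uint32_t, 4>;
using V16i32 = std::array<std::uint32_t, 16>;
using V16i8 = std::array<std::uint8_t, 16>;

// Models BUILDVECTOR(a0..a3, b0..b3, c0..c3, d0..d3), where each extracted
// 32-bit lane is implicitly truncated to 8 bits.
V16i8 buildVectorOfExtracts(const V4i32 &a, const V4i32 &b, const V4i32 &c,
                            const V4i32 &d) {
  const V4i32 *srcs[4] = {&a, &b, &c, &d};
  V16i8 out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = static_cast<std::uint8_t>((*srcs[i / 4])[i % 4]);
  return out;
}

// Models trunc(concat(a, b, c, d)): a v16i32 -> v16i8 truncate.
V16i8 truncOfConcat(const V4i32 &a, const V4i32 &b, const V4i32 &c,
                    const V4i32 &d) {
  const V4i32 *srcs[4] = {&a, &b, &c, &d};
  V16i32 wide{};
  for (std::size_t v = 0; v < 4; ++v)
    for (std::size_t l = 0; l < 4; ++l)
      wide[v * 4 + l] = (*srcs[v])[l];
  V16i8 out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = static_cast<std::uint8_t>(wide[i]);
  return out;
}
```

Both functions return the same sixteen bytes for any inputs, which is why
the buildvector can be treated as a truncate.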

This adds a ReconstructTruncateFromBuildVector method to detect the
pattern, converting it back into the legal "concat(trunc(concat(trunc(a),
trunc(b))), trunc(concat(trunc(c), trunc(d))))" tree. The extracted
nodes could also be v4i16, in which case the inner truncates are not needed.
All those truncates and concats then become uzp1 instructions, which is much
better than expanding by moving vector lanes around.

Differential Revision: https://reviews.llvm.org/D119469
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll
llvm/test/CodeGen/AArch64/fptoui-sat-vector.ll
llvm/test/CodeGen/AArch64/neon-extracttruncate.ll