[ARM] Match dual lane vmovs from insert_vector_elt
authorDavid Green <david.green@arm.com>
Tue, 15 Dec 2020 15:58:52 +0000 (15:58 +0000)
committerDavid Green <david.green@arm.com>
Tue, 15 Dec 2020 15:58:52 +0000 (15:58 +0000)
commit6cc3d80a84884a79967fffa4596c14001b8ba8a3
treeca8311687a4f5b63042e76c143651d5fe1b86c70
parentbda7d0af970718c243d93b22a8449c20156e574f
[ARM] Match dual lane vmovs from insert_vector_elt

MVE has a dual lane vector move instruction, capable of moving two
general purpose registers into lanes of a vector register. They look
like one of:
  vmov q0[2], q0[0], r2, r0
  vmov q0[3], q0[1], r3, r1
They only accept these lane indices though (and only insert into an
i32), either moving lanes 1 and 3, or 0 and 2.

This patch adds some tablegen patterns for them, selecting from vector
inserts elements. Because the insert_elements are know to be
canonicalized to ascending order there are several patterns that we need
to select. These lane indices are:

3 2 1 0    -> vmovqrr 31; vmovqrr 20
3 2 1      -> vmovqrr 31; vmov 2
3 1        -> vmovqrr 31
2 1 0      -> vmovqrr 20; vmov 1
2 0        -> vmovqrr 20

With the top one being the most common. All other potential patterns of
lane indices will be matched by a combination of these and the
individual vmov pattern already present. This does mean that we are
selecting several machine instructions at once due to the need to
re-arrange the inserts, but in this case there is nothing else that will
attempt to match an insert_vector_elt node.

Differential Revision: https://reviews.llvm.org/D92553
54 files changed:
llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
llvm/lib/Target/ARM/ARMInstrMVE.td
llvm/test/CodeGen/Thumb2/active_lane_mask.ll
llvm/test/CodeGen/Thumb2/mve-abs.ll
llvm/test/CodeGen/Thumb2/mve-div-expand.ll
llvm/test/CodeGen/Thumb2/mve-gather-increment.ll
llvm/test/CodeGen/Thumb2/mve-gather-ind32-unscaled.ll
llvm/test/CodeGen/Thumb2/mve-gather-ind8-unscaled.ll
llvm/test/CodeGen/Thumb2/mve-gather-ptrs.ll
llvm/test/CodeGen/Thumb2/mve-gather-scatter-opt.ll
llvm/test/CodeGen/Thumb2/mve-masked-ldst.ll
llvm/test/CodeGen/Thumb2/mve-minmax.ll
llvm/test/CodeGen/Thumb2/mve-neg.ll
llvm/test/CodeGen/Thumb2/mve-phireg.ll
llvm/test/CodeGen/Thumb2/mve-pred-and.ll
llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
llvm/test/CodeGen/Thumb2/mve-pred-ext.ll
llvm/test/CodeGen/Thumb2/mve-pred-loadstore.ll
llvm/test/CodeGen/Thumb2/mve-pred-not.ll
llvm/test/CodeGen/Thumb2/mve-pred-or.ll
llvm/test/CodeGen/Thumb2/mve-pred-shuffle.ll
llvm/test/CodeGen/Thumb2/mve-pred-xor.ll
llvm/test/CodeGen/Thumb2/mve-satmul-loops.ll
llvm/test/CodeGen/Thumb2/mve-saturating-arith.ll
llvm/test/CodeGen/Thumb2/mve-scatter-ind8-unscaled.ll
llvm/test/CodeGen/Thumb2/mve-sext.ll
llvm/test/CodeGen/Thumb2/mve-shifts.ll
llvm/test/CodeGen/Thumb2/mve-simple-arith.ll
llvm/test/CodeGen/Thumb2/mve-soft-float-abi.ll
llvm/test/CodeGen/Thumb2/mve-vabdus.ll
llvm/test/CodeGen/Thumb2/mve-vcmp.ll
llvm/test/CodeGen/Thumb2/mve-vcmpr.ll
llvm/test/CodeGen/Thumb2/mve-vcmpz.ll
llvm/test/CodeGen/Thumb2/mve-vcreate.ll
llvm/test/CodeGen/Thumb2/mve-vcvt.ll
llvm/test/CodeGen/Thumb2/mve-vdup.ll
llvm/test/CodeGen/Thumb2/mve-vecreduce-add.ll
llvm/test/CodeGen/Thumb2/mve-vecreduce-addpred.ll
llvm/test/CodeGen/Thumb2/mve-vecreduce-mla.ll
llvm/test/CodeGen/Thumb2/mve-vecreduce-mlapred.ll
llvm/test/CodeGen/Thumb2/mve-vld2-post.ll
llvm/test/CodeGen/Thumb2/mve-vld2.ll
llvm/test/CodeGen/Thumb2/mve-vld3.ll
llvm/test/CodeGen/Thumb2/mve-vld4-post.ll
llvm/test/CodeGen/Thumb2/mve-vld4.ll
llvm/test/CodeGen/Thumb2/mve-vmulh.ll
llvm/test/CodeGen/Thumb2/mve-vmull-loop.ll
llvm/test/CodeGen/Thumb2/mve-vqdmulh.ll
llvm/test/CodeGen/Thumb2/mve-vqmovn.ll
llvm/test/CodeGen/Thumb2/mve-vqshrn.ll
llvm/test/CodeGen/Thumb2/mve-vst2.ll
llvm/test/CodeGen/Thumb2/mve-vst3.ll
llvm/test/CodeGen/Thumb2/mve-vst4.ll
llvm/test/CodeGen/Thumb2/mve-widen-narrow.ll