arm: Auto-vectorization for MVE: add pack/unpack patterns

This patch adds the vec_unpack<US>_hi_<mode>, vec_unpack<US>_lo_<mode>
and vec_pack_trunc_<mode> patterns for MVE.

It does so by moving the unpack patterns from neon.md to
vec-common.md and adding MVE support to them. The pack expander is
derived from the Neon one (which in turn is renamed to
neon_quad_vec_pack_trunc_<mode>).

The patch introduces mve_vec_unpack<US>_lo_<mode> and
mve_vec_unpack<US>_hi_<mode>, which are similar to their Neon
counterparts except for the assembly syntax.

The patch also introduces mve_vec_pack_trunc_lo_<mode> to avoid the need
for a zero-initialized temporary, which would be needed if the
vec_pack_trunc_<mode> expander called @mve_vmovn[bt]q_<supf><mode>
instead.
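
The temporary would presumably be needed because the narrowing moves
write only half of the destination lanes and keep the other half: vmovnb
fills the even (bottom) 16-bit lanes and vmovnt the odd (top) ones, so the
first of the two reads a destination that nothing has initialized yet. The
scalar C sketch below models that behaviour together with the vmovlb/vmovlt
widening used for unpack. The model_* names, the unsigned lane types and the
shift-by-one operation are illustrative assumptions, not part of the patch.

```c
#include <stdint.h>

#define LANES16 8   /* 16-bit lanes in a 128-bit Q register */
#define LANES32 4   /* 32-bit lanes in a 128-bit Q register */

/* Model of vmovlb.u16: zero-extend the even (bottom) 16-bit lanes.  */
static void model_vmovlb_u16(uint32_t dst[LANES32], const uint16_t src[LANES16])
{
  for (int i = 0; i < LANES32; i++)
    dst[i] = src[2 * i];
}

/* Model of vmovlt.u16: zero-extend the odd (top) 16-bit lanes.  */
static void model_vmovlt_u16(uint32_t dst[LANES32], const uint16_t src[LANES16])
{
  for (int i = 0; i < LANES32; i++)
    dst[i] = src[2 * i + 1];
}

/* Model of vmovnb.i32: truncate each 32-bit lane into the even 16-bit
   lane of dst.  The odd lanes of dst keep their previous value.  */
static void model_vmovnb_i32(uint16_t dst[LANES16], const uint32_t src[LANES32])
{
  for (int i = 0; i < LANES32; i++)
    dst[2 * i] = (uint16_t) src[i];
}

/* Model of vmovnt.i32: truncate each 32-bit lane into the odd 16-bit
   lane of dst.  The even lanes of dst keep their previous value.  */
static void model_vmovnt_i32(uint16_t dst[LANES16], const uint32_t src[LANES32])
{
  for (int i = 0; i < LANES32; i++)
    dst[2 * i + 1] = (uint16_t) src[i];
}

/* Hypothetical example: unpack, shift each 32-bit lane left by one,
   then pack back into 16-bit lanes.  */
static void model_shl1_u16(uint16_t dst[LANES16], const uint16_t src[LANES16])
{
  uint32_t lo[LANES32], hi[LANES32];

  model_vmovlb_u16(lo, src);
  model_vmovlt_u16(hi, src);
  for (int i = 0; i < LANES32; i++)
    {
      lo[i] <<= 1;
      hi[i] <<= 1;
    }
  model_vmovnb_i32(dst, lo);   /* writes the even lanes only */
  model_vmovnt_i32(dst, hi);   /* fills in the odd lanes */
}
```

In this model the even/odd split made by the widening moves is undone by
the matching narrowing moves, so an element-wise operation in between sees
every input lane exactly once and each result lands back in its original
position.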

With this patch, we can now vectorize the 16- and 8-bit versions of
vclz and vshl, although the generated code could still be improved.

For test_clz_s16, we now generate:
	vldrh.16 q3, [r1]
	vmovlb.s16 q2, q3
	vmovlt.s16 q3, q3
	vclz.i32 q2, q2
	vclz.i32 q3, q3
	vmovnb.i32 q1, q2
	vmovnt.i32 q1, q3
	vstrh.16 q1, [r0]
which could be improved to:
	vldrh.16 q3, [r1]
	vclz.i16 q1, q3
	vstrh.16 q1, [r0]
if we could avoid the need for unpack/pack steps.

For reference, clang-12 generates:
	vldrh.s32 q0, [r1]
	vldrh.s32 q1, [r1, #8]
	vclz.i32 q0, q0
	vstrh.32 q0, [r0]
	vclz.i32 q0, q1
	vstrh.32 q0, [r0, #8]

2021-06-11  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/mve.md (mve_vec_unpack<US>_lo_<mode>): New pattern.
	(mve_vec_unpack<US>_hi_<mode>): New pattern.
	(@mve_vec_pack_trunc_lo_<mode>): New pattern.
	(mve_vmovntq_<supf><mode>): Prefix with '@'.
	* config/arm/neon.md (vec_unpack<US>_hi_<mode>): Move to
	vec-common.md.
	(vec_unpack<US>_lo_<mode>): Likewise.
	(vec_pack_trunc_<mode>): Rename to
	neon_quad_vec_pack_trunc_<mode>.
	* config/arm/vec-common.md (vec_unpack<US>_hi_<mode>): New
	pattern.
	(vec_unpack<US>_lo_<mode>): New.
	(vec_pack_trunc_<mode>): New.

	gcc/testsuite/
	* gcc.target/arm/simd/mve-vclz.c: Update expected results.
	* gcc.target/arm/simd/mve-vshl.c: Likewise.
	* gcc.target/arm/simd/mve-vec-pack.c: New test.
	* gcc.target/arm/simd/mve-vec-unpack.c: New test.