aarch64: Use memcpy to copy structures in vst4[q]_lane intrinsics
author Jonathan Wright <jonathan.wright@arm.com>
Thu, 29 Jul 2021 11:24:17 +0000 (12:24 +0100)
committer Jonathan Wright <jonathan.wright@arm.com>
Fri, 6 Aug 2021 10:01:52 +0000 (11:01 +0100)
commit a6075926947be9bcbf7016bf4b29f549102ad91d
tree 4f55681dc74b7cd440adea2cb8c03b35b1156dac
parent 318113a961220c8da79d8d29619138827ccc69f1

Use __builtin_memcpy to copy vector structures instead of using a
union - or constructing a new opaque structure one vector at a time -
in each of the vst4[q]_lane Neon intrinsics in arm_neon.h.
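
As a rough illustration of the two copying styles, here is a minimal,
portable sketch. It is not the arm_neon.h code: vec128_t, vec128x4_t
and opaque_xi_t are hypothetical stand-ins for a Neon vector, a
four-vector structure and the opaque __builtin_aarch64_simd_xi type,
and plain memcpy is used where the header uses __builtin_memcpy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real arm_neon.h types. */
typedef struct { int32_t lane[4]; } vec128_t;             /* plays a 128-bit vector */
typedef struct { vec128_t val[4]; } vec128x4_t;           /* plays a 4-vector structure */
typedef struct { unsigned char bytes[64]; } opaque_xi_t;  /* plays the opaque XI type */

/* Both copies below rely on the two layouts having the same size. */
static_assert(sizeof(vec128x4_t) == sizeof(opaque_xi_t),
              "structure and opaque type must have the same size");

/* Old style: reinterpret the structure through a union. */
static opaque_xi_t pack_with_union(vec128x4_t v)
{
  union { vec128x4_t in; opaque_xi_t out; } u = { v };
  return u.out;
}

/* New style: copy the whole structure in one memcpy call. */
static opaque_xi_t pack_with_memcpy(vec128x4_t v)
{
  opaque_xi_t out;
  memcpy(&out, &v, sizeof(v));
  return out;
}
```

Both helpers produce the same bytes. The sketch only shows the source
shape of the change and says nothing about the code GCC generates for
the real intrinsics.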

Add new code generation tests to verify that superfluous move
instructions are not generated for the vst4q_lane intrinsics.

gcc/ChangeLog:

2021-07-29  Jonathan Wright  <jonathan.wright@arm.com>

* config/aarch64/arm_neon.h (__ST4_LANE_FUNC): Delete.
(__ST4Q_LANE_FUNC): Delete.
(vst4_lane_f16): Use __builtin_memcpy to copy vector
structure instead of constructing __builtin_aarch64_simd_xi
one vector at a time.
(vst4_lane_f32): Likewise.
(vst4_lane_f64): Likewise.
(vst4_lane_p8): Likewise.
(vst4_lane_p16): Likewise.
(vst4_lane_p64): Likewise.
(vst4_lane_s8): Likewise.
(vst4_lane_s16): Likewise.
(vst4_lane_s32): Likewise.
(vst4_lane_s64): Likewise.
(vst4_lane_u8): Likewise.
(vst4_lane_u16): Likewise.
(vst4_lane_u32): Likewise.
(vst4_lane_u64): Likewise.
(vst4_lane_bf16): Likewise.
(vst4q_lane_f16): Use __builtin_memcpy to copy vector
structure instead of using a union.
(vst4q_lane_f32): Likewise.
(vst4q_lane_f64): Likewise.
(vst4q_lane_p8): Likewise.
(vst4q_lane_p16): Likewise.
(vst4q_lane_p64): Likewise.
(vst4q_lane_s8): Likewise.
(vst4q_lane_s16): Likewise.
(vst4q_lane_s32): Likewise.
(vst4q_lane_s64): Likewise.
(vst4q_lane_u8): Likewise.
(vst4q_lane_u16): Likewise.
(vst4q_lane_u32): Likewise.
(vst4q_lane_u64): Likewise.
(vst4q_lane_bf16): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vector_structure_intrinsics.c: Add new
tests.

gcc/config/aarch64/arm_neon.h
gcc/testsuite/gcc.target/aarch64/vector_structure_intrinsics.c