aarch64: Use memcpy to copy vector tables in vtbx4 intrinsics
author Jonathan Wright <jonathan.wright@arm.com>
Thu, 8 Jul 2021 22:27:54 +0000 (23:27 +0100)
committer Jonathan Wright <jonathan.wright@arm.com>
Fri, 23 Jul 2021 11:15:02 +0000 (12:15 +0100)
commit 4848e283ccaed451ddcc38edcb9f5ce9e9f2d7eb
tree 4ae2831d01b01fc7c3e1ec6865dd55da75e2e8e7
parent f2f04d8b9d1f5d4fc8c3a17c7fa5ac518574f2df

Use __builtin_memcpy to copy vector structures instead of building
a new opaque structure one vector at a time in each of the vtbx4
Neon intrinsics in arm_neon.h. This simplifies the header file and
also improves code generation: superfluous move instructions were
previously emitted for every register extraction/set in this
additional structure.
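
The difference between the two approaches can be sketched in portable
C. This is only an illustration, not the code from arm_neon.h: the
types table4x8_t and opaque_oi_t and the functions pack_per_vector and
pack_memcpy are hypothetical stand-ins for the Neon vector-table type
and the opaque builtin type, and plain memcpy stands in for
__builtin_memcpy so that the sketch builds on any target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real arm_neon.h types.  */
typedef struct { uint8_t val[4][8]; } table4x8_t;  /* four 64-bit vectors */
typedef struct { uint8_t q[2][16]; } opaque_oi_t;  /* two 128-bit registers */

/* Both layouts are 32 contiguous bytes, which is what makes a plain
   byte copy a valid replacement for the element-wise construction.  */
static_assert (sizeof (opaque_oi_t) == sizeof (table4x8_t),
               "layouts must match in size");

/* Old style: build the opaque structure one vector at a time.  */
opaque_oi_t
pack_per_vector (table4x8_t tab)
{
  opaque_oi_t o;
  for (int i = 0; i < 4; i++)
    for (int j = 0; j < 8; j++)
      o.q[i / 2][(i % 2) * 8 + j] = tab.val[i][j];
  return o;
}

/* New style: copy the whole vector table in one go.  The commit uses
   __builtin_memcpy; memcpy is used here for portability.  */
opaque_oi_t
pack_memcpy (table4x8_t tab)
{
  opaque_oi_t o;
  memcpy (&o, &tab, sizeof tab);
  return o;
}
```

Under these assumptions both functions produce byte-identical results;
the memcpy form simply gives the compiler a single copy to optimise
rather than a sequence of per-vector insertions.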

gcc/ChangeLog:

2021-07-19  Jonathan Wright  <jonathan.wright@arm.com>

	* config/aarch64/arm_neon.h (vtbx4_s8): Use __builtin_memcpy
	instead of constructing __builtin_aarch64_simd_oi one vector
	at a time.
	(vtbx4_u8): Likewise.
	(vtbx4_p8): Likewise.
gcc/config/aarch64/arm_neon.h