arm: Replace arm_builtin_vectorized_function [PR106253]
author: Richard Sandiford <richard.sandiford@arm.com>
Mon, 18 Jul 2022 11:57:10 +0000 (12:57 +0100)
committer: Richard Sandiford <richard.sandiford@arm.com>
Mon, 18 Jul 2022 11:57:10 +0000 (12:57 +0100)
commit: 7313381d2ce44b72b4c9f70bd5670e5d78d1f631
tree: 9df6d1d1217e63a819687b13d479466f92fde366
parent: 9c8349ee1a35dac61b84bbae115ee6a1eeb6ddbd

This patch extends the fix for PR106253 to AArch32.  As with AArch64,
we were using ACLE intrinsics to vectorise scalar built-ins, even
though the two sometimes have different ECF_* flags.  (That in turn
is because the ACLE intrinsics should follow the instruction semantics
as closely as possible, whereas the scalar built-ins follow language
specs.)

The patch also removes the copysignf built-in, which only existed
for this purpose and wasn't a “real” arm_neon.h built-in.

Doing this also has the side effect of enabling vectorisation of
rint and roundeven.  Logically that should be a separate patch,
but splitting it out would have meant adding a new int iterator
for the original set of instructions and then removing it again
when adding the new functions.

I've restricted the bswap tests to little-endian because we end
up with excessive spilling on big-endian.  E.g.:

        sub     sp, sp, #8
        vstr    d1, [sp]
        vldr    d16, [sp]
        vrev16.8        d16, d16
        vstr    d16, [sp]
        vldr    d0, [sp]
        add     sp, sp, #8
        @ sp needed
        bx      lr
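
For illustration only, here is a minimal sketch of the kind of scalar
source that can end up as a vrev16.8 (which reverses the bytes within
each 16-bit element).  The function name and loop shape are hypothetical
and are not taken from the new tests; whether a given compiler actually
vectorises the loop depends on the target and options.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch's tests: byte-swap
   four 16-bit elements.  On a Neon target the vectoriser may turn the
   loop into a single 64-bit vector operation, but that is not
   guaranteed.  */
void
swap_bytes_u16x4 (uint16_t *restrict dst, const uint16_t *restrict src)
{
  for (int i = 0; i < 4; ++i)
    dst[i] = __builtin_bswap16 (src[i]);
}
```

The result is the same whether or not the loop is vectorised: each
element has its two bytes exchanged.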

Similarly, the copysign tests require little-endian because on
big-endian we unnecessarily load the constant from the constant pool:

        vldr.32 s15, .L3
        vdup.32 d0, d7[1]
        vbsl    d0, d2, d1
        bx      lr
.L3:
        .word   -2147483648
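
The constant above, -2147483648, is 0x80000000, i.e. a mask with only
the sign bit of a 32-bit float set; vbsl then uses it to take the sign
bit from one operand and the remaining bits from the other.  As a rough
sketch of source that maps onto this operation (the function name is
hypothetical and not taken from the new tests):

```c
/* Hypothetical helper, not taken from the patch's tests: give each
   element of MAG the sign of the corresponding element of SGN.  With
   the copysign<mode>3 pattern available, the vectoriser may implement
   the loop with a bitwise select on the sign-bit mask.  */
void
copy_signs_f32x2 (float *restrict dst, const float *restrict mag,
		  const float *restrict sgn)
{
  for (int i = 0; i < 2; ++i)
    dst[i] = __builtin_copysignf (mag[i], sgn[i]);
}
```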

gcc/
	PR target/106253
	* config/arm/arm-builtins.cc (arm_builtin_vectorized_function):
	Delete.
	* config/arm/arm-protos.h (arm_builtin_vectorized_function): Delete.
	* config/arm/arm.cc (TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION):
	Delete.
	* config/arm/arm_neon_builtins.def (copysignf): Delete.
	* config/arm/iterators.md (nvrint_pattern): New attribute.
	* config/arm/neon.md (<NEON_VRINT:nvrint_pattern><VCVTF:mode>2):
	New pattern.
	(l<NEON_VCVT:nvrint_pattern><su_optab><VCVTF:mode><v_cmp_result>2):
	Likewise.
	(neon_copysignf<mode>): Rename to...
	(copysign<mode>3): ...this.

gcc/testsuite/
	PR target/106253
	* gcc.target/arm/vect_unary_1.c: New test.
	* gcc.target/arm/vect_binary_1.c: Likewise.
gcc/config/arm/arm-builtins.cc
gcc/config/arm/arm-protos.h
gcc/config/arm/arm.cc
gcc/config/arm/arm_neon_builtins.def
gcc/config/arm/iterators.md
gcc/config/arm/neon.md
gcc/testsuite/gcc.target/arm/vect_binary_1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/arm/vect_unary_1.c [new file with mode: 0644]