review.tizen.org Git - test

AArch32: correct dot-product RTL patterns.

The previous fix for this problem was wrong due to a subtle difference between
where NEON expects the RMW values and where the intrinsics expect them.

The insn pattern is modeled after the intrinsics and so needs an expand for
the vectorizer optab to switch the RTL.

However, operands[3] is not expected to be written to, so the current pattern is
bogus.

Instead we use the expand to shuffle around the RTL.

The vectorizer expects operands[3] and operands[0] to be
the same but the aarch64 intrinsics expanders expect operands[0] and
operands[1] to be the same.
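The difference in operand order can be sketched with a scalar stand-in for one
32-bit lane of a signed dot product.  This is only an illustration, not code
from the patch: the names dot_acc_first and dot_acc_last are hypothetical, and
the wrapper plays the role the expand plays in the RTL, namely reordering the
operands without changing the computation.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for one 32-bit lane of a signed dot product.
   Intrinsics-style order: the accumulator comes first, then the two
   multiplicands.  */
static int32_t
dot_acc_first (int32_t acc, const int8_t a[4], const int8_t b[4])
{
  for (int i = 0; i < 4; i++)
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}

/* Hypothetical stand-in for the vectorizer-style order: the two
   multiplicands come first and the accumulator last.  It only shuffles
   the arguments and forwards to the accumulator-first form.  */
static int32_t
dot_acc_last (const int8_t a[4], const int8_t b[4], int32_t acc)
{
  return dot_acc_first (acc, a, b);
}
```

Both entry points compute the same value; only the position of the
accumulator differs.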

This also fixes some issues with big-endian.  Each dot product performs four
8-bit multiplications.  However, unlike AArch64, AArch32 does not use GCC lane
indexing for lanes, aside from loads/stores.  This means no lane remappings
are done in arm-builtins.c and so none should be done on the instruction side.
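As a rough scalar model of what a 128-bit unsigned dot product by lane
computes, assuming the usual semantics where the lane index selects one group
of four 8-bit elements from the second multiplicand: the function name
udot_laneq_model is hypothetical and the code is not part of the patch.  Lanes
are indexed in plain array order with no remapping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a 128-bit unsigned dot product by lane.
   R holds four 32-bit accumulators, A holds sixteen 8-bit elements, and
   LANE (0..3) selects one group of four 8-bit elements from B.  Each
   accumulator receives the sum of four 8-bit by 8-bit products.  */
static void
udot_laneq_model (uint32_t r[4], const uint8_t a[16], const uint8_t b[16],
                  int lane)
{
  assert (lane >= 0 && lane < 4);

  /* The selected group is reused for every 32-bit result lane.  */
  const uint8_t *group = &b[4 * lane];

  for (int i = 0; i < 4; i++)
    {
      uint32_t sum = 0;
      for (int j = 0; j < 4; j++)
        sum += (uint32_t) a[4 * i + j] * (uint32_t) group[j];
      r[i] += sum;
    }
}
```

With A holding 1..16 and the selected group of B holding all ones, each
accumulator grows by the sum of its own four elements of A.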

There are some other instructions that need inspection as I think there are
more incorrect ones.

Third, there was a bug in the ACLE specification for dot product which has now
been fixed[1].  This means some intrinsics were missing and are added by this
patch.

Bootstrapped and regtested on arm-none-linux-gnueabihf and no issues.

Ok for master?  And for active branches after some stew?

[1] https://github.com/ARM-software/acle/releases/tag/r2021Q3

gcc/ChangeLog:

* config/arm/arm_neon.h (vdot_laneq_u32, vdotq_laneq_u32,
vdot_laneq_s32, vdotq_laneq_s32): New.
* config/arm/arm_neon_builtins.def (sdot_laneq, udot_laneq): New.
* config/arm/neon.md (neon_<sup>dot<vsi2qi>): New.
(<sup>dot_prod<vsi2qi>): Re-order rtl.
(neon_<sup>dot_lane<vsi2qi>): Fix rtl order and endianness.
(neon_<sup>dot_laneq<vsi2qi>): New.

gcc/testsuite/ChangeLog:

* gcc.target/arm/simd/vdot-compile.c: Add new cases.
* gcc.target/arm/simd/vdot-exec.c: Likewise.

author	Tamar Christina <tamar.christina@arm.com>
	Mon, 7 Feb 2022 12:54:42 +0000 (12:54 +0000)
committer	Tamar Christina <tamar.christina@arm.com>
	Mon, 7 Feb 2022 12:56:37 +0000 (12:56 +0000)
commit	12aae3b93aeae50f5ced1bbef57fe207ecd12930
tree	ef60abe0688fe7e9f66ea5c6ecf21ebe83cc9742
parent	db95441cf5399aabc46ca83df19f7290c3e23cb1

gcc/config/arm/arm_neon.h
gcc/config/arm/arm_neon_builtins.def
gcc/config/arm/neon.md
gcc/testsuite/gcc.target/arm/simd/vdot-compile.c
gcc/testsuite/gcc.target/arm/simd/vdot-exec.c