aarch64: Add more vec_combine patterns

vec_combine is really one instruction on aarch64, provided that
the lowpart element is in the same register as the destination
vector. This patch adds patterns for that.

The patch fixes a regression from GCC 8. Before the patch:

int64x2_t s64q_1(int64_t a0, int64_t a1) {
  if (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
    return (int64x2_t) { a1, a0 };
  else
    return (int64x2_t) { a0, a1 };
}

generated:

        fmov    d0, x0
        ins     v0.d[1], x1
        ins     v0.d[1], x1
        ret

whereas GCC 8 generated the more respectable:

        dup     v0.2d, x0
        ins     v0.d[1], x1
        ret

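To try the comparison outside the testsuite, a minimal sketch of the
same function follows. It assumes the GCC/Clang vector extension in
place of arm_neon.h so that it builds on any target; the names v2di,
combine_s64 and low_index are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for arm_neon.h's int64x2_t, declared with the
   GCC/Clang vector_size attribute (an assumption of this sketch).  */
typedef int64_t v2di __attribute__ ((vector_size (16)));

/* Same shape as s64q_1: the initializer order is swapped on
   big-endian targets.  */
v2di
combine_s64 (int64_t a0, int64_t a1)
{
  if (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
    return (v2di) { a1, a0 };
  else
    return (v2di) { a0, a1 };
}

/* Index of the vector element that holds a0 on this target.  */
int
low_index (void)
{
  return __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ ? 1 : 0;
}
```

Compiling this with -O2 -S on an aarch64 target should show whether
the redundant ins is still emitted, although the output for this
stand-in type is not guaranteed to match the arm_neon.h version.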
gcc/
	* config/aarch64/predicates.md (aarch64_reg_or_mem_pair_operand):
	New predicate.
	* config/aarch64/aarch64-simd.md (*aarch64_combine_internal<mode>)
	(*aarch64_combine_internal_be<mode>): New patterns.

gcc/testsuite/
	* gcc.target/aarch64/vec-init-9.c: New test.
	* gcc.target/aarch64/vec-init-10.c: Likewise.
	* gcc.target/aarch64/vec-init-11.c: Likewise.