vect: Restore variable-length SLP permutes [PR97513]
Many of the gcc.target/sve/slp-perm*.c tests started failing
after the introduction of separate SLP permute nodes.
This patch adds variable-length support using a similar
technique to vect_transform_slp_perm_load.
As there, the idea is to detect when every permute mask vector
is the same and can be generated using a regular stepped sequence.
We can easily handle those cases for variable-length, but still
need to restrict the general case to constant-length.
Again copying vect_transform_slp_perm_load, the idea is to distinguish
the two cases regardless of whether the length is variable or not,
partly to increase testing coverage and partly because it avoids
generating redundant trees.
Doing this means that we can also use SLP for the two-vector
permute in pr88834.c, which we couldn't before VEC_PERM_EXPR
nodes were introduced. The patch therefore makes pr88834.c
check that we don't regress back to not using SLP and adds
pr88834_ld3.c to check for the original problem in the PR.
gcc/
PR tree-optimization/97513
* tree-vect-slp.c (vect_add_slp_permutation): New function,
split out from...
(vectorizable_slp_permutation): ...here. Detect cases in which
all VEC_PERM_EXPRs are guaranteed to have the same stepped
permute vector and only generate one permute vector for that case.
Extend that case to handle variable-length vectors.
gcc/testsuite/
* gcc.target/aarch64/sve/pr88834.c: Expect the vectorizer to use SLP.
* gcc.target/aarch64/sve/pr88834_ld3.c: New test.