Allow gather loads to be used for grouped accesses
authorRichard Sandiford <richard.sandiford@linaro.org>
Sat, 13 Jan 2018 18:01:49 +0000 (18:01 +0000)
committerRichard Sandiford <rsandifo@gcc.gnu.org>
Sat, 13 Jan 2018 18:01:49 +0000 (18:01 +0000)
commit429ef523f74bb85c20ba60b0f83ab7e73f82d74d
treebc0950b9148fc31de4da0f855b784687fae88a81
parentab2fc782509f934ef0cc22c31d743fcb63063c1b
Allow gather loads to be used for grouped accesses

Following on from the previous patch for strided accesses, this patch
allows gather loads to be used with grouped accesses, if we otherwise
would need to fall back to VMAT_ELEMENTWISE.  However, as the comment
says, this is restricted to single-element groups for now:

 ??? Although the code can handle all group sizes correctly,
 it probably isn't a win to use separate strided accesses based
 on nearby locations.  Or, even if it's a win over scalar code,
 it might not be a win over vectorizing at a lower VF, if that
 allows us to use contiguous accesses.

Single-element groups are an important special case though,
and this means that code is less sensitive to GCC's classification
of single accesses with constant steps as "grouped" and ones with
variable steps as "strided".

2018-01-13  Richard Sandiford  <richard.sandiford@linaro.org>
    Alan Hayward  <alan.hayward@arm.com>
    David Sherwood  <david.sherwood@arm.com>

gcc/
* tree-vectorizer.h (vect_gather_scatter_fn_p): Declare.
* tree-vect-data-refs.c (vect_gather_scatter_fn_p): Make public.
* tree-vect-stmts.c (vect_truncate_gather_scatter_offset): New
function.
(vect_use_strided_gather_scatters_p): Take a masked_p argument.
Use vect_truncate_gather_scatter_offset if we can't treat the
operation as a normal gather load or scatter store.
(get_group_load_store_type): Take the gather_scatter_info
as argument.  Try using a gather load or scatter store for
single-element groups.
(get_load_store_type): Update calls to get_group_load_store_type
and vect_use_strided_gather_scatters_p.

gcc/testsuite/
* gcc.target/aarch64/sve/reduc_strict_3.c: Expect FADDA to be used
for double_reduc1.
* gcc.target/aarch64/sve/strided_load_4.c: New test.
* gcc.target/aarch64/sve/strided_load_5.c: Likewise.
* gcc.target/aarch64/sve/strided_load_6.c: Likewise.
* gcc.target/aarch64/sve/strided_load_7.c: Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256642
gcc/ChangeLog
gcc/testsuite/ChangeLog
gcc/testsuite/gcc.target/aarch64/sve/reduc_strict_3.c
gcc/testsuite/gcc.target/aarch64/sve/strided_load_4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/strided_load_5.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/strided_load_6.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/strided_load_7.c [new file with mode: 0644]
gcc/tree-vect-data-refs.c
gcc/tree-vect-stmts.c
gcc/tree-vectorizer.h