Handle peeling for alignment with masking
authorRichard Sandiford <richard.sandiford@linaro.org>
Sat, 13 Jan 2018 17:59:32 +0000 (17:59 +0000)
committerRichard Sandiford <rsandifo@gcc.gnu.org>
Sat, 13 Jan 2018 17:59:32 +0000 (17:59 +0000)
commit535e7c114a7ad2ad7a6a0def88cf9448fcd5f029
tree1c9a22e58ae70f7dd6784c23c21a355f50625864
parentc2700f7466bac153def05a0e070aa78cd2ffc0ae
Handle peeling for alignment with masking

This patch adds support for aligning vectors by using a partial
first iteration.  E.g. if the start pointer is 3 elements beyond
an aligned address, the first iteration will have a mask in which
the first three elements are false.

On SVE, the optimisation is only useful for vector-length-specific
code.  Vector-length-agnostic code doesn't try to align vectors
since the vector length might not be a power of 2.

2018-01-13  Richard Sandiford  <richard.sandiford@linaro.org>
    Alan Hayward  <alan.hayward@arm.com>
    David Sherwood  <david.sherwood@arm.com>

gcc/
* tree-vectorizer.h (_loop_vec_info::mask_skip_niters): New field.
(LOOP_VINFO_MASK_SKIP_NITERS): New macro.
(vect_use_loop_mask_for_alignment_p): New function.
(vect_prepare_for_masked_peels, vect_gen_while_not): Declare.
* tree-vect-loop-manip.c (vect_set_loop_masks_directly): Add an
niters_skip argument.  Make sure that the first niters_skip elements
of the first iteration are inactive.
(vect_set_loop_condition_masked): Handle LOOP_VINFO_MASK_SKIP_NITERS.
Update call to vect_set_loop_masks_directly.
(get_misalign_in_elems): New function, split out from...
(vect_gen_prolog_loop_niters): ...here.
(vect_update_init_of_dr): Take a code argument that specifies whether
the adjustment should be added or subtracted.
(vect_update_init_of_drs): Likewise.
(vect_prepare_for_masked_peels): New function.
(vect_do_peeling): Skip prologue peeling if we're using a mask
instead.  Update call to vect_update_inits_of_drs.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
mask_skip_niters.
(vect_analyze_loop_2): Allow fully-masked loops with peeling for
alignment.  Do not include the number of peeled iterations in
the minimum threshold in that case.
(vectorizable_induction): Adjust the start value down by
LOOP_VINFO_MASK_SKIP_NITERS iterations.
(vect_transform_loop): Call vect_prepare_for_masked_peels.
Take the number of skipped iterations into account when calculating
the loop bounds.
* tree-vect-stmts.c (vect_gen_while_not): New function.

gcc/testsuite/
* gcc.target/aarch64/sve/nopeel_1.c: New test.
* gcc.target/aarch64/sve/peel_ind_1.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_1_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_2.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_2_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_3.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_3_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_4.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_4_run.c: Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256630
15 files changed:
gcc/ChangeLog
gcc/testsuite/ChangeLog
gcc/testsuite/gcc.target/aarch64/sve/nopeel_1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/peel_ind_1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/peel_ind_1_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/peel_ind_2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/peel_ind_2_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/peel_ind_3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/peel_ind_3_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/peel_ind_4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/peel_ind_4_run.c [new file with mode: 0644]
gcc/tree-vect-loop-manip.c
gcc/tree-vect-loop.c
gcc/tree-vect-stmts.c
gcc/tree-vectorizer.h