Add a loop versioning pass
author Richard Sandiford <richard.sandiford@arm.com>
Mon, 17 Dec 2018 10:05:51 +0000 (10:05 +0000)
committer Richard Sandiford <rsandifo@gcc.gnu.org>
Mon, 17 Dec 2018 10:05:51 +0000 (10:05 +0000)
commit 13e08dc93941675cd6a7cf5470b437c4f640c996
tree 86777b79b7db416b5e4abe23fac22576eeb717ec
parent fb2974dcf53f960231e8c4bc2b294f8900b3beef

This patch adds a pass that versions loops with variable index strides
for the case in which the stride is 1.  E.g.:

    for (int i = 0; i < n; ++i)
      x[i * stride] = ...;

becomes:

    if (stride == 1)
      for (int i = 0; i < n; ++i)
        x[i] = ...;
    else
      for (int i = 0; i < n; ++i)
        x[i * stride] = ...;

This is useful for both vector code and scalar code, and in some cases
can enable further optimisations like loop interchange or pattern
recognition.
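
As a rough illustration only, the transformation above can be written
out by hand in C.  The function names, the float element type and the
"value" argument are assumptions made for this sketch; the pass itself
works on GIMPLE rather than on source code, and its output is not
claimed to look like this.

```c
#include <stddef.h>

/* Original form: every access is scaled by a run-time stride, so the
   compiler cannot assume that consecutive iterations touch adjacent
   elements.  (Hypothetical example function.)  */
void
fill_strided (float *x, size_t n, size_t stride, float value)
{
  for (size_t i = 0; i < n; ++i)
    x[i * stride] = value;
}

/* Hand-versioned form: a run-time check selects a copy of the loop in
   which the stride is known to be 1, leaving the general loop as the
   fallback.  Both branches store to the same elements as
   fill_strided.  (Hypothetical example function.)  */
void
fill_versioned (float *x, size_t n, size_t stride, float value)
{
  if (stride == 1)
    for (size_t i = 0; i < n; ++i)
      x[i] = value;
  else
    for (size_t i = 0; i < n; ++i)
      x[i * stride] = value;
}
```

The stride == 1 copy has contiguous accesses, which is the form that
later passes can handle most easily.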

The pass gives a 7.6% improvement on Cortex-A72 for 554.roms_r at -O3
and a 2.4% improvement for 465.tonto.  I haven't found any SPEC tests
that regress.

Sizewise, there's a 10% increase in .text for both 554.roms_r and
465.tonto.  That's obviously a lot, but in tonto's case it's because
the whole program is written using assumed-shape arrays and pointers,
so a large number of functions really do benefit from versioning.
roms likewise makes heavy use of assumed-shape arrays, and that
improvement in performance IMO justifies the code growth.

The next biggest .text increase is 4.5% for 548.exchange2_r.  I did see
a small (0.4%) speed improvement there, but although both 3-iteration runs
produced stable results, that might still be noise.  There was a slightly
larger (non-noise) improvement for a 256-bit SVE model.

481.wrf and 521.wrf_r .text grew by 2.8% and 2.5% respectively, but
without any noticeable improvement in performance.  No other test grew
by more than 2%.

Although the main SPEC beneficiaries are all Fortran tests, the
benchmarks we use for SVE also include some C and C++ tests that
benefit.

Using -frepack-arrays gives the same benefits in many Fortran cases.
The problem is that using that option inappropriately can force a full
array copy for arguments that the function only reads once, and so it
isn't really something we can turn on by default.  The new pass is
supposed to give most of the benefits of -frepack-arrays without
the risk of unnecessary repacking.

The patch therefore enables the pass by default at -O3.
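
The trade-off against repacking described above can be sketched with a
C analogy.  This is not what gfortran generates for -frepack-arrays;
the function names and the use of malloc are assumptions made purely
to show where the extra copy comes from.

```c
#include <stddef.h>
#include <stdlib.h>

/* Repacking analogy: copy the strided elements into a contiguous
   temporary, then work on the temporary.  When the data is only read
   once, as here, the copy is pure overhead.  (Hypothetical example
   function.)  */
double
sum_repacked (const double *x, size_t n, size_t stride)
{
  double sum = 0.0;
  double *tmp = malloc (n * sizeof *tmp);
  if (!tmp)
    {
      /* No temporary available (or n == 0): read in place.  */
      for (size_t i = 0; i < n; ++i)
        sum += x[i * stride];
      return sum;
    }
  for (size_t i = 0; i < n; ++i)
    tmp[i] = x[i * stride];   /* the full array copy */
  for (size_t i = 0; i < n; ++i)
    sum += tmp[i];            /* contiguous accesses */
  free (tmp);
  return sum;
}

/* Versioning analogy: no copy; a run-time check picks the contiguous
   loop only when the stride really is 1.  (Hypothetical example
   function.)  */
double
sum_versioned (const double *x, size_t n, size_t stride)
{
  double sum = 0.0;
  if (stride == 1)
    for (size_t i = 0; i < n; ++i)
      sum += x[i];
  else
    for (size_t i = 0; i < n; ++i)
      sum += x[i * stride];
  return sum;
}
```

Both functions return the same result; the versioned one pays with
code size instead of an extra pass over the data.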

2018-12-17  Richard Sandiford  <richard.sandiford@arm.com>
    Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
    Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

gcc/
* doc/invoke.texi (-fversion-loops-for-strides): Document.
(loop-versioning-group-size, loop-versioning-max-inner-insns)
(loop-versioning-max-outer-insns): Document new --params.
* Makefile.in (OBJS): Add gimple-loop-versioning.o.
* common.opt (fversion-loops-for-strides): New option.
* opts.c (default_options_table): Enable fversion-loops-for-strides
at -O3.
* params.def (PARAM_LOOP_VERSIONING_GROUP_SIZE)
(PARAM_LOOP_VERSIONING_MAX_INNER_INSNS)
(PARAM_LOOP_VERSIONING_MAX_OUTER_INSNS): New parameters.
* passes.def: Add pass_loop_versioning.
* timevar.def (TV_LOOP_VERSIONING): New time variable.
* tree-ssa-propagate.h
(substitute_and_fold_engine::substitute_and_fold): Add an optional
block parameter.
* tree-ssa-propagate.c
(substitute_and_fold_engine::substitute_and_fold): Likewise.
When passed, only walk blocks dominated by that block.
* tree-vrp.h (range_includes_p): Declare.
(range_includes_zero_p): Turn into an inline wrapper around
range_includes_p.
* tree-vrp.c (range_includes_p): New function, generalizing...
(range_includes_zero_p): ...this.
* tree-pass.h (make_pass_loop_versioning): Declare.
* gimple-loop-versioning.cc: New file.

gcc/testsuite/
* gcc.dg/loop-versioning-1.c: New test.
* gcc.dg/loop-versioning-10.c: Likewise.
* gcc.dg/loop-versioning-11.c: Likewise.
* gcc.dg/loop-versioning-2.c: Likewise.
* gcc.dg/loop-versioning-3.c: Likewise.
* gcc.dg/loop-versioning-4.c: Likewise.
* gcc.dg/loop-versioning-5.c: Likewise.
* gcc.dg/loop-versioning-6.c: Likewise.
* gcc.dg/loop-versioning-7.c: Likewise.
* gcc.dg/loop-versioning-8.c: Likewise.
* gcc.dg/loop-versioning-9.c: Likewise.
* gfortran.dg/loop_versioning_1.f90: Likewise.
* gfortran.dg/loop_versioning_2.f90: Likewise.
* gfortran.dg/loop_versioning_3.f90: Likewise.
* gfortran.dg/loop_versioning_4.f90: Likewise.
* gfortran.dg/loop_versioning_5.f90: Likewise.
* gfortran.dg/loop_versioning_6.f90: Likewise.
* gfortran.dg/loop_versioning_7.f90: Likewise.
* gfortran.dg/loop_versioning_8.f90: Likewise.

From-SVN: r267197
39 files changed:
gcc/ChangeLog
gcc/Makefile.in
gcc/common.opt
gcc/doc/invoke.texi
gcc/gimple-loop-versioning.cc [new file with mode: 0644]
gcc/opts.c
gcc/params.def
gcc/passes.def
gcc/testsuite/ChangeLog
gcc/testsuite/gcc.dg/loop-versioning-1.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-10.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-11.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-12.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-13.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-14.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-2.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-3.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-4.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-5.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-6.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-7.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-8.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/loop-versioning-9.c [new file with mode: 0644]
gcc/testsuite/gcc.dg/vect/slp-43.c
gcc/testsuite/gcc.dg/vect/slp-45.c
gcc/testsuite/gfortran.dg/loop_versioning_1.f90 [new file with mode: 0644]
gcc/testsuite/gfortran.dg/loop_versioning_2.f90 [new file with mode: 0644]
gcc/testsuite/gfortran.dg/loop_versioning_3.f90 [new file with mode: 0644]
gcc/testsuite/gfortran.dg/loop_versioning_4.f90 [new file with mode: 0644]
gcc/testsuite/gfortran.dg/loop_versioning_5.f90 [new file with mode: 0644]
gcc/testsuite/gfortran.dg/loop_versioning_6.f90 [new file with mode: 0644]
gcc/testsuite/gfortran.dg/loop_versioning_7.f90 [new file with mode: 0644]
gcc/testsuite/gfortran.dg/loop_versioning_8.f90 [new file with mode: 0644]
gcc/timevar.def
gcc/tree-pass.h
gcc/tree-ssa-propagate.c
gcc/tree-ssa-propagate.h
gcc/tree-vrp.c
gcc/tree-vrp.h