SLP vectorize multiple BBs at once
authorRichard Biener <rguenther@suse.de>
Tue, 6 Oct 2020 13:47:15 +0000 (15:47 +0200)
committerRichard Biener <rguenther@suse.de>
Thu, 8 Oct 2020 14:07:15 +0000 (16:07 +0200)
commit181702ef8ab76afbf5d2cd4d7bc0cef613397d6e
tree47224d10f36c8f065c9fbe74346cb5f7cd3e838c
parentf997b67550144c6c0562f94c9b9cb932125d0444
SLP vectorize multiple BBs at once

This work from Martin Liska was motivated by gcc.dg/vect/bb-slp-22.c
which shows how poorly we currently BB vectorize code like

  a0 = in[0] + 23;
  a1 = in[1] + 142;
  a2 = in[2] + 2;
  a3 = in[3] + 31;

  if (x > y)
    {
      b[0] = a0;
      b[1] = a1;
      b[2] = a2;
      b[3] = a3;
    }
  else
    {
      out[0] = a0 * (x + 1);
      out[1] = a1 * (y + 1);
      out[2] = a2 * (x + 1);
      out[3] = a3 * (y + 1);
    }

namely by vectorizing the stores but not the common load (and add)
they are feeded with.

Thus with the following patch we change the BB vectorizer from
operating on a single basic-block at a time to consider somewhat
larger regions (but not the whole function yet because of issues
with vector size iteration).

I took the opportunity to remove the fancy region iterations again
now that we operate on BB granularity and in the end need to visit
PHI nodes as well.

2020-10-08  Martin Liska  <mliska@suse.cz>
    Richard Biener  <rguenther@suse.de>

* tree-vectorizer.h (_bb_vec_info::const_iterator): Remove.
(_bb_vec_info::const_reverse_iterator): Likewise.
(_bb_vec_info::region_stmts): Likewise.
(_bb_vec_info::reverse_region_stmts): Likewise.
(_bb_vec_info::_bb_vec_info): Adjust.
(_bb_vec_info::bb): Remove.
(_bb_vec_info::region_begin): Remove.
(_bb_vec_info::region_end): Remove.
(_bb_vec_info::bbs): New vector of BBs.
(vect_slp_function): Declare.
* tree-vect-patterns.c (vect_determine_precisions): Use
regular stmt iteration.
(vect_pattern_recog): Likewise.
* tree-vect-slp.c: Include cfganal.h, tree-eh.h and tree-cfg.h.
(vect_build_slp_tree_1): Properly refuse to vectorize
volatile and throwing stmts.
(vect_build_slp_tree_2): Pass group-size down to
get_vectype_for_scalar_type.
(_bb_vec_info::_bb_vec_info): Use regular stmt iteration,
adjust for changed region specification.
(_bb_vec_info::~_bb_vec_info): Likewise.
(vect_slp_check_for_constructors): Likewise.
(vect_slp_region): Likewise.
(vect_slp_bbs): New worker operating on a vector of BBs.
(vect_slp_bb): Wrap it.
(vect_slp_function): New function splitting the function
into multi-BB regions.
(vect_create_constant_vectors): Handle the case of inserting
after a throwing def.
(vect_schedule_slp_instance): Adjust.
* tree-vectorizer.c (vec_info::remove_stmt): Simplify again.
(vec_info::insert_seq_on_entry): Adjust.
(pass_slp_vectorize::execute): Also init PHIs.  Call
vect_slp_function.

* gcc.dg/vect/bb-slp-22.c: Adjust.
* gfortran.dg/pr68627.f: Likewise.
gcc/testsuite/gcc.dg/vect/bb-slp-22.c
gcc/testsuite/gfortran.dg/pr68627.f
gcc/tree-vect-patterns.c
gcc/tree-vect-slp.c
gcc/tree-vectorizer.c
gcc/tree-vectorizer.h