remove SLP_TREE_TWO_OPERATORS, add SLP permutation node
This removes the SLP_TREE_TWO_OPERATORS hack in favor of having
explicit SLP nodes for both computations and the blend operation.
For this introduce a generic merge + select + permute SLP node
(with implementation limits).
Building upon earlier patches it adds vect_stmt_dominates_stmt_p
and the ability to compute a vector insertion place from
vectorized stmts (which now have UID zero) as needed for
the permute node.
2020-06-17 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (_slp_tree::two_operators): Remove.
(_slp_tree::lane_permutation): New member.
(_slp_tree::code): Likewise.
(SLP_TREE_TWO_OPERATORS): Remove.
(SLP_TREE_LANE_PERMUTATION): New.
(SLP_TREE_CODE): Likewise.
(vect_stmt_dominates_stmt_p): Declare.
* tree-vectorizer.c (vect_stmt_dominates_stmt_p): New function.
* tree-vect-stmts.c (vect_model_simple_cost): Remove
SLP_TREE_TWO_OPERATORS handling.
* tree-vect-slp.c (_slp_tree::_slp_tree): Amend.
(_slp_tree::~_slp_tree): Likewise.
(vect_two_operations_perm_ok_p): Remove.
(vect_build_slp_tree_1): Remove verification of two-operator
permutation here.
(vect_build_slp_tree_2): When we have two different operators
build two computation SLP nodes and a blend.
(vect_print_slp_tree): Print the lane permutation if it exists.
(slp_copy_subtree): Copy it.
(vect_slp_rearrange_stmts): Re-arrange it.
(vect_slp_analyze_node_operations_1): Handle SLP_TREE_CODE
VEC_PERM_EXPR explicitely.
(vect_schedule_slp_instance): Likewise. Remove old
SLP_TREE_TWO_OPERATORS code.
(vectorizable_slp_permutation): New function.