From 78c718490bc2843d4dadcef8a0ae14aed1d15a32 Mon Sep 17 00:00:00 2001 From: Tamar Christina Date: Mon, 11 Apr 2022 15:09:05 +0100 Subject: [PATCH] middle-end: Prevent the use of the cond inversion detection code when both conditions are external. [PR105197] Previously ifcvt used to enforce that a mask A and the inverse of said mask be represented as ~A. So for the masks _25 = _6 != 0; _44 = _4 != 0; ifcvt would produce for an operation requiring the inverse of said mask _26 = ~_25; _43 = ~_44; but now that VN is applied to the entire function body we get a simplification on the mask and produce: _26 = _6 == 0; _43 = _4 == 0; This in itself is not a problem semantically speaking (though it does create more masks that need to be tracked) but when vectorizing the masked conditional we would still detect _26 and _43 to be inverses of _25 and _44 and mark them as requiring their operands be swapped. When vectorizing we swap the operands but don't find the BIT_NOT_EXPR to remove and so we leave the condition as is which produces invalid code: ------>vectorizing statement: _ifc__41 = _43 ? 0 : _ifc__40; created new init_stmt: vect_cst__136 = { 0, ... } add new stmt: _137 = mask__43.26_135 & loop_mask_111 note: add new stmt: vect__ifc__41.27_138 = VEC_COND_EXPR <_137, vect__ifc__40.25_133, vect_cst__136>; This fixes disabling the inversion detection code when the loop isn't masked since both conditional would be external. We'd then not use the new cond_code and would incorrectly still swap the operands. The resulting code is also better than GCC-11 with most operations now predicated on the loop mask rather than a ptrue. gcc/ChangeLog: PR target/105197 * tree-vect-stmts.cc (vectorizable_condition): Prevent cond swap when not masked. gcc/testsuite/ChangeLog: PR target/105197 * gcc.target/aarch64/sve/pr105197-1.c: New test. * gcc.target/aarch64/sve/pr105197-2.c: New test. --- gcc/testsuite/gcc.target/aarch64/sve/pr105197-1.c | 20 ++++++++++++++++++++ gcc/testsuite/gcc.target/aarch64/sve/pr105197-2.c | 9 +++++++++ gcc/tree-vect-stmts.cc | 2 +- 3 files changed, 30 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr105197-1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr105197-2.c diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr105197-1.c b/gcc/testsuite/gcc.target/aarch64/sve/pr105197-1.c new file mode 100644 index 0000000..e33532d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr105197-1.c @@ -0,0 +1,20 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-additional-options "-O -ftree-vectorize" } */ + +unsigned char arr_7[9][3]; +unsigned char (*main_arr_7)[3] = arr_7; +int main() { + char arr_2[9]; + int arr_6[9]; + int x; + unsigned i; + for (i = 0; i < 9; ++i) { + arr_2[i] = 21; + arr_6[i] = 6; + } + for (i = arr_2[8] - 21; i < 2; i++) + x = arr_6[i] ? (main_arr_7[8][i] ? main_arr_7[8][i] : 8) : (char)arr_6[i]; + if (x != 8) + __builtin_abort (); +} + diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr105197-2.c b/gcc/testsuite/gcc.target/aarch64/sve/pr105197-2.c new file mode 100644 index 0000000..5eec5cd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr105197-2.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O -ftree-vectorize" } */ + +void f(int n, int y, char *arr_2, char *arr_6) { + for (int i = y; i < n; i++) + arr_6[i] = arr_6[i] ? (arr_2[i] ? 3 : 8) : 1; +} + +/* { dg-final { scan-assembler-not {\tand\tp[0-9]+.b} } } */ diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 87368e3..c9534ef 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -10512,7 +10512,7 @@ vectorizable_condition (vec_info *vinfo, bool honor_nans = HONOR_NANS (TREE_TYPE (cond.op0)); tree_code orig_code = cond.code; cond.code = invert_tree_comparison (cond.code, honor_nans); - if (loop_vinfo->scalar_cond_masked_set.contains (cond)) + if (!masked && loop_vinfo->scalar_cond_masked_set.contains (cond)) { masks = &LOOP_VINFO_MASKS (loop_vinfo); cond_code = cond.code; -- 2.7.4