Previously ifcvt used to enforce that a mask A and the inverse of said mask be
represented as ~A. So for the masks
_25 = _6 != 0;
_44 = _4 != 0;
ifcvt would produce for an operation requiring the inverse of said mask
_26 = ~_25;
_43 = ~_44;
but now that VN is applied to the entire function body we get a simplification
on the mask and produce:
_26 = _6 == 0;
_43 = _4 == 0;
This in itself is not a problem semantically speaking (though it does create
more masks that need to be tracked) but when vectorizing the masked conditional
we would still detect _26 and _43 to be inverses of _25 and _44 and mark them
as requiring their operands be swapped.
When vectorizing we swap the operands but don't find the BIT_NOT_EXPR to remove
and so we leave the condition as is which produces invalid code:
------>vectorizing statement: _ifc__41 = _43 ? 0 : _ifc__40;
created new init_stmt: vect_cst__136 = { 0, ... }
add new stmt: _137 = mask__43.26_135 & loop_mask_111
note: add new stmt: vect__ifc__41.27_138 = VEC_COND_EXPR <_137, vect__ifc__40.25_133, vect_cst__136>;
This fixes disabling the inversion detection code when the loop isn't masked
since both conditional would be external. We'd then not use the new cond_code
and would incorrectly still swap the operands.
The resulting code is also better than GCC-11 with most operations now
predicated on the loop mask rather than a ptrue.
gcc/ChangeLog:
PR target/105197
* tree-vect-stmts.cc (vectorizable_condition): Prevent cond swap when
not masked.
gcc/testsuite/ChangeLog:
PR target/105197
* gcc.target/aarch64/sve/pr105197-1.c: New test.
* gcc.target/aarch64/sve/pr105197-2.c: New test.
--- /dev/null
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-additional-options "-O -ftree-vectorize" } */
+
+unsigned char arr_7[9][3];
+unsigned char (*main_arr_7)[3] = arr_7;
+int main() {
+ char arr_2[9];
+ int arr_6[9];
+ int x;
+ unsigned i;
+ for (i = 0; i < 9; ++i) {
+ arr_2[i] = 21;
+ arr_6[i] = 6;
+ }
+ for (i = arr_2[8] - 21; i < 2; i++)
+ x = arr_6[i] ? (main_arr_7[8][i] ? main_arr_7[8][i] : 8) : (char)arr_6[i];
+ if (x != 8)
+ __builtin_abort ();
+}
+
--- /dev/null
+/* { dg-do compile } */
+/* { dg-additional-options "-O -ftree-vectorize" } */
+
+void f(int n, int y, char *arr_2, char *arr_6) {
+ for (int i = y; i < n; i++)
+ arr_6[i] = arr_6[i] ? (arr_2[i] ? 3 : 8) : 1;
+}
+
+/* { dg-final { scan-assembler-not {\tand\tp[0-9]+.b} } } */
bool honor_nans = HONOR_NANS (TREE_TYPE (cond.op0));
tree_code orig_code = cond.code;
cond.code = invert_tree_comparison (cond.code, honor_nans);
- if (loop_vinfo->scalar_cond_masked_set.contains (cond))
+ if (!masked && loop_vinfo->scalar_cond_masked_set.contains (cond))
{
masks = &LOOP_VINFO_MASKS (loop_vinfo);
cond_code = cond.code;