Extend is_cond_scalar_reduction to handle bit_and/bit_xor/bit_ior.
author liuhongt <hongtao.liu@intel.com>
Mon, 8 Nov 2021 07:49:17 +0000 (15:49 +0800)
committer liuhongt <hongtao.liu@intel.com>
Wed, 10 Nov 2021 08:28:42 +0000 (16:28 +0800)
commit 249b4eeef1fe30237acb4d8e1832243b39d61e7e
tree c4bb6d262e576119cbff86e719102636d21cf992
parent f2572a398d21fd52435c94065c0651fd79db847c

This enables if-conversion to perform transformations like

-  # sum1_50 = PHI <prephitmp_64(13), 0(4)>
-  # sum2_52 = PHI <sum2_21(13), 0(4)>
+  # sum1_50 = PHI <_87(13), 0(4)>
+  # sum2_52 = PHI <_89(13), 0(4)>
   # ivtmp_62 = PHI <ivtmp_61(13), 64(4)>
   i.2_7 = (long unsigned int) i_49;
   _8 = i.2_7 * 8;
...
   vec1_i_38 = vec1_29 >> _10;
   vec2_i_39 = vec2_31 >> _10;
   _11 = vec1_i_38 & 1;
-  _63 = tmp_37 ^ sum1_50;
-  prephitmp_64 = _11 == 0 ? sum1_50 : _63;
+  _ifc__86 = _11 != 0 ? tmp_37 : 0;
+  _87 = sum1_50 ^ _ifc__86;
   _12 = vec2_i_39 & 1;
:

so that the vectorizer no longer fails due to

  /* If this isn't a nested cycle or if the nested cycle reduction value
     is used ouside of the inner loop we cannot handle uses of the reduction
     value.  */
  if (nlatch_def_loop_uses > 1 || nphi_def_loop_uses > 1)
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "reduction used in loop.\n");
      return NULL;
    }
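
As a source-level illustration of the GIMPLE rewrite above, the following is a minimal sketch rather than the code from the new test file. The function names, the `mask` array and the loop shapes are hypothetical. It shows a conditional XOR and a conditional AND reduction next to the unconditional form, in which the operand is replaced by the operation's neutral value (0 for XOR and IOR, all ones for AND) when the condition is false.

```c
#include <stddef.h>
#include <stdint.h>

/* Conditional XOR reduction as it might appear in source: the update of
   `sum` is guarded by a branch.  */
uint64_t xor_cond(const uint64_t *v, const uint8_t *mask, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        if (mask[i])
            sum ^= v[i];
    return sum;
}

/* Equivalent unconditional form: select the operand or the neutral value 0,
   then always apply the XOR.  This mirrors
   `_ifc__86 = _11 != 0 ? tmp_37 : 0; _87 = sum1_50 ^ _ifc__86;`.  */
uint64_t xor_uncond(const uint64_t *v, const uint8_t *mask, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t ifc = mask[i] ? v[i] : 0;
        sum ^= ifc;
    }
    return sum;
}

/* Conditional AND reduction.  */
uint64_t and_cond(const uint64_t *v, const uint8_t *mask, size_t n)
{
    uint64_t acc = ~(uint64_t)0;
    for (size_t i = 0; i < n; i++)
        if (mask[i])
            acc &= v[i];
    return acc;
}

/* For AND the neutral value is all ones, so `x & ~0 == x`.  */
uint64_t and_uncond(const uint64_t *v, const uint8_t *mask, size_t n)
{
    uint64_t acc = ~(uint64_t)0;
    for (size_t i = 0; i < n; i++) {
        uint64_t ifc = mask[i] ? v[i] : ~(uint64_t)0;
        acc &= ifc;
    }
    return acc;
}
```

In the unconditional form the reduction variable has a single use per iteration, which is the shape the quoted check in the vectorizer expects.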

gcc/ChangeLog:

	PR tree-optimization/103126
	* tree-vect-loop.c (neutral_op_for_reduction): Remove static.
	* tree-vectorizer.h (neutral_op_for_reduction): Declare.
	* tree-if-conv.c: Include tree-vectorizer.h.
	(is_cond_scalar_reduction): Handle
	BIT_XOR_EXPR/BIT_IOR_EXPR/BIT_AND_EXPR.
	(convert_scalar_cond_reduction): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/ifcvt-reduction-logic-op.c: New test.
gcc/testsuite/gcc.target/i386/ifcvt-reduction-logic-op.c [new file with mode: 0644]
gcc/tree-if-conv.c
gcc/tree-vect-loop.c
gcc/tree-vectorizer.h