intel/compiler: Teach signed integer range analysis about imax and imin
author: Ian Romanick <ian.d.romanick@intel.com>
Fri, 4 Feb 2022 02:26:40 +0000 (18:26 -0800)
committer: Marge Bot <emma+marge@anholt.net>
Tue, 8 Nov 2022 00:02:16 +0000 (00:02 +0000)
This is especially helpful for a*isign(a) generated by idiv_by_const
optimization.  On many GPUs, isign(a) is lowered to imax(imin(a, 1),
-1).

There are no changes on fossil-db because ANV uses a different
optimization path for idiv with a constant denominator.  A future MR
will change this.

NOTE: This commit used to help a few hundred shader-db shaders, but
now none are affected.  I suspect this is due to some change in the
idiv_by_const optimization.  This could possibly be dropped.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>

src/intel/compiler/brw_nir_opt_peephole_imul32x16.c

index 3762150..fb2522b 100644
@@ -122,6 +122,38 @@ signed_integer_range_analysis(nir_shader *shader, struct hash_table *range_ht,
          return root ^ integer_neg;
       }
 
+      case nir_op_imax: {
+         int src0_lo, src0_hi;
+         int src1_lo, src1_hi;
+
+         signed_integer_range_analysis(shader, range_ht,
+                                       nir_ssa_scalar_chase_alu_src(scalar, 0),
+                                       &src0_lo, &src0_hi);
+         signed_integer_range_analysis(shader, range_ht,
+                                       nir_ssa_scalar_chase_alu_src(scalar, 1),
+                                       &src1_lo, &src1_hi);
+
+         *lo = MAX2(src0_lo, src1_lo);
+         *hi = MAX2(src0_hi, src1_hi);
+         break;
+      }
+
+      case nir_op_imin: {
+         int src0_lo, src0_hi;
+         int src1_lo, src1_hi;
+
+         signed_integer_range_analysis(shader, range_ht,
+                                       nir_ssa_scalar_chase_alu_src(scalar, 0),
+                                       &src0_lo, &src0_hi);
+         signed_integer_range_analysis(shader, range_ht,
+                                       nir_ssa_scalar_chase_alu_src(scalar, 1),
+                                       &src1_lo, &src1_hi);
+
+         *lo = MIN2(src0_lo, src1_lo);
+         *hi = MIN2(src0_hi, src1_hi);
+         break;
+      }
+
       default:
          break;
       }
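
The interval rules added by the two new cases can be sketched in isolation to show why they help with the lowered isign(a) pattern mentioned in the commit message. This is a minimal standalone sketch, not the Mesa code: `struct range`, `range_full`, `range_imax`, `range_imin`, and `range_isign_lowered` are hypothetical names introduced here for illustration, and the MAX2/MIN2 macros are redefined locally under the assumption that they behave like the usual two-argument max/min.

```c
#include <limits.h>

/* Hypothetical closed interval [lo, hi] of possible signed values.
 * This type is an assumption for illustration; the real pass writes
 * its result through the lo/hi out-parameters instead.
 */
struct range {
   int lo;
   int hi;
};

/* Local stand-ins for the MAX2/MIN2 macros used in the diff. */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Range of a value about which nothing is known. */
static struct range
range_full(void)
{
   return (struct range){ INT_MIN, INT_MAX };
}

/* imax(a, b) is at least as large as both lower bounds and never
 * exceeds the larger of the two upper bounds.
 */
static struct range
range_imax(struct range a, struct range b)
{
   return (struct range){ MAX2(a.lo, b.lo), MAX2(a.hi, b.hi) };
}

/* imin(a, b) is at most as large as both upper bounds and never
 * drops below the smaller of the two lower bounds.
 */
static struct range
range_imin(struct range a, struct range b)
{
   return (struct range){ MIN2(a.lo, b.lo), MIN2(a.hi, b.hi) };
}

/* Range of imax(imin(a, 1), -1), the lowered form of isign(a). */
static struct range
range_isign_lowered(struct range a)
{
   const struct range one = { 1, 1 };
   const struct range neg_one = { -1, -1 };

   return range_imax(range_imin(a, one), neg_one);
}
```

Under these rules, passing `range_full()` to `range_isign_lowered` first narrows the range to [INT_MIN, 1] through imin and then to [-1, 1] through imax, so even a completely unknown `a` yields a tightly bounded isign(a).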