ir3: Do 16b tex dst folding only for floats
author Danylo Piliaiev <dpiliaiev@igalia.com>
Tue, 20 Dec 2022 15:27:21 +0000 (16:27 +0100)
committer Danylo Piliaiev <dpiliaiev@igalia.com>
Fri, 23 Dec 2022 14:48:18 +0000 (15:48 +0100)
Folding a signed or unsigned i32 -> i16 conversion into the sampling
instruction changes its behavior for out-of-range values. The
conversion is expected to mask off the higher bits, whereas the
folded variant clamps the value.

A concrete example is that:

 isaml.base0 (u16)(x)hr0.x

is not equivalent to this:

 isaml.base0 (u32)(x)r0.w
 (sy)cov.u32u16 hr0.x, r0.w
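
The difference between the two sequences can be sketched in plain C.
This is only an illustrative model, not driver code: the helper names
are hypothetical, and the assumption that the folded (u16) destination
saturates at 0xffff is inferred from the word "clamp" above rather than
from hardware documentation.

```c
#include <stdint.h>

/* Models the separate cov.u32u16 step: the upper 16 bits are masked off. */
static inline uint16_t
convert_mask_u32_to_u16(uint32_t v)
{
   return (uint16_t)(v & 0xffffu);
}

/* Models the folded (u16) destination, ASSUMING it saturates to the
 * largest 16-bit value instead of discarding the upper bits. */
static inline uint16_t
convert_clamp_u32_to_u16(uint32_t v)
{
   return v > 0xffffu ? (uint16_t)0xffffu : (uint16_t)v;
}
```

For a texel value that fits in 16 bits, such as 0x1234, both helpers
return the same result. For 0x12345 the masking variant yields 0x2345
while the clamping variant yields 0xffff, which is the kind of mismatch
the integer folding introduced.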

Fixes misrendering in "Injustice 2".

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7869

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20396>

src/freedreno/ir3/ir3_nir.c

index a4fbff6..54f2ce0 100644
@@ -769,7 +769,7 @@ ir3_nir_lower_variant(struct ir3_shader_variant *so, nir_shader *s)
          };
          struct nir_fold_16bit_tex_image_options fold_16bit_options = {
             .rounding_mode = nir_rounding_mode_rtz,
-            .fold_tex_dest_types = nir_type_float | nir_type_uint | nir_type_int,
+            .fold_tex_dest_types = nir_type_float,
             /* blob dumps have no half regs on pixel 2's ldib or stib, so only enable for a6xx+. */
             .fold_image_load_store_data = so->compiler->gen >= 6,
             .fold_srcs_options_count = 1,