NOTE: This commit needs "nir: All set-on-comparison opcodes can take all
float types" or regressions will occur in other Vulkan SPIR-V tests.
No shader-db changes on any Intel platform.
v2: Fix handling 16-bit (and presumably 64-bit) values.
About 280 shaders in Talos and a couple of shaders in Doom 2016 are each
hurt by a few instructions.
Tiger Lake
Instructions in all programs: 159893290 -> 159895026 (+0.0%)
SENDs in all programs: 6936431 -> 6936431 (+0.0%)
Loops in all programs: 38385 -> 38385 (+0.0%)
Cycles in all programs: 7019260087 -> 7019254134 (-0.0%)
Spills in all programs: 101389 -> 101389 (+0.0%)
Fills in all programs: 131532 -> 131532 (+0.0%)
Ice Lake
Instructions in all programs: 143624235 -> 143625691 (+0.0%)
SENDs in all programs: 6980289 -> 6980289 (+0.0%)
Loops in all programs: 38383 -> 38383 (+0.0%)
Cycles in all programs: 8440083238 -> 8440090702 (+0.0%)
Spills in all programs: 102246 -> 102246 (+0.0%)
Fills in all programs: 131908 -> 131908 (+0.0%)
Skylake
Instructions in all programs: 134185495 -> 134186618 (+0.0%)
SENDs in all programs: 6938790 -> 6938790 (+0.0%)
Loops in all programs: 38356 -> 38356 (+0.0%)
Cycles in all programs: 8222366923 -> 8222365826 (-0.0%)
Spills in all programs: 98821 -> 98821 (+0.0%)
Fills in all programs: 125218 -> 125218 (+0.0%)
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: 1feeee9cf47 ("nir/spirv: Add initial support for GLSL 4.50 builtins")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>
break;
}
- case GLSLstd450Step:
- dest->def = nir_sge(nb, src[1], src[0]);
+ case GLSLstd450Step: {
+ /* The SPIR-V Extended Instructions for GLSL spec says:
+ *
+ * Result is 0.0 if x < edge; otherwise result is 1.0.
+ *
+ * Here src[1] is x, and src[0] is edge. The direct implementation is
+ *
+ * bcsel(src[1] < src[0], 0.0, 1.0)
+ *
+ * This is effectively b2f(!(src1 < src0)). Previously this was
+ * implemented using sge(src1, src0), but that produces incorrect
+ * results for NaN. Instead, we use the identity b2f(!x) = 1 - b2f(x).
+ */
+ const bool exact = nb->exact;
+ nb->exact = true;
+
+ nir_ssa_def *cmp = nir_slt(nb, src[1], src[0]);
+
+ nb->exact = exact;
+ dest->def = nir_fsub(nb, nir_imm_floatN_t(nb, 1.0f, cmp->bit_size), cmp);
break;
+ }
case GLSLstd450Length:
dest->def = nir_fast_length(nb, src[0]);
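
The NaN behavior described in the GLSLstd450Step comment can be checked
outside the compiler with a small scalar model. The sketch below is not Mesa
code: model_sge, model_slt, step_old, step_new and check_step_models are
hypothetical names that only mimic the 1.0/0.0 results of the sge and slt
opcodes, assuming IEEE 754 comparison semantics (that is, no -ffast-math or
similar flag).

```c
#include <assert.h>
#include <math.h>

/* Hypothetical scalar stand-ins for the NIR opcodes (not Mesa code).
 * Each returns 1.0 or 0.0; any ordered comparison involving NaN is false. */
static float model_sge(float a, float b) { return (a >= b) ? 1.0f : 0.0f; }
static float model_slt(float a, float b) { return (a < b) ? 1.0f : 0.0f; }

/* Old lowering: step(edge, x) = sge(x, edge). */
static float step_old(float edge, float x) { return model_sge(x, edge); }

/* New lowering: step(edge, x) = 1.0 - slt(x, edge). */
static float step_new(float edge, float x) { return 1.0f - model_slt(x, edge); }

/* Compare both lowerings against "0.0 if x < edge; otherwise 1.0". */
static void check_step_models(void)
{
    const float edge = 0.5f;

    /* For ordered inputs the two lowerings agree. */
    assert(step_old(edge, 0.25f) == 0.0f && step_new(edge, 0.25f) == 0.0f);
    assert(step_old(edge, 0.5f) == 1.0f && step_new(edge, 0.5f) == 1.0f);
    assert(step_old(edge, 1.0f) == 1.0f && step_new(edge, 1.0f) == 1.0f);

    /* x = NaN: "x < edge" is false, so the result should be 1.0.
     * The sge form yields 0.0 instead; the 1.0 - slt form yields 1.0. */
    assert(step_old(edge, NAN) == 0.0f);
    assert(step_new(edge, NAN) == 1.0f);
}
```

Calling check_step_models() from a test harness exercises both forms. The
two only diverge when an operand is NaN, which fits with the statistics
above showing changes of only a few instructions per affected shader.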