From: Alyssa Rosenzweig Date: Sat, 19 Feb 2022 00:20:27 +0000 (-0500) Subject: pan/bi: Specialize IDVS in NIR X-Git-Tag: upstream/22.3.5~12396 X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=e0e63c2a8e6dba9d5806aebe355f16a0431fe64b;p=platform%2Fupstream%2Fmesa.git pan/bi: Specialize IDVS in NIR It's a bit more code, but it's needed to chew through control flow since we don't have a backend version of dead_cf. Results are really good, meaning I really screwed this up the first time around (hence the cc mesa-stable). total instructions in shared programs: 1963576 -> 1939513 (-1.23%) instructions in affected programs: 671053 -> 646990 (-3.59%) helped: 4436 HURT: 729 helped stats (abs) min: 1.0 max: 43.0 x̄: 5.75 x̃: 6 helped stats (rel) min: 0.21% max: 100.00% x̄: 6.47% x̃: 5.17% HURT stats (abs) min: 1.0 max: 22.0 x̄: 2.01 x̃: 1 HURT stats (rel) min: 0.50% max: 50.00% x̄: 10.45% x̃: 9.09% 95% mean confidence interval for instructions value: -4.77 -4.55 95% mean confidence interval for instructions %-change: -4.36% -3.80% Instructions are helped. total tuples in shared programs: 1533335 -> 1523194 (-0.66%) tuples in affected programs: 483167 -> 473026 (-2.10%) helped: 3414 HURT: 1288 helped stats (abs) min: 1.0 max: 20.0 x̄: 3.73 x̃: 2 helped stats (rel) min: 0.27% max: 100.00% x̄: 4.87% x̃: 3.03% HURT stats (abs) min: 1.0 max: 19.0 x̄: 2.02 x̃: 1 HURT stats (rel) min: 0.24% max: 38.10% x̄: 8.10% x̃: 5.88% 95% mean confidence interval for tuples value: -2.28 -2.03 95% mean confidence interval for tuples %-change: -1.62% -1.02% Tuples are helped. total clauses in shared programs: 351432 -> 329158 (-6.34%) clauses in affected programs: 142237 -> 119963 (-15.66%) helped: 5328 HURT: 3 helped stats (abs) min: 1.0 max: 43.0 x̄: 4.18 x̃: 4 helped stats (rel) min: 0.74% max: 100.00% x̄: 19.44% x̃: 17.24% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 9.09% max: 12.50% x̄: 10.90% x̃: 11.11% 95% mean confidence interval for clauses value: -4.25 -4.11 95% mean confidence interval for clauses %-change: -19.72% -19.12% Clauses are helped. total cycles in shared programs: 202830.92 -> 172084.50 (-15.16%) cycles in affected programs: 117078.42 -> 86332 (-26.26%) helped: 5450 HURT: 1 helped stats (abs) min: 0.083333 max: 49.0 x̄: 5.64 x̃: 5 helped stats (rel) min: 1.42% max: 100.00% x̄: 27.94% x̃: 25.64% HURT stats (abs) min: 0.25 max: 0.25 x̄: 0.25 x̃: 0 HURT stats (rel) min: 2.46% max: 2.46% x̄: 2.46% x̃: 2.46% 95% mean confidence interval for cycles value: -5.74 -5.54 95% mean confidence interval for cycles %-change: -28.30% -27.58% Cycles are helped. total arith in shared programs: 57274.29 -> 57145.04 (-0.23%) arith in affected programs: 16418.33 -> 16289.08 (-0.79%) helped: 2442 HURT: 1784 helped stats (abs) min: 0.041665999999999315 max: 0.75 x̄: 0.14 x̃: 0 helped stats (rel) min: 0.23% max: 100.00% x̄: 5.51% x̃: 2.87% HURT stats (abs) min: 0.041665999999999315 max: 0.9166670000000003 x̄: 0.12 x̃: 0 HURT stats (rel) min: 0.00% max: 100.00% x̄: 25.13% x̃: 9.09% 95% mean confidence interval for arith value: -0.04 -0.03 95% mean confidence interval for arith %-change: 6.61% 8.24% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total texture in shared programs: 12857 -> 12857 (0.00%) texture in affected programs: 0 -> 0 helped: 0 HURT: 0 total vary in shared programs: 11157.75 -> 11157.75 (0.00%) vary in affected programs: 0 -> 0 helped: 0 HURT: 0 total ldst in shared programs: 177208 -> 146420 (-17.37%) ldst in affected programs: 117098 -> 86310 (-26.29%) helped: 5447 HURT: 0 helped stats (abs) min: 1.0 max: 49.0 x̄: 5.65 x̃: 5 helped stats (rel) min: 1.92% max: 100.00% x̄: 27.91% x̃: 25.64% 95% mean confidence interval for ldst value: -5.75 -5.55 95% mean confidence interval for ldst %-change: -28.27% -27.56% Ldst are helped. total quadwords in shared programs: 1436507 -> 1398329 (-2.66%) quadwords in affected programs: 515101 -> 476923 (-7.41%) helped: 5150 HURT: 111 helped stats (abs) min: 1.0 max: 39.0 x̄: 7.46 x̃: 6 helped stats (rel) min: 0.17% max: 100.00% x̄: 10.02% x̃: 8.24% HURT stats (abs) min: 1.0 max: 9.0 x̄: 2.01 x̃: 1 HURT stats (rel) min: 0.43% max: 21.62% x̄: 3.57% x̃: 1.94% 95% mean confidence interval for quadwords value: -7.41 -7.11 95% mean confidence interval for quadwords %-change: -9.98% -9.49% Quadwords are helped. total threads in shared programs: 35025 -> 35228 (0.58%) threads in affected programs: 218 -> 421 (93.12%) helped: 208 HURT: 5 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: 0.91 0.99 95% mean confidence interval for threads %-change: 93.40% 99.55% Threads are helped. total loops in shared programs: 128 -> 125 (-2.34%) loops in affected programs: 3 -> 0 helped: 3 HURT: 0 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% total spills in shared programs: 158 -> 149 (-5.70%) spills in affected programs: 15 -> 6 (-60.00%) helped: 9 HURT: 0 total fills in shared programs: 1133 -> 966 (-14.74%) fills in affected programs: 197 -> 30 (-84.77%) helped: 9 HURT: 0 Signed-off-by: Alyssa Rosenzweig Cc: mesa-stable Part-of: --- diff --git a/src/panfrost/bifrost/bifrost_compile.c b/src/panfrost/bifrost/bifrost_compile.c index ea25134..93be2de 100644 --- a/src/panfrost/bifrost/bifrost_compile.c +++ b/src/panfrost/bifrost/bifrost_compile.c @@ -695,6 +695,27 @@ bi_should_remove_store(nir_intrinsic_instr *intr, enum bi_idvs_mode idvs) } } +static bool +bifrost_nir_specialize_idvs(nir_builder *b, nir_instr *instr, void *data) +{ + enum bi_idvs_mode *idvs = data; + + if (instr->type != nir_instr_type_intrinsic) + return false; + + nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr); + + if (intr->intrinsic != nir_intrinsic_store_output) + return false; + + if (bi_should_remove_store(intr, *idvs)) { + nir_instr_remove(instr); + return true; + } + + return false; +} + static void bi_emit_store_vary(bi_builder *b, nir_intrinsic_instr *instr) { @@ -710,12 +731,6 @@ bi_emit_store_vary(bi_builder *b, nir_intrinsic_instr *instr) unsigned imm_index = 0; bool immediate = bi_is_intr_immediate(instr, &imm_index, 16); - /* Skip stores to the wrong kind of variable in a specialized IDVS - * shader. Backend dead code elimination will clean up the mess. - */ - if (bi_should_remove_store(instr, b->shader->idvs)) - return; - /* Only look at the total components needed. In effect, we fill in all * the intermediate "holes" in the write mask, since we can't mask off * stores. Since nir_lower_io_to_temporaries ensures each varying is @@ -3607,8 +3622,6 @@ bi_optimize_nir(nir_shader *nir, unsigned gpu_id, bool is_blend) NIR_PASS_V(nir, nir_shader_instructions_pass, nir_invalidate_divergence, nir_metadata_all, NULL); } - - NIR_PASS(progress, nir, nir_convert_from_ssa, true); } /* The cmdstream lowers 8-bit fragment output as 16-bit, so we need to do the @@ -3871,6 +3884,35 @@ bi_compile_variant_nir(nir_shader *nir, ctx->info = info; ctx->idvs = idvs; + if (idvs != BI_IDVS_NONE) { + /* Specializing shaders for IDVS is destructive, so we need to + * clone. However, the last (second) IDVS shader does not need + * to be preserved so we can skip cloning that one. + */ + if (offset == 0) + ctx->nir = nir = nir_shader_clone(ctx, nir); + + NIR_PASS_V(nir, nir_shader_instructions_pass, + bifrost_nir_specialize_idvs, + nir_metadata_block_index | nir_metadata_dominance, + &idvs); + + /* After specializing, clean up the mess */ + bool progress = true; + + while (progress) { + progress = false; + + NIR_PASS(progress, nir, nir_opt_dce); + NIR_PASS(progress, nir, nir_opt_dead_cf); + } + } + + /* We can only go out-of-SSA after speciailizing IDVS, as opt_dead_cf + * doesn't know how to deal with nir_register. + */ + NIR_PASS_V(nir, nir_convert_from_ssa, true); + /* If nothing is pushed, all UBOs need to be uploaded */ ctx->ubo_mask = ~0;