From aead5316d28e4b39c26225a99c45d9d3b7036305 Mon Sep 17 00:00:00 2001
From: Alyssa Rosenzweig
Date: Sun, 20 Aug 2023 12:19:55 -0400
Subject: [PATCH] nir/opt_sink: Move ALU with constant sources
MIME-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit

In general, sinking ALU instructions can negatively impact register pressure,
since it extends the live ranges of the sources, although it does shrink the
live range of the destination.

However, constants do not usually contribute to register pressure. This is not
a totally true assumption, but it's pretty good in practice, since...

* constants can be rematerialized (backend-dependent)
* constants can often be inlined (ISA-dependent)
* constants can sometimes be promoted to free uniform registers
  (ISA-dependent)
* constants can live in scalar registers although the ALU destination might
  need a vector register (and vector registers are assumed to be much more
  expensive than scalar registers, again ISA-dependent)

So, assume that constants have zero effect on register pressure. Now consider
an ALU instruction where all but one source is a constant. Then there are two
cases:

1. The ALU instruction is moved past when its source was otherwise killed.
   Then there is no effect on register pressure, since the source live range
   is extended exactly as much as the destination live range shrinks.

2. The ALU instruction is moved down but its source is still alive where it's
   moved to. Then register pressure is improved, since the source live range
   is unchanged while the destination live range shrinks.

So, as a heuristic, we always move ALU instructions where n-1 sources are
constant. As an inevitable special case, this also (necessarily) moves unary
ALU ops, which should be beneficial by the same justification. This is not 100%
perfect but it is well-motivated.

Results on AGX are decent:

total instructions in shared programs: 1796101 -> 1795652 (-0.02%)
instructions in affected programs: 326822 -> 326373 (-0.14%)
helped: 800
HURT: 371
Inconclusive result (%-change mean confidence interval includes 0).

total bytes in shared programs: 11805004 -> 11801424 (-0.03%)
bytes in affected programs: 2610630 -> 2607050 (-0.14%)
helped: 912
HURT: 462
Inconclusive result (%-change mean confidence interval includes 0).

total halfregs in shared programs: 525818 -> 515399 (-1.98%)
halfregs in affected programs: 118197 -> 107778 (-8.81%)
helped: 2095
HURT: 804
Halfregs are helped.

total threads in shared programs: 18916608 -> 18917056 (<.01%)
threads in affected programs: 4800 -> 5248 (9.33%)
helped: 7
HURT: 0
Threads are helped.

Signed-off-by: Alyssa Rosenzweig
Reviewed-by: Daniel Schürmann
Part-of:
---
 src/compiler/nir/nir.h          |  1 +
 src/compiler/nir/nir_opt_sink.c | 18 +++++++++++++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 0095634..3fb6cb6 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -6037,6 +6037,7 @@ typedef enum {
    nir_move_copies = (1 << 4),
    nir_move_load_ssbo = (1 << 5),
    nir_move_load_uniform = (1 << 6),
+   nir_move_alu = (1 << 7),
 } nir_move_options;
 
 bool nir_can_move_instr(nir_instr *instr, nir_move_options options);
diff --git a/src/compiler/nir/nir_opt_sink.c b/src/compiler/nir/nir_opt_sink.c
index bae3dd0..4bfb59c 100644
--- a/src/compiler/nir/nir_opt_sink.c
+++ b/src/compiler/nir/nir_opt_sink.c
@@ -58,7 +58,23 @@ nir_can_move_instr(nir_instr *instr, nir_move_options options)
          return options & nir_move_copies;
       if (nir_alu_instr_is_comparison(alu))
          return options & nir_move_comparisons;
-      return false;
+
+      /* Assuming that constants do not contribute to register pressure, it is
+       * beneficial to sink ALU instructions where all but one source is
+       * constant. Detect that case last.
+       */
+      if (!(options & nir_move_alu))
+         return false;
+
+      unsigned inputs = nir_op_infos[alu->op].num_inputs;
+      unsigned constant_inputs = 0;
+
+      for (unsigned i = 0; i < inputs; ++i) {
+         if (nir_src_is_const(alu->src[i].src))
+            constant_inputs++;
+      }
+
+      return (constant_inputs + 1 >= inputs);
    }
    case nir_instr_type_intrinsic: {
       nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
-- 
2.7.4
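
For readers who want to try the "all but one source is constant" rule outside of Mesa, the check can be sketched as a standalone toy model. This is only an illustration under stated assumptions: `toy_src` and `toy_alu_is_sinkable` are hypothetical names that do not exist in NIR, and the model tracks nothing but whether each source is constant.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an ALU source; not a NIR type. It records only
 * whether the source is a constant. */
typedef struct {
   bool is_const;
} toy_src;

/* Returns true when at most one source is non-constant, mirroring the
 * "constant_inputs + 1 >= inputs" test in the patch. Unary ops pass
 * trivially, since they have no other sources that could be non-constant. */
static bool
toy_alu_is_sinkable(const toy_src *srcs, unsigned num_srcs)
{
   unsigned constant_srcs = 0;

   for (unsigned i = 0; i < num_srcs; ++i) {
      if (srcs[i].is_const)
         constant_srcs++;
   }

   return constant_srcs + 1 >= num_srcs;
}
```

Under this model, a binary op with one constant source or a three-source op with two constant sources qualifies, while a binary op with two non-constant sources does not. The comparison is written as an addition on the left-hand side, as in the patch, so that it stays well-defined for unsigned counts.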