Accidentally, we used the loop counter over the predecessor list as the predecessor's block index, instead of looking up the actual predecessor block.
Totals from 8496 (6.30% of 134913) affected shaders: (GFX10.3)
CodeSize: 64070724 -> 64022516 (-0.08%); split: -0.08%, +0.00%
Instrs: 11932750 -> 11920698 (-0.10%); split: -0.10%, +0.00%
Latency: 144040266 -> 144017062 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 29327735 -> 29326421 (-0.00%); split: -0.00%, +0.00%
Fossil DB stats on Rembrandt (RDNA2):
Totals from 4488 (3.33% of 134906) affected shaders:
CodeSize: 42759736 -> 42735392 (-0.06%); split: -0.06%, +0.00%
Instrs: 7960522 -> 7954436 (-0.08%); split: -0.08%, +0.00%
Latency: 96192647 -> 96172571 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 19313576 -> 19312575 (-0.01%); split: -0.01%, +0.00%
Fixes: 75967a4814be7988afc20e59bac4b48bafacab00 ('aco/optimizer_postRA: Speed up reset_block() with predecessors.')
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>
       /* Mark overwritten if it doesn't match with other predecessors. */
       const unsigned until_reg = min_reg + num_regs;
-      for (unsigned pred = 1; pred < num_preds; ++pred) {
-         for (unsigned i = min_reg; i < until_reg; ++i) {
-            Idx& idx = instr_idx_by_regs[block_index][i];
+      for (unsigned i = 1; i < num_preds; ++i) {
+         unsigned pred = preds[i];
+         for (unsigned reg = min_reg; reg < until_reg; ++reg) {
+            Idx& idx = instr_idx_by_regs[block_index][reg];
             if (idx == overwritten_untrackable)
                continue;
-            if (idx != instr_idx_by_regs[pred][i])
+            if (idx != instr_idx_by_regs[pred][reg])
                idx = overwritten_untrackable;
          }
       }
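
The corrected loop above can be illustrated with a small self-contained sketch. This is not the ACO code: `RegTable`, `kNumRegs` and `merge_preds` are hypothetical names, plain `int` stands in for `Idx`, and the step that seeds the block's state from its first predecessor is an assumption made so the example runs on its own. It only shows why the predecessor's block index has to be read from the list (`preds[i]`) rather than taken from the loop counter.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the ACO types; not the real Mesa definitions.
constexpr int overwritten_untrackable = -1;
constexpr unsigned kNumRegs = 4;
using RegTable = std::array<int, kNumRegs>;

// Merge the per-register state of all predecessors into block_index.
// A register keeps its value only if every predecessor agrees on it.
void merge_preds(std::vector<RegTable>& instr_idx_by_regs, unsigned block_index,
                 const std::vector<unsigned>& preds)
{
   assert(block_index < instr_idx_by_regs.size());
   if (preds.empty())
      return;

   // Assumption for this sketch: start from the first predecessor's state.
   instr_idx_by_regs[block_index] = instr_idx_by_regs[preds[0]];

   for (unsigned i = 1; i < preds.size(); ++i) {
      // Look up the real block index. Using `i` itself here was the bug.
      const unsigned pred = preds[i];
      for (unsigned reg = 0; reg < kNumRegs; ++reg) {
         int& idx = instr_idx_by_regs[block_index][reg];
         if (idx == overwritten_untrackable)
            continue;
         if (idx != instr_idx_by_regs[pred][reg])
            idx = overwritten_untrackable;
      }
   }
}
```

With predecessors {0, 2}, the old loop would have compared against block 1 (the counter value) instead of block 2, so it could mark registers as overwritten even where the real predecessors agree. That is consistent with the small instruction-count improvements reported in the stats once the lookup is fixed.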