It caused a ~5% performance regression in some gfxbench benchmarks.
Fixes: b8d10d9e87a ("gallium: split up req_local_mem")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20219>
A6XX_SP_CS_CTRL_REG0_BRANCHSTACK(ir3_shader_branchstack_hw(v)));
uint32_t shared_size =
- MAX2(((int)v->cs.req_local_mem + variable_shared_size- 1) / 1024, 1);
+ MAX2(((int)(v->cs.req_local_mem + variable_shared_size) - 1) / 1024, 1);
OUT_PKT4(ring, REG_A6XX_SP_CS_UNKNOWN_A9B1, 1);
OUT_RING(ring, A6XX_SP_CS_UNKNOWN_A9B1_SHARED_SIZE(shared_size) |
A6XX_SP_CS_UNKNOWN_A9B1_UNK6);
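
The one-line change above only moves the (int) cast, so a small standalone sketch may help show why that matters. It assumes that both req_local_mem and variable_shared_size are unsigned 32-bit values and that int is 32 bits wide; neither is stated in the hunk itself. The MAX2 macro and the two helper functions are illustrative stand-ins, not the driver's code. Under those assumptions, the old expression does the subtraction in unsigned arithmetic, so a total of 0 bytes wraps around instead of becoming -1.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the driver's MAX2 macro (assumed to be a plain max). */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Old expression: the cast binds only to req_local_mem. Adding the
 * unsigned variable_shared_size converts the sum back to unsigned,
 * so "0 - 1" wraps to UINT32_MAX before the division. */
static uint32_t
shared_size_old(uint32_t req_local_mem, uint32_t variable_shared_size)
{
   return MAX2(((int)req_local_mem + variable_shared_size - 1) / 1024, 1);
}

/* New expression: the whole sum is cast to int, so "0 - 1" is -1,
 * -1 / 1024 truncates to 0, and MAX2 clamps the result to 1. */
static uint32_t
shared_size_new(uint32_t req_local_mem, uint32_t variable_shared_size)
{
   return MAX2(((int)(req_local_mem + variable_shared_size) - 1) / 1024, 1);
}

/* Hypothetical self-check comparing the two forms. */
static void
check_shared_size(void)
{
   /* No shared memory at all: the old form wraps, the new form clamps. */
   assert(shared_size_old(0, 0) == 4194303u);
   assert(shared_size_new(0, 0) == 1u);

   /* Non-zero totals give the same result in both forms. */
   assert(shared_size_old(2048, 1024) == shared_size_new(2048, 1024));
}
```

In this sketch the two forms differ only when the total is zero, which would be the case for compute shaders that use no shared memory. That is consistent with the regression described in the commit message, though the message itself does not spell out the mechanism.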