freedreno: fix compute shared_size underflow
author Chia-I Wu <olvaffe@gmail.com>
Thu, 8 Dec 2022 04:04:55 +0000 (20:04 -0800)
committer Marge Bot <emma+marge@anholt.net>
Thu, 8 Dec 2022 22:33:56 +0000 (22:33 +0000)
The underflow caused a ~5% perf regression in some gfxbench benchmarks.

Fixes: b8d10d9e87a ("gallium: split up req_local_mem")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20219>

src/gallium/drivers/freedreno/a6xx/fd6_compute.c

index 9d9df19..dec3595 100644
@@ -73,7 +73,7 @@ cs_program_emit(struct fd_context *ctx, struct fd_ringbuffer *ring,
                A6XX_SP_CS_CTRL_REG0_BRANCHSTACK(ir3_shader_branchstack_hw(v)));
 
    uint32_t shared_size =
-      MAX2(((int)v->cs.req_local_mem + variable_shared_size - 1) / 1024, 1);
+      MAX2(((int)(v->cs.req_local_mem + variable_shared_size) - 1) / 1024, 1);
    OUT_PKT4(ring, REG_A6XX_SP_CS_UNKNOWN_A9B1, 1);
    OUT_RING(ring, A6XX_SP_CS_UNKNOWN_A9B1_SHARED_SIZE(shared_size) |
                      A6XX_SP_CS_UNKNOWN_A9B1_UNK6);
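
A minimal standalone sketch of why the placement of the (int) cast matters. It assumes both operands are 32-bit unsigned values and that int is 32 bits wide. The helper names shared_size_old and shared_size_new and the local MAX2 macro are hypothetical stand-ins for illustration, not the driver's own code. In the old form the cast binds only to the first operand, so adding the unsigned second operand converts the sum back to unsigned, and 0 + 0 - 1 wraps to UINT32_MAX instead of becoming -1.

```c
#include <stdint.h>

/* Local stand-in for a MAX2-style macro (assumption, for illustration only). */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Old form: the cast applies to req_local_mem alone. Adding the uint32_t
 * operand converts the sum back to unsigned, so "- 1" wraps around when
 * both sizes are zero. */
static uint32_t
shared_size_old(uint32_t req_local_mem, uint32_t variable_shared_size)
{
   return MAX2(((int)req_local_mem + variable_shared_size - 1) / 1024, 1);
}

/* Fixed form: the cast applies to the whole sum, so the subtraction and
 * division are done in signed arithmetic and -1 / 1024 truncates to 0. */
static uint32_t
shared_size_new(uint32_t req_local_mem, uint32_t variable_shared_size)
{
   return MAX2(((int)(req_local_mem + variable_shared_size) - 1) / 1024, 1);
}
```

With both sizes zero, the old form yields 4194303 (UINT32_MAX / 1024) while the fixed form yields 1. For nonzero totals the two forms agree, which would be consistent with the bug only showing up in shaders that use no shared memory.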