From bd0ef080d03fa3a9eefb513aec8fee88339c33df Mon Sep 17 00:00:00 2001
From: Iago Toral Quiroga
Date: Wed, 10 Feb 2021 08:34:05 +0100
Subject: [PATCH] v3d/compiler: fix QPU scheduler TMU sequence shuffling
MIME-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit

The QPU scheduler allows moving certain TMU instructions around and,
since we enabled pipelining, we need to protect against the case where
doing this might break a TMU sequence. For example, this test:

dEQP-VK.rasterization.line_continuity.line-strip

Was generating this VIR:

    mov tmud, t187
    mov.pushz null, t176
    mov.ifa tmua, t9
    nop null; wrtmuc (img[0].p0 | 0x0)
    mov tmut, t185
    mov tmud, t180
    mov.ifa tmusf, t183
    nop null; thrsw

where we have a general TMU access (tmud,tmua) followed by an image
access (wrtmuc, tmut, tmud, tmusf), which the QPU scheduler was turning
into:

    nop ; nop ; ldunifrf.rf22 (0xffffff00 / -nan)
    nop ; nop ; wrtmuc (img[0].p0 | 0x0)
    nop ; nop ; ldtmu.r2
    add r0, r2, 1 ; nop ; ldtmu.r3
    nop ; nop ; ldtmu.r4
    nop ; mov tmud, r0
    nop ; mov.ifa tmua, rf15
    nop ; mov tmut, r4 ; thrsw
    nop ; mov tmud, rf22
    nop ; mov.ifa tmusf, r3

where it allowed the wrtmuc to move up and before the general TMU
access, leading to an incorrect TMU sequence.

Fix this by flagging TMUA writes (which are the sequence terminators
for general TMU accesses) as writing new TMU configuration, like we do
for all other TMU sequence terminators for textures and images.

Fixes: 197090a3fc ('broadcom/compiler: implement pipelining for general TMU operations')
Reviewed-by: Alejandro Piñeiro
Part-of:
---
 src/broadcom/compiler/qpu_schedule.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/broadcom/compiler/qpu_schedule.c b/src/broadcom/compiler/qpu_schedule.c
index ee3d800..b75d565 100644
--- a/src/broadcom/compiler/qpu_schedule.c
+++ b/src/broadcom/compiler/qpu_schedule.c
@@ -184,6 +184,8 @@ process_waddr_deps(struct schedule_state *state, struct schedule_node *n,
         case V3D_QPU_WADDR_TMUSCM:
         case V3D_QPU_WADDR_TMUSF:
         case V3D_QPU_WADDR_TMUSLOD:
+        case V3D_QPU_WADDR_TMUA:
+        case V3D_QPU_WADDR_TMUAU:
                 add_write_dep(state, &state->last_tmu_config, n);
                 break;
         default:
-- 
2.7.4