broadcom/compiler: avoid using ldvary sequence to hide latency of branching
author Iago Toral Quiroga <itoral@igalia.com>
Wed, 9 Nov 2022 12:07:45 +0000 (13:07 +0100)
committer Marge Bot <emma+marge@anholt.net>
Wed, 9 Nov 2022 20:51:25 +0000 (20:51 +0000)
commit 1174f376096ed6ceebb0fb2810456f1501a68df7
tree dd7427aa3ab81831b667bb6732029035b85fb813
parent 019ca611fa8bd5e94c15775308d61ca916ea8457

This can cause us to stomp the contents of r5 before we have a chance to read
it, like this:

0x3d103186bb800000 nop                           ; nop                         ; ldvary.r0
0x3d105686bbf40000 nop                           ; mov rf26, r5                ; ldvary.r1
0x020000ef0000d000 bu.allna  232, r:unif (0x0000001c / 0.000000)
0x3d1096c6bbf40000 nop                           ; mov rf27, r5                ; ldvary.r2

Here, the MOV in the last instruction is supposed to read the r5 value
produced by ldvary.r0. However, because we have inserted the bu instruction
in between, that read now happens at the same time that ldvary.r1 updates r5,
stomping the value we were supposed to read.

Fix this by disallowing injection of a branch instruction in between an ldvary
instruction and its write to the r5 register 2 instructions later.
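
The rule above can be illustrated with a minimal, self-contained sketch. This
is not the actual change to qpu_schedule.c: the struct, the macro and the
function name are hypothetical, and the instruction stream is reduced to a
single "has ldvary signal" flag. It assumes, as described above, that an
ldvary writes r5 2 instructions later, so a branch that would become
instruction `pos` is rejected if either of the two preceding instructions
carries an ldvary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction record: it only tracks whether the
 * instruction carries the ldvary signal. */
struct sketch_inst {
        bool ldvary;
};

/* Assumption taken from the commit message: ldvary writes r5 2 instructions
 * after the instruction that carries the signal. */
#define SKETCH_LDVARY_R5_LATENCY 2

/* Returns true if a branch may be inserted so that it becomes instruction
 * `pos`; existing instructions at `pos` and later would shift down by one. */
static bool
sketch_can_insert_branch(const struct sketch_inst *insts, size_t count,
                         size_t pos)
{
        assert(pos <= count);

        /* Look back over the instructions whose r5 write is still pending. */
        for (size_t back = 1;
             back <= SKETCH_LDVARY_R5_LATENCY && back <= pos; back++) {
                if (insts[pos - back].ldvary)
                        return false;
        }

        return true;
}
```

In the sequence shown earlier, the branch was placed directly after two
consecutive ldvary instructions, which this check would reject.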

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7062
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19616>
src/broadcom/compiler/qpu_schedule.c