turnip: Fix the lack of WFM before indirect draws
author Danylo Piliaiev <dpiliaiev@igalia.com>
Fri, 25 Mar 2022 13:26:52 +0000 (15:26 +0200)
committer Marge Bot <emma+marge@anholt.net>
Mon, 28 Mar 2022 16:09:07 +0000 (16:09 +0000)
When flushing for a CP destination stage, we have to add WFM to the
pending flush bits so that indirect draws know when to apply the WFM
workaround.

Fixes CTS tests:
dEQP-VK.draw.renderpass.indirect_draw.*_data_from_compute.indirect_draw_count*

Fixes: abf0ae014a878d063132a4bf2f2515dc7052f069
("tu: Properly handle waiting on an earlier pipeline stage")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15577>
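
For illustration, the rule this change adds can be modelled in isolation.
The sketch below is not the driver code: the sketch_* names, the reduced
stage list, and the flag values are all hypothetical stand-ins. It assumes
only what the patch shows, namely that a wait on an earlier stage always
requests a WFI, and additionally records a pending WFM when the
destination stage is the CP.

```c
#include <stdint.h>

/* Hypothetical, reduced stage list; ordering is earliest to latest.
 * The real driver has more stages and different names. */
enum sketch_stage {
   SKETCH_STAGE_CP = 0,   /* command processor (reads indirect params) */
   SKETCH_STAGE_FE = 1,   /* some later front-end stage */
   SKETCH_STAGE_GPU = 2   /* stand-in for the latest stage */
};

/* Hypothetical flag values, chosen only for this sketch. */
enum {
   SKETCH_FLAG_WAIT_FOR_IDLE = 1u << 0,
   SKETCH_FLAG_WAIT_FOR_ME = 1u << 1
};

struct sketch_cache_state {
   uint32_t flush_bits;         /* in this sketch: required before the next command */
   uint32_t pending_flush_bits; /* in this sketch: recorded for a later consumer */
};

/* Mirrors the shape of the patched condition: a WFI whenever the source
 * stage is later than the destination, plus a pending WFM only when the
 * destination is the CP. */
static void
sketch_flush_for_stage(struct sketch_cache_state *cache,
                       enum sketch_stage src_stage,
                       enum sketch_stage dst_stage)
{
   if (src_stage > dst_stage) {
      cache->flush_bits |= SKETCH_FLAG_WAIT_FOR_IDLE;
      if (dst_stage == SKETCH_STAGE_CP)
         cache->pending_flush_bits |= SKETCH_FLAG_WAIT_FOR_ME;
   }
}
```

On a zeroed state, sketch_flush_for_stage(&cache, SKETCH_STAGE_GPU,
SKETCH_STAGE_CP) sets both bits, while a GPU-to-FE wait sets only the WFI
bit and a CP-to-GPU wait sets nothing.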

src/freedreno/vulkan/tu_cmd_buffer.c

index 0879de3..a1e47a5 100644
@@ -2868,6 +2868,9 @@ tu_flush_for_stage(struct tu_cache_state *cache,
     * for any WFI's to finish. This is already done for draw calls, including
     * before indirect param reads, for the most part, so we just need to WFI.
     *
+    * However, some indirect draw opcodes, depending on firmware, don't have
+    * implicit CP_WAIT_FOR_ME so we have to handle it manually.
+    *
     * Transform feedback counters are read via CP_MEM_TO_REG, which implicitly
     * does CP_WAIT_FOR_ME, but we still need a WFI if the GPU writes it.
     *
@@ -2879,8 +2882,11 @@ tu_flush_for_stage(struct tu_cache_state *cache,
     * future, or if CP_DRAW_PRED_SET grows the capability to do 32-bit
     * comparisons, then this will have to be dealt with.
     */
-   if (src_stage > dst_stage)
+   if (src_stage > dst_stage) {
       cache->flush_bits |= TU_CMD_FLAG_WAIT_FOR_IDLE;
+      if (dst_stage == TU_STAGE_CP)
+         cache->pending_flush_bits |= TU_CMD_FLAG_WAIT_FOR_ME;
+   }
 }
 
 static enum tu_cmd_access_mask