So far we only wrote a maximum of 4 dwords further into the batch, and
just going over the CS prefetch seemed to be enough.
It turns out that writing more dwords can delay the writes, so we start
prefetching commands that haven't landed in memory yet.
Fix this by stalling the CS to ensure the writes have landed before we
go over the prefetch.
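
For illustration only, here is a toy model of the ordering problem. It is
not driver code: every name (toy_cs, toy_store, toy_cs_stall, toy_prefetch,
toy_self_modify) is hypothetical, and the deterministic "pending write"
queue is an assumption standing in for what is a timing race on real
hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_PENDING 16

/* Hypothetical model: batch memory plus writes that are still in flight. */
struct toy_cs {
   uint32_t mem[64];                       /* what the prefetcher can see */
   uint32_t pending_addr[TOY_MAX_PENDING]; /* dword index of each write */
   uint32_t pending_val[TOY_MAX_PENDING];
   unsigned num_pending;
};

/* A self-modifying store: queued, not yet visible in memory. */
static void
toy_store(struct toy_cs *cs, uint32_t addr, uint32_t val)
{
   assert(cs->num_pending < TOY_MAX_PENDING);
   cs->pending_addr[cs->num_pending] = addr;
   cs->pending_val[cs->num_pending] = val;
   cs->num_pending++;
}

/* Stand-in for a CS stall: all queued writes land in memory. */
static void
toy_cs_stall(struct toy_cs *cs)
{
   for (unsigned i = 0; i < cs->num_pending; i++)
      cs->mem[cs->pending_addr[i]] = cs->pending_val[i];
   cs->num_pending = 0;
}

/* The prefetcher only sees writes that have already landed. */
static uint32_t
toy_prefetch(const struct toy_cs *cs, uint32_t addr)
{
   return cs->mem[addr];
}

/* Write one dword ahead in the batch, optionally stall, then fetch it. */
static uint32_t
toy_self_modify(bool stall)
{
   struct toy_cs cs;
   memset(&cs, 0, sizeof(cs));

   toy_store(&cs, 40, 0xdeadbeef);
   if (stall)
      toy_cs_stall(&cs);

   return toy_prefetch(&cs, 40);
}
```

In this model toy_self_modify(false) fetches the stale dword while
toy_self_modify(true) fetches the newly written one, which is the ordering
the added PIPE_CONTROL is meant to guarantee.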
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 796fccce631bf8 ("intel/mi-builder: add framework for self modifying batches")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8525>
static inline void
gen_mi_self_mod_barrier(struct gen_mi_builder *b)
{
+ /* First make sure all the memory writes from previous modifying commands
+ * have landed. We want to do this before going through the CS cache,
+ * otherwise we could be fetching memory that hasn't been written to yet.
+ */
+ gen_mi_builder_emit(b, GENX(PIPE_CONTROL), pc) {
+ pc.CommandStreamerStallEnable = true;
+ }
/* Documentation says Gen11+ should be able to invalidate the command cache
* but experiment show it doesn't work properly, so for now just get over
* the CS prefetch.