drm/i915/gt: Cancel the preemption timeout on responding to it
authorChris Wilson <chris@chris-wilson.co.uk>
Fri, 4 Dec 2020 15:12:32 +0000 (15:12 +0000)
committerChris Wilson <chris@chris-wilson.co.uk>
Fri, 4 Dec 2020 15:41:32 +0000 (15:41 +0000)
We currently presume that the engine reset is successful, cancelling the
expired preemption timer in the process. However, engine resets can
fail, leaving the timeout still pending and we will then respond to the
timeout again next time the tasklet fires. What we want is for the
failed engine reset to be promoted to a full device reset, which is
kicked by the heartbeat once the engine stops processing events.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-2-chris@chris-wilson.co.uk
drivers/gpu/drm/i915/gt/intel_lrc.c

index 1d209a8..7f25894 100644 (file)
@@ -3209,8 +3209,10 @@ static void execlists_submission_tasklet(unsigned long data)
                spin_unlock_irqrestore(&engine->active.lock, flags);
 
                /* Recheck after serialising with direct-submission */
-               if (unlikely(timeout && preempt_timeout(engine)))
+               if (unlikely(timeout && preempt_timeout(engine))) {
+                       cancel_timer(&engine->execlists.preempt);
                        execlists_reset(engine, "preemption time out");
+               }
        }
 }