drm/i915/gt: Allow temporary suspension of inflight requests
authorChris Wilson <chris@chris-wilson.co.uk>
Thu, 16 Jan 2020 18:47:53 +0000 (18:47 +0000)
committerJani Nikula <jani.nikula@intel.com>
Wed, 12 Feb 2020 14:55:58 +0000 (16:55 +0200)
commitc3f1ed90e6ffbf4e22010522351779f920e53d0d
tree68d2c5cadd7d9d6351aceae2f7a40fa201011a4f
parent9e2750fc80b5cc606365201132d49fed00570dd1
drm/i915/gt: Allow temporary suspension of inflight requests

In order to support out-of-line error capture, we need to remove the
active request from HW and put it to one side while a worker compresses
and stores all the details associated with that request. (As that
compression may take an arbitrary user-controlled amount of time, we
want to let the engine continue running on other workloads while the
hanging request is dumped.) Not only do we need to remove the active
request, but we also have to remove its context and all requests that
were dependent on it (both in flight, queued and future submission).

Finally once the capture is complete, we need to be able to resubmit the
request and its dependents and allow them to execute.

v2: Replace stack recursion with a simple list.
v3: Check all the parents, not just the first, when searching for a
stuck ancestor!

References: https://gitlab.freedesktop.org/drm/intel/issues/738
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-2-chris@chris-wilson.co.uk
(cherry picked from commit 32ff621fd74496f0c33644125fb69ff175859b1f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
drivers/gpu/drm/i915/gt/intel_engine_cs.c
drivers/gpu/drm/i915/gt/intel_engine_types.h
drivers/gpu/drm/i915/gt/intel_lrc.c
drivers/gpu/drm/i915/gt/selftest_lrc.c
drivers/gpu/drm/i915/i915_request.h