drm/i915/gt: Bump the reset-failure timeout to 60s
author Chris Wilson <chris@chris-wilson.co.uk>
Fri, 16 Sep 2022 20:48:23 +0000 (13:48 -0700)
committer Matt Roper <matthew.d.roper@intel.com>
Tue, 27 Sep 2022 17:42:17 +0000 (10:42 -0700)
If attempting to perform a GT reset takes longer than 5 seconds (including
resetting the display for gen3/4), then we declare all hope lost and
discard all user work and wedge the device to prevent further
misbehaviour. 5 seconds is too short a time for such drastic action, as
we may be stuck on other timeouts and watchdogs. If we allow a little
bit longer before hitting the big red button, we should at the very
least capture other hung task indicators pointing towards the reason why
the reset was hanging; and allow more marginal cases the extra headroom
to complete the reset without further collateral damage.
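
As a rough userspace illustration of the effect described above (not the
driver's code), the sketch below models a reset that takes 20 seconds under
a 5 second and a 60 second budget. The names SKETCH_HZ, struct fake_gt and
reset_within_budget are hypothetical and exist only for this example. It
also simplifies the mechanism: a real watchdog can fire while the reset is
still stuck, whereas this check only compares durations after the fact.

```c
#include <stdbool.h>

/* Hypothetical tick rate; the kernel's HZ is configuration dependent. */
#define SKETCH_HZ 1000UL

/* Hypothetical stand-in for the GT: only records whether it was wedged. */
struct fake_gt {
	bool wedged;
};

/*
 * Compare how long a reset took against its budget, both in ticks
 * (seconds * SKETCH_HZ, mirroring the "5 * HZ" idiom in the diff).
 * If the reset overran the budget, mark the device wedged and report
 * failure; otherwise report that the reset completed in time.
 */
static bool reset_within_budget(struct fake_gt *gt,
				unsigned long elapsed_ticks,
				unsigned long timeout_ticks)
{
	if (elapsed_ticks > timeout_ticks) {
		gt->wedged = true;	/* the "big red button" */
		return false;
	}
	return true;
}
```

With a 20 second reset, reset_within_budget(&gt, 20 * SKETCH_HZ, 5 * SKETCH_HZ)
wedges the fake device, while the same reset with a 60 * SKETCH_HZ budget
completes without wedging.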

Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6448
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220916204823.1897089-1-ashutosh.dixit@intel.com
drivers/gpu/drm/i915/gt/intel_reset.c

index b366743..3159df6 100644
@@ -1278,7 +1278,7 @@ static void intel_gt_reset_global(struct intel_gt *gt,
        kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
 
        /* Use a watchdog to ensure that our reset completes */
-       intel_wedge_on_timeout(&w, gt, 5 * HZ) {
+       intel_wedge_on_timeout(&w, gt, 60 * HZ) {
                intel_display_prepare_reset(gt->i915);
 
                intel_gt_reset(gt, engine_mask, reason);