drm/i915/hangcheck: Prevent long walks across full-ppgtt
author Mika Kuoppala <mika.kuoppala@linux.intel.com>
Wed, 2 Mar 2016 14:48:29 +0000 (16:48 +0200)
committer Mika Kuoppala <mika.kuoppala@intel.com>
Fri, 4 Mar 2016 13:17:14 +0000 (15:17 +0200)
commit 24a65e624bcdc726c7711ae90efeffaf0a8e9f32
tree 37c200c671b61ce58a5edce28e5ac4d994ba1a92
parent d431440cce2427dcdd665d936865fe802637b4c2

With full-ppgtt, it takes the GPU an eon to traverse the entire 256PiB
address space and wrap around, which is what causes a loop to be detected.
Under the current scheme, if ACTHD walks off the end of a batch buffer and
into an empty address space, we "never" detect the hang. If we always
increment the score while ACTHD is progressing, then we will eventually
time out (after ~46.5s (31 * 1.5s) without advancing onto a new batch). To
counteract this, increase the amount by which we reduce the score for good
batches, so that only a series of almost-bad batches triggers a full reset.
DoS detection suffers slightly, but a series of long-running shader tests
will benefit.
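
The scoring idea described above can be sketched in isolation. This is a
minimal illustration, not the driver code: the struct, the function names and
the SKETCH_* constants are hypothetical, and only the threshold of 31 samples
at 1.5s each is taken from the text. The decay value of 15 and the
per-sample increment of 1 are assumptions chosen for the example.

```c
#include <stdbool.h>

/* Hypothetical constants: only the threshold of 31 comes from the text. */
#define SKETCH_HUNG_THRESHOLD 31 /* 31 samples * 1.5s = 46.5s */
#define SKETCH_BUSY_SCORE      1 /* assumed increment while ACTHD moves */
#define SKETCH_ACTIVE_DECAY   15 /* assumed reward for a good batch */

struct sketch_hangcheck {
	int score;
};

/*
 * Process one hangcheck sample. new_batch is true when the engine has
 * advanced onto a new batch since the previous sample. Returns true once
 * the accumulated score reaches the hung threshold.
 */
static bool sketch_hangcheck_sample(struct sketch_hangcheck *hc, bool new_batch)
{
	if (new_batch) {
		/* Good batch: reduce the score by a large step, floor at 0. */
		if (hc->score > SKETCH_ACTIVE_DECAY)
			hc->score -= SKETCH_ACTIVE_DECAY;
		else
			hc->score = 0;
	} else {
		/* Still inside the same batch: always increment the score. */
		hc->score += SKETCH_BUSY_SCORE;
	}

	return hc->score >= SKETCH_HUNG_THRESHOLD;
}

/* Count the samples needed to declare a hang when no batch ever completes. */
static int sketch_samples_until_hung(void)
{
	struct sketch_hangcheck hc = { 0 };
	int samples = 0;

	do {
		samples++;
	} while (!sketch_hangcheck_sample(&hc, false));

	return samples;
}
```

With these assumed values, an engine that never completes a batch is declared
hung after 31 samples, which at one sample per 1.5s matches the ~46.5s quoted
above. A batch that completes after 30 samples drops the score to 15 and so
avoids a reset, while a run of such almost-bad batches still accumulates
towards the threshold.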

Based on a patch from Chris Wilson.

Testcase: igt/drv_hangman/hangcheck-unterminated
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1456930109-21532-1-git-send-email-mika.kuoppala@intel.com
drivers/gpu/drm/i915/i915_debugfs.c
drivers/gpu/drm/i915/i915_gpu_error.c
drivers/gpu/drm/i915/i915_irq.c
drivers/gpu/drm/i915/intel_ringbuffer.h