drm/i915: Refine VT-d scanout workaround
author Chris Wilson <chris@chris-wilson.co.uk>
Wed, 30 Nov 2022 23:58:04 +0000 (00:58 +0100)
committer Andi Shyti <andi.shyti@linux.intel.com>
Tue, 6 Dec 2022 09:52:46 +0000 (10:52 +0100)
commit eea380ad6b4234d70db544b15bcdcd4e76bc6136
tree f67c817a7e75d1e446cde0acdddea3db27d0b104
parent 6110225144d1136db5b026a22efbd76cee197027

VT-d may cause overfetch of the scanout PTE, both before and after the
vma (depending on the scanout orientation). bspec recommends that we
provide a tile-row in either direction, and suggests using 168 PTE,
warning that the accesses will wrap around the ends of the GGTT.
Currently, we fill the entire GGTT with scratch pages when using VT-d to
ensure there are always valid entries around every vma, including
scanout. However, writing every PTE is slow: on recent devices we
perform 8MiB of uncached writes, incurring an extra 100ms during resume.
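
As a rough check on the figures above, the following sketch computes the
PTE traffic for a full-GGTT fill and the span covered by a 168 PTE guard.
The 4GiB GGTT size, 4KiB page size and 8-byte PTE size are assumptions
for illustration, not values stated in this commit, and the helper names
are hypothetical.

```c
#include <stdint.h>

/* Assumed for illustration only: 4 KiB GGTT pages and 8-byte PTEs. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PTE_BYTES 8ULL

/* Bytes of PTE writes needed to fill a GGTT of the given size. */
static uint64_t sketch_full_fill_bytes(uint64_t ggtt_size)
{
	return ggtt_size / SKETCH_PAGE_SIZE * SKETCH_PTE_BYTES;
}

/* Address range covered by a guard of n_pte entries. */
static uint64_t sketch_guard_span(uint64_t n_pte)
{
	return n_pte * SKETCH_PAGE_SIZE;
}
```

Under these assumptions a 4GiB GGTT needs 8MiB of PTE writes, matching
the figure quoted above, while a 168 PTE guard covers 672KiB of address
space and costs 1344 bytes of PTE writes per side.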

If instead we focus on only putting guard pages around scanout, we can
avoid touching the whole GGTT. To avoid having to introduce extra nodes
around each scanout vma, we adjust the scanout drm_mm_node to be smaller
than the allocated space, and fix up the extra PTE during dma binding.
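
The idea can be sketched with a toy model. This is not the i915 code:
struct toy_node, toy_shrink_for_guard(), toy_bind() and TOY_SCRATCH_PTE
are hypothetical names, the PTE array stands in for the GGTT, and
wrap-around at the ends of the GGTT is not modelled.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_SCRATCH_PTE 0xdeadULL /* hypothetical scratch-page marker */

/* A range of the toy GGTT, in units of pages. */
struct toy_node {
	size_t start;
	size_t size;
};

/*
 * Shrink an allocation so that the node covers only the usable range,
 * leaving 'guard' pages on each side. Caller must ensure that
 * alloc.size >= 2 * guard.
 */
static struct toy_node toy_shrink_for_guard(struct toy_node alloc,
					    size_t guard)
{
	struct toy_node node = { alloc.start + guard,
				 alloc.size - 2 * guard };
	return node;
}

/*
 * Bind: point the guard PTEs on both sides at scratch and the PTEs
 * inside the node at the backing pages.
 */
static void toy_bind(uint64_t *pte, struct toy_node node, size_t guard,
		     const uint64_t *pages)
{
	size_t i;

	for (i = 0; i < guard; i++)
		pte[node.start - guard + i] = TOY_SCRATCH_PTE;
	for (i = 0; i < node.size; i++)
		pte[node.start + i] = pages[i];
	for (i = 0; i < guard; i++)
		pte[node.start + node.size + i] = TOY_SCRATCH_PTE;
}
```

Only the PTEs of the enlarged allocation are written; entries outside it
are left untouched, which is what avoids the full-GGTT fill.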

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221130235805.221010-5-andi.shyti@linux.intel.com
drivers/gpu/drm/i915/gem/i915_gem_domain.c
drivers/gpu/drm/i915/gt/intel_ggtt.c