From 8cf97637ff8891be040bac37b96dd97e5996ca93 Mon Sep 17 00:00:00 2001 From: Daniel Vetter Date: Thu, 2 Sep 2021 16:20:49 +0200 Subject: [PATCH] drm/i915: Keep gem ctx->vm alive until the final put MIME-Version: 1.0 Content-Type: text/plain; charset=utf8 Content-Transfer-Encoding: 8bit The comment added in commit b81dde719439c8f09bb61e742ed95bfc4b33946b Author: Chris Wilson Date: Tue May 21 22:11:29 2019 +0100 drm/i915: Allow userspace to clone contexts on creation and moved in commit 27dbae8f36c1c25008b7885fc07c57054b7dfba3 Author: Chris Wilson Date: Wed Nov 6 09:13:12 2019 +0000 drm/i915/gem: Safely acquire the ctx->vm when copying suggested that i915_address_space were at least intended to be managed through SLAB_TYPESAFE_BY_RCU: * This ppgtt may have be reallocated between * the read and the kref, and reassigned to a third * context. In order to avoid inadvertent sharing * of this ppgtt with that third context (and not * src), we have to confirm that we have the same * ppgtt after passing through the strong memory * barrier implied by a successful * kref_get_unless_zero(). But extensive git history search has not brough any such reuse to light. What has come to light though is that ever since commit 2850748ef8763ab46958e43a4d1c445f29eeb37d Author: Chris Wilson Date: Fri Oct 4 14:39:58 2019 +0100 drm/i915: Pull i915_vma_pin under the vm->mutex (yes this commit is earlier) the final i915_vma_put call has been moved from i915_gem_context_free (now called _release) to context_close, which means it's not actually safe anymore to access the ctx->vm pointer without lock helds, because it might disappear at any moment. Note that superficially things all still work, because the i915_address_space is RCU protected since commit b32fa811156328aea5a3c2ff05cc096490382456 Author: Chris Wilson Date: Thu Jun 20 19:37:05 2019 +0100 drm/i915/gtt: Defer address space cleanup to an RCU worker except the very clever macro above (which is designed to protected against object reuse due to SLAB_TYPESAFE_BY_RCU or similar tricks) results in an endless loop if the refcount of the ctx->vm ever permanently drops to 0. Which it totally now can. Fix that by moving the final i915_vm_put to where it should be. Note that i915_gem_context is rcu protected, but _only_ the final kfree. This means anyone who chases a pointer to a gem ctx solely under the protection can pretty only call kref_get_unless_zero(). This seems to be pretty much the case, aside from a bunch of cases that consult the scheduling information without any further protection. Reviewed-by: Maarten Lankhorst Cc: Jason Ekstrand Cc: Chris Wilson Cc: Tvrtko Ursulin Cc: Joonas Lahtinen Cc: Matthew Brost Cc: Matthew Auld Cc: Maarten Lankhorst Cc: "Thomas Hellström" Cc: Lionel Landwerlin Cc: Dave Airlie Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex") Signed-off-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-3-daniel.vetter@ffwll.ch --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 5a053cf..12e2de1 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -990,6 +990,7 @@ static void i915_gem_context_release_work(struct work_struct *work) { struct i915_gem_context *ctx = container_of(work, typeof(*ctx), release_work); + struct i915_address_space *vm; trace_i915_context_free(ctx); GEM_BUG_ON(!i915_gem_context_is_closed(ctx)); @@ -997,6 +998,10 @@ static void i915_gem_context_release_work(struct work_struct *work) if (ctx->syncobj) drm_syncobj_put(ctx->syncobj); + vm = i915_gem_context_vm(ctx); + if (vm) + i915_vm_put(vm); + mutex_destroy(&ctx->engines_mutex); mutex_destroy(&ctx->lut_mutex); @@ -1220,8 +1225,15 @@ static void context_close(struct i915_gem_context *ctx) set_closed_name(ctx); vm = i915_gem_context_vm(ctx); - if (vm) + if (vm) { + /* i915_vm_close drops the final reference, which is a bit too + * early and could result in surprises with concurrent + * operations racing with thist ctx close. Keep a full reference + * until the end. + */ + i915_vm_get(vm); i915_vm_close(vm); + } ctx->file_priv = ERR_PTR(-EBADF); -- 2.7.4