drm/i915/ppgtt: Defer request freeing on reset
authorBen Widawsky <benjamin.widawsky@intel.com>
Wed, 1 Jan 2014 18:15:13 +0000 (10:15 -0800)
committerDaniel Vetter <daniel.vetter@ffwll.ch>
Wed, 22 Jan 2014 09:34:37 +0000 (10:34 +0100)
We need to defer the free request until the object/vma is capable of
being freed - or else we have a problem  when we try to destroy the
context.

The exact same issue is described and fixed here:
commit e20780439b26ba95aeb29d3e27cd8cc32bc82a4c
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Dec 6 14:11:22 2013 -0800

    drm/i915: Defer request freeing

I had this fix previously, but decided not to keep it for some reason I
can no longer remember.

gem_reset_stats is a really good test at hitting the problem.

For the inquisitive:
[  170.516392] ------------[ cut here ]------------
[  170.517227] WARNING: CPU: 1 PID: 105 at drivers/gpu/drm/drm_mm.c:578 drm_mm_takedown+0x2e/0x30 [drm]()
[  170.518064] Memory manager not clean during takedown.
[  170.518941] CPU: 1 PID: 105 Comm: kworker/1:1 Not tainted 3.13.0-rc4-BEN+ #28
[  170.519787] Hardware name: Hewlett-Packard HP EliteBook 8470p/179B, BIOS 68ICF Ver. F.02 04/27/2012
[  170.520662] Call Trace:
[  170.521517]  [<ffffffff814f0589>] dump_stack+0x4e/0x7a
[  170.522373]  [<ffffffff81049e6d>] warn_slowpath_common+0x7d/0xa0
[  170.523227]  [<ffffffff81049edc>] warn_slowpath_fmt+0x4c/0x50
[  170.524079]  [<ffffffffa06c414e>] drm_mm_takedown+0x2e/0x30 [drm]
[  170.524934]  [<ffffffffa07213f3>] gen6_ppgtt_cleanup+0x23/0x110
[i915]
[  170.525777]  [<ffffffffa07837ed>] ppgtt_release.part.5+0x24/0x29
[i915]
[  170.526603]  [<ffffffffa071aaa5>] i915_gem_context_free+0x195/0x1a0
[i915]
[  170.527423]  [<ffffffffa071189d>] i915_gem_free_request+0x9d/0xb0
[i915]
[  170.528247]  [<ffffffffa0718af9>] i915_gem_reset+0x1f9/0x3f0 [i915]
[  170.529065]  [<ffffffffa0700cce>] i915_reset+0x4e/0x180 [i915]
[  170.529870]  [<ffffffffa070829d>] i915_error_work_func+0xcd/0x120
[i915]
[  170.530666]  [<ffffffff8106c13a>] process_one_work+0x1fa/0x6d0
[  170.531453]  [<ffffffff8106c0d8>] ? process_one_work+0x198/0x6d0
[  170.532230]  [<ffffffff8106c72b>] worker_thread+0x11b/0x3a0
[  170.532996]  [<ffffffff8106c610>] ? process_one_work+0x6d0/0x6d0
[  170.533771]  [<ffffffff810743ef>] kthread+0xff/0x120
[  170.534548]  [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80
[  170.535322]  [<ffffffff814f97ac>] ret_from_fork+0x7c/0xb0
[  170.536089]  [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80
[  170.536847] ---[ end trace 3d4c12892e42d58f ]---

v2: Whitespace fix. (Chris)

Note: This is a bug that only hits the ppgtt topic branch but I've
figured that doing the request cleanup in this order is generally the
right thing to do.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add a code comment to clarify what's actually going on since
the lifetime rules aroung ppgtt cleanup are ... fuzzy a best atm. Also
add a note about why we need this.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drivers/gpu/drm/i915/i915_gem.c

index 51d4c1a..4461336 100644 (file)
@@ -2388,16 +2388,6 @@ static void i915_gem_reset_ring_status(struct drm_i915_private *dev_priv,
 static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
                                        struct intel_ring_buffer *ring)
 {
-       while (!list_empty(&ring->request_list)) {
-               struct drm_i915_gem_request *request;
-
-               request = list_first_entry(&ring->request_list,
-                                          struct drm_i915_gem_request,
-                                          list);
-
-               i915_gem_free_request(request);
-       }
-
        while (!list_empty(&ring->active_list)) {
                struct drm_i915_gem_object *obj;
 
@@ -2407,6 +2397,23 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
 
                i915_gem_object_move_to_inactive(obj);
        }
+
+       /*
+        * We must free the requests after all the corresponding objects have
+        * been moved off active lists. Which is the same order as the normal
+        * retire_requests function does. This is important if object hold
+        * implicit references on things like e.g. ppgtt address spaces through
+        * the request.
+        */
+       while (!list_empty(&ring->request_list)) {
+               struct drm_i915_gem_request *request;
+
+               request = list_first_entry(&ring->request_list,
+                                          struct drm_i915_gem_request,
+                                          list);
+
+               i915_gem_free_request(request);
+       }
 }
 
 void i915_gem_restore_fences(struct drm_device *dev)