drm/amd/display: Fix deadlock with display during hanged ring recovery.
author Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Wed, 13 Feb 2019 18:53:45 +0000 (13:53 -0500)
committer Alex Deucher <alexander.deucher@amd.com>
Wed, 13 Feb 2019 22:51:17 +0000 (17:51 -0500)
When a ring hang happens, amdgpu_dm_commit_planes during a flip holds
the BO reserved and then gets stuck waiting for fences to signal in
reservation_object_wait_timeout_rcu (they won't signal because there
was a hang). Then, when we try to shut down the display block during
reset recovery from drm_atomic_helper_suspend, we also try to reserve
the BO from dm_plane_helper_cleanup_fb, ending in a deadlock.

Fix this by bounding the fence wait to 5 seconds instead of waiting
indefinitely.

Also remove a useless WARN_ON.
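
The locking pattern described above can be sketched in plain user-space C.
This is only an illustration under stated assumptions, not the driver code:
fake_fence, fake_bo, and every fake_* helper are hypothetical stand-ins, and
the timeout is counted in simulated ticks rather than jiffies. The wait
helper mirrors the return convention the patch relies on (0 on timeout, a
positive value when the fence signaled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for a fence and a reservable buffer object. */
struct fake_fence {
	bool signaled;
};

struct fake_bo {
	bool reserved;
	struct fake_fence fence;
};

/*
 * Bounded wait: returns the remaining ticks (> 0) if the fence signaled,
 * or 0 if the timeout expired first. Each loop iteration stands for one
 * simulated tick.
 */
static long fake_fence_wait_timeout(const struct fake_fence *f,
				    long timeout_ticks)
{
	long left = timeout_ticks;

	while (!f->signaled) {
		if (left == 0)
			return 0;
		left--;
	}
	return left > 0 ? left : 1;
}

/* Try to take the reservation; fails if someone else already holds it. */
static bool fake_bo_try_reserve(struct fake_bo *bo)
{
	if (bo->reserved)
		return false;
	bo->reserved = true;
	return true;
}

static void fake_bo_unreserve(struct fake_bo *bo)
{
	assert(bo->reserved);
	bo->reserved = false;
}

/*
 * Flip path: reserve the BO, wait for the fence with a bound, and always
 * drop the reservation afterwards so a recovery path can take it.
 * Returns -1 if the BO could not be reserved, otherwise the wait result.
 */
static long flip_path(struct fake_bo *bo, long timeout_ticks)
{
	long r;

	if (!fake_bo_try_reserve(bo))
		return -1;

	r = fake_fence_wait_timeout(&bo->fence, timeout_ticks);
	if (r == 0)
		fprintf(stderr, "Waiting for fences timed out.\n");

	fake_bo_unreserve(bo);
	return r;
}
```

With an unbounded wait, flip_path would never reach fake_bo_unreserve when
the fence cannot signal, and any later reserve attempt on the same BO would
block behind it. With the bound, the call returns 0 and the reservation is
released.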

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index ad31d7b..c7b66fa 100644
@@ -4698,14 +4698,21 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
                         */
                        abo = gem_to_amdgpu_bo(fb->obj[0]);
                        r = amdgpu_bo_reserve(abo, true);
-                       if (unlikely(r != 0)) {
+                       if (unlikely(r != 0))
                                DRM_ERROR("failed to reserve buffer before flip\n");
-                               WARN_ON(1);
-                       }
 
-                       /* Wait for all fences on this FB */
-                       WARN_ON(reservation_object_wait_timeout_rcu(abo->tbo.resv, true, false,
-                                                                                   MAX_SCHEDULE_TIMEOUT) < 0);
+                       /*
+                        * Wait for all fences on this FB. Do limited wait to avoid
+                        * deadlock during GPU reset when this fence will not signal
+                        * but we hold reservation lock for the BO.
+                        */
+                       r = reservation_object_wait_timeout_rcu(abo->tbo.resv,
+                                                               true, false,
+                                                               msecs_to_jiffies(5000));
+                       if (unlikely(r == 0))
+                               DRM_ERROR("Waiting for fences timed out.");
+
+
 
                        amdgpu_bo_get_tiling_flags(abo, &tiling_flags);