drm/amdgpu: fix the nullptr issue when reenter GPU recovery
author Dennis Li <Dennis.Li@amd.com>
Thu, 20 Aug 2020 02:17:39 +0000 (10:17 +0800)
committer Alex Deucher <alexander.deucher@amd.com>
Mon, 24 Aug 2020 16:23:54 +0000 (12:23 -0400)
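In a single-GPU system, hive is NULL. If the driver re-enters GPU
recovery, amdgpu_device_lock_adev() returns false and the bail-out
path calls mutex_unlock(&hive->hive_lock), which dereferences the
NULL pointer. Set r = 0 and jump to a new skip_recovery label
instead, so that the hive state is only touched under the existing
if (hive) check.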
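
The control flow this fix relies on can be sketched outside the kernel.
The block below is a toy model, not the driver code: toy_hive, toy_dev,
toy_lock_dev(), toy_unlock_dev() and toy_gpu_recover() are hypothetical
stand-ins, and the mutex is modelled by a plain counter. It only shows
the shape of the pattern: an early bail-out jumps to a shared cleanup
label that checks the optional pointer before using it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the XGMI hive; NOT the real amdgpu type. */
struct toy_hive {
    int lock_depth; /* models hive_lock: 1 while "held", 0 otherwise */
    int in_reset;   /* models the in_reset flag */
};

/* Hypothetical stand-in for the device. */
struct toy_dev {
    bool in_recovery; /* true while a recovery is already running */
};

/* Models the lock helper: returns false when recovery is re-entered. */
bool toy_lock_dev(struct toy_dev *dev)
{
    if (dev->in_recovery)
        return false;
    dev->in_recovery = true;
    return true;
}

void toy_unlock_dev(struct toy_dev *dev)
{
    dev->in_recovery = false;
}

/* hive may be NULL, as on a single-GPU system. */
int toy_gpu_recover(struct toy_dev *dev, struct toy_hive *hive)
{
    int r = 0;

    if (hive) {
        hive->lock_depth++; /* take the (modelled) hive lock */
        hive->in_reset = 1;
    }

    if (!toy_lock_dev(dev)) {
        /*
         * Another recovery is in progress. Do not touch hive here:
         * it may be NULL. Let the shared cleanup path decide.
         */
        r = 0;
        goto skip_recovery;
    }

    /* ... the actual recovery work would happen here ... */

    toy_unlock_dev(dev);

skip_recovery:
    if (hive) {
        hive->in_reset = 0;
        hive->lock_depth--; /* release the (modelled) hive lock */
    }
    return r;
}
```

Calling toy_gpu_recover(&dev, NULL) with dev.in_recovery already set
takes the bail-out branch and returns 0 without reading through the
NULL hive pointer. Funnelling every exit through one label keeps the
NULL check in a single place.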

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 82242e2..81b1d9a 100644
@@ -4371,8 +4371,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
                if (!amdgpu_device_lock_adev(tmp_adev)) {
                        DRM_INFO("Bailing on TDR for s_job:%llx, as another already in progress",
                                  job ? job->base.id : -1);
-                       mutex_unlock(&hive->hive_lock);
-                       return 0;
+                       r = 0;
+                       goto skip_recovery;
                }
 
                /*
@@ -4505,6 +4505,7 @@ skip_sched_resume:
                amdgpu_device_unlock_adev(tmp_adev);
        }
 
+skip_recovery:
        if (hive) {
                atomic_set(&hive->in_reset, 0);
                mutex_unlock(&hive->hive_lock);