drm/amdgpu: SRIOV flr_work should use down_write
author Victor Skvortsov <victor.skvortsov@amd.com>
Mon, 13 Dec 2021 21:38:20 +0000 (21:38 +0000)
committer Alex Deucher <alexander.deucher@amd.com>
Tue, 14 Dec 2021 21:09:02 +0000 (16:09 -0500)
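A host-initiated VF FLR may fail if another thread already
holds reset_sem for reading, because down_write_trylock
returns immediately instead of waiting for the reader.
Claim in_gpu_reset with atomic_cmpxchg to reject a concurrent
reset, then switch from down_write_trylock to a blocking
down_write to guarantee the reset goes through.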

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
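
The locking change described in the commit message can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: it assumes POSIX pthread_rwlock_t as a stand-in for the kernel rw_semaphore and C11 atomics as a stand-in for atomic_t. struct fake_adev, flr_work_old, flr_work_new, short_reader and the two *_with_reader helpers are hypothetical names invented for this example. Where in_gpu_reset is cleared is also an assumption, since that part of the function lies outside the hunks in this commit.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for the two amdgpu_device fields. */
struct fake_adev {
    atomic_int in_gpu_reset;    /* models adev->in_gpu_reset */
    pthread_rwlock_t reset_sem; /* models adev->reset_sem */
};

static void fake_adev_init(struct fake_adev *adev)
{
    atomic_init(&adev->in_gpu_reset, 0);
    pthread_rwlock_init(&adev->reset_sem, NULL);
}

/* Pre-patch shape: give up if the write lock is not free right now. */
static bool flr_work_old(struct fake_adev *adev)
{
    if (pthread_rwlock_trywrlock(&adev->reset_sem) != 0)
        return false; /* a reader was active, so the reset is skipped */
    atomic_store(&adev->in_gpu_reset, 1);
    /* ... the mailbox handshake would happen here ... */
    atomic_store(&adev->in_gpu_reset, 0); /* assumed cleanup */
    pthread_rwlock_unlock(&adev->reset_sem);
    return true;
}

/* Post-patch shape: claim the flag first, then block for the lock. */
static bool flr_work_new(struct fake_adev *adev)
{
    int expected = 0;

    if (!atomic_compare_exchange_strong(&adev->in_gpu_reset, &expected, 1))
        return false; /* another reset is already in flight */
    pthread_rwlock_wrlock(&adev->reset_sem); /* waits for readers to drain */
    /* ... the mailbox handshake would happen here ... */
    atomic_store(&adev->in_gpu_reset, 0); /* assumed cleanup */
    pthread_rwlock_unlock(&adev->reset_sem);
    return true;
}

struct reader_arg {
    struct fake_adev *adev;
    atomic_int holding;
};

/* Holds the lock for reading for about 50 ms, then releases it. */
static void *short_reader(void *p)
{
    struct reader_arg *arg = p;
    struct timespec hold = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };

    pthread_rwlock_rdlock(&arg->adev->reset_sem);
    atomic_store(&arg->holding, 1);
    nanosleep(&hold, NULL);
    pthread_rwlock_unlock(&arg->adev->reset_sem);
    return NULL;
}

/* Old path while the caller itself holds a read lock. */
static bool old_path_with_reader(void)
{
    struct fake_adev adev;
    bool done;

    fake_adev_init(&adev);
    pthread_rwlock_rdlock(&adev.reset_sem);
    done = flr_work_old(&adev);
    pthread_rwlock_unlock(&adev.reset_sem);
    pthread_rwlock_destroy(&adev.reset_sem);
    return done;
}

/* New path while another thread briefly holds a read lock. */
static bool new_path_with_reader(void)
{
    struct fake_adev adev;
    struct reader_arg arg;
    pthread_t tid;
    bool done;
    int rc;

    fake_adev_init(&adev);
    arg.adev = &adev;
    atomic_init(&arg.holding, 0);
    rc = pthread_create(&tid, NULL, short_reader, &arg);
    assert(rc == 0);
    (void)rc;
    while (!atomic_load(&arg.holding))
        sched_yield(); /* wait until the reader really holds the lock */
    done = flr_work_new(&adev);
    pthread_join(tid, NULL);
    pthread_rwlock_destroy(&adev.reset_sem);
    return done;
}
```

In this model old_path_with_reader() reports that the reset was skipped, while new_path_with_reader() waits for the reader to finish and then completes the reset. The compare-and-exchange on in_gpu_reset keeps a second reset request from queueing behind the first, which the blocking lock alone would not prevent. Build with -pthread on toolchains that still need it.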
drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c

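diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 23b066b..0077e73 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c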
@@ -252,11 +252,12 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
         * otherwise the mailbox msg will be ruined/reseted by
         * the VF FLR.
         */
-       if (!down_write_trylock(&adev->reset_sem))
+       if (atomic_cmpxchg(&adev->in_gpu_reset, 0, 1) != 0)
                return;
 
+       down_write(&adev->reset_sem);
+
        amdgpu_virt_fini_data_exchange(adev);
-       atomic_set(&adev->in_gpu_reset, 1);
 
        xgpu_ai_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);
 
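diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index a35e6d8..477d0dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c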
@@ -281,11 +281,12 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
         * otherwise the mailbox msg will be ruined/reseted by
         * the VF FLR.
         */
-       if (!down_write_trylock(&adev->reset_sem))
+       if (atomic_cmpxchg(&adev->in_gpu_reset, 0, 1) != 0)
                return;
 
+       down_write(&adev->reset_sem);
+
        amdgpu_virt_fini_data_exchange(adev);
-       atomic_set(&adev->in_gpu_reset, 1);
 
        xgpu_nv_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);