drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
author Gavin Wan <Gavin.Wan@amd.com>
Wed, 26 Oct 2022 17:45:25 +0000 (13:45 -0400)
committer Alex Deucher <alexander.deucher@amd.com>
Wed, 2 Nov 2022 21:16:25 +0000 (17:16 -0400)
A recent change introduced a bug in SRIOV environments: unloading
amdgpu fails on a guest VM. The cause is that a VF FLR is requested
while the amdgpu driver is unloading, but the SRIOV VF FLR sequence
is wrong when the PCI device is being removed.

For SRIOV, the guest driver should not trigger a reset of the whole
XGMI hive. The host driver controls how the device is reset.

Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

index 3c9fecdd6b2f322fc7f1dbbb28aecf91739c84f6..bf2d50c8c92ad5e1f64ce125f6793a1b3131e88a 100644
@@ -2201,7 +2201,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
                pm_runtime_forbid(dev->dev);
        }
 
-       if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
+       if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2) &&
+           !amdgpu_sriov_vf(adev)) {
                bool need_to_reset_gpu = false;
 
                if (adev->gmc.xgmi.num_physical_nodes > 1) {