drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume
author Sandeep Raghuraman <sandy.8925@gmail.com>
Thu, 6 Aug 2020 17:22:20 +0000 (22:52 +0530)
committer Alex Deucher <alexander.deucher@amd.com>
Fri, 7 Aug 2020 21:31:37 +0000 (17:31 -0400)
Reproducing bug report here:
After hibernating and resuming, DPM is not enabled. This remains the case
even if you test hibernate using the steps here:
https://www.kernel.org/doc/html/latest/power/basic-pm-debugging.html

I debugged the problem and found that in hardwaremanager.c, in the function
phm_enable_dynamic_state_management(), the check
'if (!hwmgr->pp_one_vf && smum_is_dpm_running(hwmgr) && !amdgpu_passthrough(adev) && adev->in_suspend)'
evaluates to true for the hibernate case, and false for the suspend case.

This means that for the hibernate case, the AMDGPU driver doesn't enable DPM
(even though it should) and simply returns from that function.
In the suspend case, it goes ahead and enables DPM, even though it doesn't need to.

I debugged further, and found out that in the case of suspend, for the
CIK/Hawaii GPUs, smum_is_dpm_running(hwmgr) returns false, while in the case of
hibernate, smum_is_dpm_running(hwmgr) returns true.

For CIK, the ci_is_dpm_running() function calls the ci_is_smc_ram_running() function,
which is ultimately used to determine if DPM is currently enabled or not,
and this seems to provide the wrong answer.

I've changed the ci_is_dpm_running() function to instead use the same method that
some other AMD GPU chips do (e.g. Fiji), which seems to check whether the voltage
controller is on.
I've tested on my R9 390 and it seems to work correctly for both suspend and
hibernate use cases, and has been stable so far.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208839
Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c

index 02159ca29fa29202ce38c83d9b6bb3c9ad3f3f21..c18169aa59ce5bd117a1d82ac2ac00dd5af8ca31 100644
@@ -2725,7 +2725,10 @@ static int ci_initialize_mc_reg_table(struct pp_hwmgr *hwmgr)
 
 static bool ci_is_dpm_running(struct pp_hwmgr *hwmgr)
 {
-       return ci_is_smc_ram_running(hwmgr);
+       return (1 == PHM_READ_INDIRECT_FIELD(hwmgr->device,
+                                            CGS_IND_REG__SMC, FEATURE_STATUS,
+                                            VOLTAGE_CONTROLLER_ON))
+               ? true : false;
 }
 
 static int ci_smu_init(struct pp_hwmgr *hwmgr)