drm/amdgpu: Move reset domain init before calling RREG32
authorPhilip Yang <Philip.Yang@amd.com>
Tue, 15 Mar 2022 18:38:51 +0000 (14:38 -0400)
committerAlex Deucher <alexander.deucher@amd.com>
Tue, 15 Mar 2022 18:40:49 +0000 (14:40 -0400)
commit436afdfa35dc8aaf43959593f6c433d0ad29abc3
tree5862ca181935ca6e78cedab4669936222361fa26
parent72a98763b473890e6605604bfcaf71fc212b4720
drm/amdgpu: Move reset domain init before calling RREG32

amdgpu_detect_virtualization reads register, amdgpu_device_rreg access
adev->reset_domain->sem if kernel defined CONFIG_LOCKDEP, below is the
random boot hang backtrace on Vega10. It may get random NULL pointer
access backtrace if amdgpu_sriov_runtime is true too.

Move amdgpu_reset_create_reset_domain before calling to RREG32.

 BUG: kernel NULL pointer dereference, address:
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP NOPTI
 Workqueue: events work_for_cpu_fn
 RIP: 0010:down_read_trylock+0x13/0xf0
 Call Trace:
  <TASK>
  amdgpu_device_skip_hw_access+0x38/0x80 [amdgpu]
  amdgpu_device_rreg+0x1b/0x170 [amdgpu]
  amdgpu_detect_virtualization+0x73/0x100 [amdgpu]
  amdgpu_device_init.cold.60+0xbe/0x16b1 [amdgpu]
  ? pci_bus_read_config_word+0x43/0x70
  amdgpu_driver_load_kms+0x15/0x120 [amdgpu]
  amdgpu_pci_probe+0x1a1/0x3a0 [amdgpu]

Fixes: d0fb18b535679a ("drm/amdgpu: Move reset sem into reset_domain")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c