drm/amdgpu: Set vmbo destroy after pt bo is created
author Philip Yang <Philip.Yang@amd.com>
Mon, 3 Oct 2022 17:03:26 +0000 (13:03 -0400)
committer Alex Deucher <alexander.deucher@amd.com>
Thu, 6 Oct 2022 16:08:09 +0000 (12:08 -0400)
Under VRAM usage pressure, mapping to the GPU may fail to create the pt bo,
in which case vmbo->shadow_list is never initialized. ttm_bo_release then
calls amdgpu_bo_vm_destroy, which accesses vmbo->shadow_list and generates
the dmesg output and NULL pointer access backtrace below:

Set the vmbo destroy callback to amdgpu_bo_vm_destroy only after the pt bo
is created successfully; otherwise use the default callback amdgpu_bo_destroy.

amdgpu: amdgpu_vm_bo_update failed
amdgpu: update_gpuvm_pte() failed
amdgpu: Failed to map bo to gpuvm
amdgpu 0000:43:00.0: amdgpu: Failed to map peer:0000:43:00.0 mem_domain:2
BUG: kernel NULL pointer dereference, address:
 RIP: 0010:amdgpu_bo_vm_destroy+0x4d/0x80 [amdgpu]
 Call Trace:
  <TASK>
  ttm_bo_release+0x207/0x320 [amdttm]
  amdttm_bo_init_reserved+0x1d6/0x210 [amdttm]
  amdgpu_bo_create+0x1ba/0x520 [amdgpu]
  amdgpu_bo_create_vm+0x3a/0x80 [amdgpu]
  amdgpu_vm_pt_create+0xde/0x270 [amdgpu]
  amdgpu_vm_ptes_update+0x63b/0x710 [amdgpu]
  amdgpu_vm_update_range+0x2e7/0x6e0 [amdgpu]
  amdgpu_vm_bo_update+0x2bd/0x600 [amdgpu]
  update_gpuvm_pte+0x160/0x420 [amdgpu]
  amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x313/0x1130 [amdgpu]
  kfd_ioctl_map_memory_to_gpu+0x115/0x390 [amdgpu]
  kfd_ioctl+0x24a/0x5b0 [amdgpu]

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index e6a9b9f..2e8f6cd 100644
@@ -688,13 +688,16 @@ int amdgpu_bo_create_vm(struct amdgpu_device *adev,
         * num of amdgpu_vm_pt entries.
         */
        BUG_ON(bp->bo_ptr_size < sizeof(struct amdgpu_bo_vm));
-       bp->destroy = &amdgpu_bo_vm_destroy;
        r = amdgpu_bo_create(adev, bp, &bo_ptr);
        if (r)
                return r;
 
        *vmbo_ptr = to_amdgpu_bo_vm(bo_ptr);
        INIT_LIST_HEAD(&(*vmbo_ptr)->shadow_list);
+       /* Set destroy callback to amdgpu_bo_vm_destroy after vmbo->shadow_list
+        * is initialized.
+        */
+       bo_ptr->tbo.destroy = &amdgpu_bo_vm_destroy;
        return r;
 }
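
The ordering this patch enforces can be illustrated outside the kernel. The
following is a minimal user-space sketch, not the amdgpu or TTM code: every
name in it (base_obj, vm_obj, base_create, vm_create, the simulate_failure
flag and the call counters) is hypothetical and exists only for illustration.
It assumes the base create routine releases the object through its destroy
callback on failure, as the backtrace above suggests, and shows that the
specialized callback is installed only after the list head it touches has
been initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real amdgpu/TTM types. */
struct list_node {
	struct list_node *next, *prev;
};

struct base_obj {
	void (*destroy)(struct base_obj *obj);
};

struct vm_obj {
	struct base_obj base;		/* first member, so the casts below are valid */
	struct list_node shadow_list;
};

/* Counters used only to observe which callback ran. */
static int base_destroy_calls;
static int vm_destroy_calls;

/* Default callback: touches no vm_obj-specific state. */
static void base_destroy(struct base_obj *obj)
{
	base_destroy_calls++;
	free(obj);
}

/* Specialized callback: unlinks shadow_list, so the list must be initialized. */
static void vm_destroy(struct base_obj *obj)
{
	struct vm_obj *vm = (struct vm_obj *)obj;

	/* On a zeroed, never-initialized list this would dereference NULL. */
	vm->shadow_list.prev->next = vm->shadow_list.next;
	vm->shadow_list.next->prev = vm->shadow_list.prev;
	vm_destroy_calls++;
	free(obj);
}

/* Base create: on failure it releases the object via its destroy callback. */
static int base_create(size_t size, int simulate_failure, struct base_obj **out)
{
	struct base_obj *obj;

	assert(size >= sizeof(struct base_obj));
	obj = calloc(1, size);
	if (!obj)
		return -1;
	obj->destroy = base_destroy;	/* safe default during creation */
	if (simulate_failure) {
		obj->destroy(obj);	/* error path: only the default callback can run */
		return -1;
	}
	*out = obj;
	return 0;
}

int vm_create(int simulate_failure, struct vm_obj **out)
{
	struct base_obj *base;
	struct vm_obj *vm;
	int r;

	r = base_create(sizeof(struct vm_obj), simulate_failure, &base);
	if (r)
		return r;

	vm = (struct vm_obj *)base;
	/* Initialize the list head first ... */
	vm->shadow_list.next = &vm->shadow_list;
	vm->shadow_list.prev = &vm->shadow_list;
	/* ... and only then switch to the callback that depends on it. */
	base->destroy = vm_destroy;
	*out = vm;
	return 0;
}
```

In this sketch a failed vm_create() runs only base_destroy(), so the
uninitialized list is never touched, while a successful one is later torn
down through vm_destroy().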