drm/amdkfd: add RAS poison consumption handling for UTCL2 (v2)
author Tao Zhou <tao.zhou1@amd.com>
Wed, 16 Mar 2022 06:38:12 +0000 (14:38 +0800)
committer Alex Deucher <alexander.deucher@amd.com>
Fri, 25 Mar 2022 16:40:26 +0000 (12:40 -0400)
Do RAS page retirement and use GPU reset as a fallback in the UTCL2 fault
handler.

v2: replace vm fault event with poison consumed event in UTCL2
poison consumption.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c

index 7db2421..56902b5 100644
@@ -308,6 +308,12 @@ static void event_interrupt_wq_v9(struct kfd_dev *dev,
                struct kfd_vm_fault_info info = {0};
                uint16_t ring_id = SOC15_RING_ID_FROM_IH_ENTRY(ih_ring_entry);
 
+               if (client_id == SOC15_IH_CLIENTID_UTCL2 &&
+                   amdgpu_amdkfd_ras_query_utcl2_poison_status(dev->adev)) {
+                       event_interrupt_poison_consumption(dev, pasid, client_id);
+                       return;
+               }
+
                info.vmid = vmid;
                info.mc_id = client_id;
                info.page_addr = ih_ring_entry[4] |
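
The control flow added by this hunk can be illustrated outside the kernel with a small, self-contained sketch. Everything in it is a hypothetical stand-in: `struct fake_dev`, `query_utcl2_poison_status()`, `handle_fault()`, and the enum values are invented for illustration and are not the driver's types, functions, or client IDs. It only models the early-return pattern visible in the diff: a fault from UTCL2 with poison reported is routed to the poison-consumption path and skips the VM fault path; anything else falls through.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical client IDs; placeholder values, not the real SOC15 IDs. */
enum client_id { CLIENT_VMC = 1, CLIENT_UTCL2 = 2 };

/* Which path the sketch handler took. */
enum fault_action { ACTION_VM_FAULT, ACTION_POISON_CONSUMED };

/* Hypothetical device state standing in for struct kfd_dev. */
struct fake_dev {
	bool utcl2_poison_pending; /* what the (assumed) RAS status query reports */
	int poison_events;         /* poison-consumed notifications sent */
	int vm_fault_events;       /* ordinary VM fault notifications sent */
};

/* Stand-in for the RAS status query used in the patch. */
bool query_utcl2_poison_status(const struct fake_dev *dev)
{
	return dev->utcl2_poison_pending;
}

/*
 * Mirrors the shape of the patched branch: check the client first, then the
 * poison status. Because && short-circuits, the status query only runs for
 * UTCL2 faults.
 */
enum fault_action handle_fault(struct fake_dev *dev, enum client_id client,
			       uint32_t pasid)
{
	(void)pasid; /* the real handler passes the PASID on; unused here */

	if (client == CLIENT_UTCL2 && query_utcl2_poison_status(dev)) {
		dev->poison_events++;
		return ACTION_POISON_CONSUMED; /* early return: no VM fault event */
	}

	dev->vm_fault_events++;
	return ACTION_VM_FAULT;
}
```

In this model, a UTCL2 fault with `utcl2_poison_pending` set increments only `poison_events`, which corresponds to the v2 note about signalling a poison consumed event instead of a VM fault event. The page retirement and GPU reset mentioned in the commit message happen inside the poison-consumption path, which this hunk does not show, so the sketch does not attempt to model them.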