amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
author Alex Sierra <alex.sierra@amd.com>
Mon, 18 Nov 2019 21:33:07 +0000 (15:33 -0600)
committer Alex Deucher <alexander.deucher@amd.com>
Fri, 22 Nov 2019 19:20:23 +0000 (14:20 -0500)
Only for the debugger use case.

[why]
Avoid endless translation retries after an invalid address access has
been issued to the GPU. Instead, the wave is forced into the trap handler
by generating a no-retry-fault.
In the debugger case, an s_trap instruction is inserted so that the wave
enters the trap handler and saves its context.

[how]
Intentionally use an invalid flag combination (F and P set at the same
time) to trigger a no-retry-fault after a retry-fault happens. This is
only applied to compute contexts.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 63f6e46..f20b572 100644
@@ -3197,11 +3197,20 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
        flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
                AMDGPU_PTE_SYSTEM;
 
-       if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
+       if (vm->is_compute_context) {
+               /* Intentionally setting invalid PTE flag
+                * combination to force a no-retry-fault
+                */
+               flags = AMDGPU_PTE_EXECUTABLE | AMDGPU_PDE_PTE |
+                       AMDGPU_PTE_TF;
+               value = 0;
+
+       } else if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
                /* Redirect the access to the dummy page */
                value = adev->dummy_page_addr;
                flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
                        AMDGPU_PTE_WRITEABLE;
+
        } else {
                /* Let the hw retry silently on the PTE */
                value = 0;