drm/amdgpu: introduce a kind of halt state for amdgpu device
author Lang Yu <lang.yu@amd.com>
Thu, 9 Dec 2021 07:10:32 +0000 (15:10 +0800)
committer Alex Deucher <alexander.deucher@amd.com>
Mon, 13 Dec 2021 21:33:16 +0000 (16:33 -0500)
commit 34f3a4a98bd388ad6298c42dc9b00c72d3398330
tree f690ee30a63e7e3a94e4c8bce3dcfdd8c79bb3df
parent cace4bff750ff4f55b16c3aa90aa9376d7488929

It is useful to maintain error context when debugging
SW/FW issues. Introduce amdgpu_device_halt() for this
purpose. It will bring hardware to a kind of halt state,
so that no one can touch it any more.

Compared to a simple hang, the system stays stable enough,
at least for SSH access. It should then be trivial to
inspect the hardware state and see what's going on.
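
The idea can be illustrated with a small userspace model. This is only a
sketch under stated assumptions, not the driver code: struct model_dev,
model_rreg(), model_wreg() and model_halt() are hypothetical names, and
only the no_hw_access flag corresponds to a field named in this commit
message. The ordering of steps inside the real amdgpu_device_halt() may
differ. The model shows that once the flag is set, accessors stop
touching the (pretend) hardware, so the register contents are preserved
for later inspection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_REGS 4

/* Hypothetical stand-in for the device; only no_hw_access mirrors a
 * field named in the commit message. */
struct model_dev {
	bool no_hw_access;              /* set once the device is halted */
	bool irq_enabled;               /* pretend interrupt state */
	uint32_t regs[MODEL_NUM_REGS];  /* pretend MMIO register file */
	unsigned int hw_touches;        /* counts accesses that reached "hardware" */
};

/* Read a register; returns 0 without touching hardware once halted. */
static uint32_t model_rreg(struct model_dev *dev, size_t idx)
{
	assert(dev != NULL && idx < MODEL_NUM_REGS);
	if (dev->no_hw_access)
		return 0;
	dev->hw_touches++;
	return dev->regs[idx];
}

/* Write a register; silently dropped once halted. */
static void model_wreg(struct model_dev *dev, size_t idx, uint32_t val)
{
	assert(dev != NULL && idx < MODEL_NUM_REGS);
	if (dev->no_hw_access)
		return;
	dev->hw_touches++;
	dev->regs[idx] = val;
}

/* Halt the model device. The flag is set before the remaining teardown
 * so that nothing running afterwards can reach the hardware. */
static void model_halt(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->no_hw_access = true;
	dev->irq_enabled = false;
}
```

In this model, a write issued after model_halt() leaves regs[] untouched,
which is the "error context" a debugger would want to find intact.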

v2:
 - Set adev->no_hw_access earlier to avoid potential crashes. (Christian)

Suggested-by: Christian Koenig <christian.koenig@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu.h
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c