drm/amdgpu: Add kernel parameter support for ignoring bad page threshold
author Kent Russell <kent.russell@amd.com>
Tue, 19 Oct 2021 14:05:07 +0000 (10:05 -0400)
committer Alex Deucher <alexander.deucher@amd.com>
Thu, 28 Oct 2021 18:26:12 +0000 (14:26 -0400)
commit 68daadf3d673568bb7122b1683fd8b0e27c55d9b
tree 2f561e2389f17379a788c4035df738417098e207
parent 8483fdfea778aedded76c74659692dee3756b12b

When a GPU hits the bad_page_threshold, the amdgpu driver refuses to
initialize it. As a result, the bad page table cannot be cleared, and
information such as the serial number and BDF cannot be gathered.

If the bad_page_threshold kernel parameter is set to -2, continue to
initialize the GPU anyway, and print a warning to dmesg stating that
this was done.
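
The decision described above can be sketched in plain user-space C. This
is only an illustration, not the driver code: should_init_gpu(),
BAD_PAGE_THRESHOLD_IGNORE and the bad_pages/limit inputs are hypothetical
names invented for the example, and the warning text is a placeholder.
Only the value -2 comes from this commit message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical name for this sketch; -2 is the value named in the commit message. */
#define BAD_PAGE_THRESHOLD_IGNORE (-2)

/*
 * Decide whether GPU initialization should proceed.
 *
 * bad_pages: number of bad pages recorded so far (hypothetical input)
 * limit:     effective bad page limit for this GPU (hypothetical input)
 * param:     value of the bad_page_threshold kernel parameter
 */
static bool should_init_gpu(int bad_pages, int limit, int param)
{
	/* Below the limit: nothing special, initialize as usual. */
	if (bad_pages < limit)
		return true;

	/* Limit reached, but the user asked to ignore it: warn and go on. */
	if (param == BAD_PAGE_THRESHOLD_IGNORE) {
		fprintf(stderr,
			"warning: bad page limit reached, initializing anyway "
			"(bad_page_threshold = -2)\n");
		return true;
	}

	/* Limit reached and no override: refuse to initialize. */
	return false;
}
```

With the standard module parameter syntax, the override would be
requested as amdgpu.bad_page_threshold=-2 on the kernel command line, or
as bad_page_threshold=-2 when loading the module with modprobe.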

v2: squash in Luben's fix to restore RAS info reporting

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mukul Joshi <Mukul.Joshi@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu.h
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c