drm/amdgpu: Fix usage of UMC fill record in RAS
authorLuben Tuikov <luben.tuikov@amd.com>
Sat, 10 Jun 2023 10:19:15 +0000 (06:19 -0400)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 19 Jul 2023 14:21:32 +0000 (16:21 +0200)
commit0e2c51a16fcb9e69923906bdaecdbbe1ea4fb8e9
treef3d08f34a20c23aed7d685e9780e941c5b1b5e5c
parent8d68ba92554b79a93f52bea0cf778eb7821c9901
drm/amdgpu: Fix usage of UMC fill record in RAS

[ Upstream commit 71344a718a9fda8c551cdc4381d354f9a9907f6f ]

The fixed commit listed in the Fixes tag below, introduced a bug in
amdgpu_ras.c::amdgpu_reserve_page_direct(), in that when introducing the new
amdgpu_umc_fill_error_record() and internally in that new function the physical
address (argument "uint64_t retired_page"--wrong name) is right-shifted by
AMDGPU_GPU_PAGE_SHIFT. Thus, in amdgpu_reserve_page_direct() when we pass
"address" to that new function, we should NOT right-shift it, since this
results, erroneously, in the page address to be 0 for first
2^(2*AMDGPU_GPU_PAGE_SHIFT) memory addresses.

This commit fixes this bug.

Cc: Tao Zhou <tao.zhou1@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Fixes: 400013b268cb ("drm/amdgpu: add umc_fill_error_record to make code more simple")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20230610113536.10621-1-luben.tuikov@amd.com
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c