platform/kernel/linux-rpi.git
3 years agodrm/amdgpu/smu11.0: convert to IP version checking
Alex Deucher [Tue, 27 Jul 2021 21:28:00 +0000 (17:28 -0400)]
drm/amdgpu/smu11.0: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: rebase

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/amdgpu_smu: convert to IP version checking
Alex Deucher [Thu, 16 Sep 2021 20:26:31 +0000 (16:26 -0400)]
drm/amdgpu/amdgpu_smu: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: rebase
v3: switch some if statements to switch statements
v4: add yellow carp fix (Yifan)
v5: squash in fixes for YC and GS (Alex)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/navi10_ih: convert to IP version checking
Alex Deucher [Tue, 27 Jul 2021 20:45:41 +0000 (16:45 -0400)]
drm/amdgpu/navi10_ih: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/athub2.1: convert to IP version checking
Alex Deucher [Tue, 27 Jul 2021 20:39:42 +0000 (16:39 -0400)]
drm/amdgpu/athub2.1: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/athub2.0: convert to IP version checking
Alex Deucher [Tue, 27 Jul 2021 20:37:18 +0000 (16:37 -0400)]
drm/amdgpu/athub2.0: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/vcn3.0: convert to IP version checking
Alex Deucher [Mon, 9 Aug 2021 15:40:48 +0000 (11:40 -0400)]
drm/amdgpu/vcn3.0: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/mmhub2.1: convert to IP version checking
Alex Deucher [Tue, 27 Jul 2021 20:23:45 +0000 (16:23 -0400)]
drm/amdgpu/mmhub2.1: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/mmhub2.0: convert to IP version checking
Alex Deucher [Tue, 27 Jul 2021 20:21:18 +0000 (16:21 -0400)]
drm/amdgpu/mmhub2.0: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/gfxhub2.1: convert to IP version checking
Alex Deucher [Tue, 27 Jul 2021 20:05:41 +0000 (16:05 -0400)]
drm/amdgpu/gfxhub2.1: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: drive nav10 from the IP discovery table
Alex Deucher [Mon, 26 Jul 2021 20:49:21 +0000 (16:49 -0400)]
drm/amdgpu: drive nav10 from the IP discovery table

Rather than hardcoding based on asic_type, use the IP
discovery table to configure the driver.

Only tested on Navi10 so far.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Use IP discovery to drive setting IP blocks by default
Alex Deucher [Mon, 26 Jul 2021 20:46:56 +0000 (16:46 -0400)]
drm/amdgpu: Use IP discovery to drive setting IP blocks by default

Drive the asic setup from the IP discovery table rather than
hardcoded settings based on asic type.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/gmc10.0: convert to IP version checking
Alex Deucher [Wed, 28 Jul 2021 15:15:01 +0000 (11:15 -0400)]
drm/amdgpu/gmc10.0: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: squash in gmc fixes
v3: rebase

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: bind to any 0x1002 PCI display class device
Alex Deucher [Tue, 3 Aug 2021 21:18:53 +0000 (17:18 -0400)]
drm/amdgpu: bind to any 0x1002 PCI display class device

Bind to all 0x1002 GPU devices.

For now we explicitly return -ENODEV for generic bindings.
Remove this check once IP discovery based checking is in place.

v2: rebase (Alex)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: filter out radeon PCI device IDs
Alex Deucher [Tue, 3 Aug 2021 21:17:10 +0000 (17:17 -0400)]
drm/amdgpu: filter out radeon PCI device IDs

Once we claim all 0x1002 PCI display class devices, we will
need to filter out devices owned by radeon.

v2: rename radeon id array to make it more clear that
the devices are not supported by amdgpu.
    add r128, mach64 pci ids as well

Acked-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/gfx10: convert to IP version checking
Alex Deucher [Wed, 28 Jul 2021 15:10:04 +0000 (11:10 -0400)]
drm/amdgpu/gfx10: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: rebase, squash in navi10 fixes (Alex)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/sdma5.2: convert to IP version checking
Alex Deucher [Wed, 28 Jul 2021 15:06:44 +0000 (11:06 -0400)]
drm/amdgpu/sdma5.2: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/sdma5.0: convert to IP version checking
Alex Deucher [Fri, 23 Jul 2021 15:56:14 +0000 (11:56 -0400)]
drm/amdgpu/sdma5.0: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: rebase

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add initial IP enumeration via IP discovery table
Alex Deucher [Tue, 20 Jul 2021 22:27:19 +0000 (18:27 -0400)]
drm/amdgpu: add initial IP enumeration via IP discovery table

Add initial support for all navi based parts.

v2: rebase

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/nv: export common IP functions
Alex Deucher [Mon, 26 Jul 2021 19:11:44 +0000 (15:11 -0400)]
drm/amdgpu/nv: export common IP functions

So they can be driven by the IP discovery table.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add XGMI HWIP
Alex Deucher [Mon, 26 Jul 2021 19:27:26 +0000 (15:27 -0400)]
drm/amdgpu: add XGMI HWIP

So we can grab the appropriate XGMI info out of the
IP discovery table.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: fill in IP versions from IP discovery table
Alex Deucher [Tue, 20 Jul 2021 20:57:40 +0000 (16:57 -0400)]
drm/amdgpu: fill in IP versions from IP discovery table

Prerequisite for using IP versions in the driver rather
than asic type.

v2: Use IP_VERSION() macro instead of new function

Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
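
The v2 note above mentions the IP_VERSION() macro. A minimal sketch of how
such a macro lets the driver compare versions numerically is shown below. The
bit packing (major, minor, and revision in one byte each) and the helper
example_is_gfx_10_3() are assumptions made for illustration, not code taken
from this commit.

```c
#include <stdint.h>

/* Assumed packing: major in bits 23..16, minor in 15..8, revision in 7..0. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/*
 * Hypothetical feature check: true for any 10.3.x graphics IP.
 * Because the fields are packed most-significant first, ordinary
 * integer comparison orders versions correctly.
 */
static int example_is_gfx_10_3(uint32_t gc_version)
{
	return gc_version >= IP_VERSION(10, 3, 0) &&
	       gc_version < IP_VERSION(10, 4, 0);
}
```

With this shape, a check that used to enumerate asic_type values can become a
single range or switch on the IP version.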
3 years agodrm/amdgpu: store HW IP versions in the driver structure
Alex Deucher [Tue, 20 Jul 2021 20:01:41 +0000 (16:01 -0400)]
drm/amdgpu: store HW IP versions in the driver structure

So we can check the IP versions directly rather than using
asic type.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add debugfs access to the IP discovery table
Alex Deucher [Tue, 20 Jul 2021 18:53:37 +0000 (14:53 -0400)]
drm/amdgpu: add debugfs access to the IP discovery table

Useful for debugging and new asic validation.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: move headless sku check into harvest function
Alex Deucher [Mon, 9 Aug 2021 15:37:55 +0000 (11:37 -0400)]
drm/amdgpu: move headless sku check into harvest function

Consolidate harvesting information.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: resolve RAS query bug
John Clements [Wed, 29 Sep 2021 07:06:21 +0000 (15:06 +0800)]
drm/amdgpu: resolve RAS query bug

Clear the error count when persistent harvesting is not enabled.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Only define DP 2.0 symbols if not already defined
Harry Wentland [Wed, 22 Sep 2021 17:17:28 +0000 (13:17 -0400)]
drm/amd/display: Only define DP 2.0 symbols if not already defined

[Why]
For some reason we're defining DP 2.0 definitions inside our
driver. Now that patches to introduce relevant definitions
are slated to be merged into drm-next this is causing conflicts.

In file included from drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:33:
In file included from ./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:70:
In file included from ./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_mode.h:36:
./include/drm/drm_dp_helper.h:1322:9: error: 'DP_MAIN_LINK_CHANNEL_CODING_PHY_REPEATER' macro redefined [-Werror,-Wmacro-redefined]
        ^
./drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dp_types.h:881:9: note: previous definition is here
        ^
1 error generated.

[How]
Guard all display driver defines with #ifndef for now. Once we pull
in the new definitions into amd-staging-drm-next we will follow
up and drop definitions from our driver and provide follow-up
header updates for any additional DP 2.0 definitions required
by our driver.

We also ensure drm_dp_helper.h is included before dc_dp_types.h.

v3: Ensure drm_dp_helper.h is included before dc_dp_types.h

v2: Add one missing endif

Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
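
The #ifndef guarding described in [How] can be sketched as follows. The macro
names and values are hypothetical placeholders, not the real DP 2.0 register
definitions; the first #define stands in for a shared header that was
included earlier.

```c
/* Stand-in for a definition already supplied by a shared header. */
#define EXAMPLE_DP_SYMBOL 0x10

/* Driver-local fallback: skipped here because the symbol already exists. */
#ifndef EXAMPLE_DP_SYMBOL
#define EXAMPLE_DP_SYMBOL 0x20
#endif

/* Driver-local fallback: takes effect because nothing defined it earlier. */
#ifndef EXAMPLE_DP_DRIVER_ONLY
#define EXAMPLE_DP_DRIVER_ONLY 0x30
#endif
```

The guard only helps when the shared header is included first, which is why
the include order matters in this change.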
3 years agoamd/amdkfd: add ras page retirement handling for sq/sdma (v3)
Tao Zhou [Thu, 23 Sep 2021 06:11:22 +0000 (14:11 +0800)]
amd/amdkfd: add ras page retirement handling for sq/sdma (v3)

In ras poison mode, page retirement will be handled by the irq handler of the
module which consumes corrupted data.

v2: rename ras_process_cb to ras_poison_consumption_handler.
    move the handler's implementation from the ASIC-specific file to a
    common file.

v3: call gpu reset for xGMI connected mode.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
Prike Liang [Wed, 25 Aug 2021 05:36:38 +0000 (13:36 +0800)]
drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix

In the s2idle stress test, SDMA resume occasionally fails; in the
failing case the GPU is in the gfxoff state. The issue may be caused
by firmware mishandling doorbell S/R. For now, work around it by
forcing an exit from gfxoff on SDMA resume.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: add cyan_skillfish display support
Zhan Liu [Sat, 25 Sep 2021 07:01:48 +0000 (00:01 -0700)]
drm/amd/display: add cyan_skillfish display support

[Why]
Add the display-related cyan_skillfish files.

The makefile entries are controlled by the CONFIG_DRM_AMD_DC_DCN201 flag.

v2: squash in clang fixes from Harry, Nathan
v3: squash in missing CONFIG_DRM_AMD_DC check (Alex)

Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Jun Lei <jun.lei@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add cyan_skillfish asic header files
Zhan Liu [Sat, 25 Sep 2021 07:51:08 +0000 (00:51 -0700)]
drm/amdgpu: add cyan_skillfish asic header files

This patch is to add cyan_skillfish asic header files.

Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Jun Lei <jun.lei@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Add a UAPI flag for hot plug/unplug
Andrey Grodzovsky [Tue, 24 Aug 2021 20:38:20 +0000 (16:38 -0400)]
drm/amdgpu: Add a UAPI flag for hot plug/unplug

To support libdrm tests.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: drm/amdgpu: Handle IOMMU enabled case
Andrey Grodzovsky [Tue, 24 Aug 2021 20:15:48 +0000 (16:15 -0400)]
drm/amdgpu: drm/amdgpu: Handle IOMMU enabled case

Handle all DMA IOMMU group related dependencies before the
group is removed and we try to access it after free.

v2:
Move the actual handling function to TTM

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/amdgpu: Validate ip discovery blob
Ernst Sjöstrand [Sun, 26 Sep 2021 21:27:19 +0000 (23:27 +0200)]
drm/amd/amdgpu: Validate ip discovery blob

For example, we use the number_instance index from the fw discovery blob
to index into an array, so the blob must be validated before it is used.

Update error messages (Alex)

Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
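
A minimal sketch of the kind of validation described above: an index read
from an untrusted blob is range-checked before it is used as an array
subscript. The structure, the array bound, and the function name are
hypothetical; only number_instance comes from the commit message.

```c
#include <stdint.h>

#define EXAMPLE_MAX_INSTANCES 8	/* hypothetical array bound */

struct example_ip_table {
	uint32_t base[EXAMPLE_MAX_INSTANCES];
};

/*
 * Store a base address for the given instance.
 * Returns 0 on success, -1 if the index from the blob is out of range.
 */
static int example_store_base(struct example_ip_table *t,
			      uint8_t number_instance, uint32_t base)
{
	if (number_instance >= EXAMPLE_MAX_INSTANCES)
		return -1;

	t->base[number_instance] = base;
	return 0;
}
```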
3 years agogpu: amd: replace open-coded offsetof() with builtin
Arnd Bergmann [Mon, 27 Sep 2021 12:20:41 +0000 (14:20 +0200)]
gpu: amd: replace open-coded offsetof() with builtin

The two AMD drivers have their own custom offsetof() implementation
that now triggers a warning with recent versions of clang:

drivers/gpu/drm/radeon/radeon_atombios.c:133:14: error: performing pointer subtraction with a null pointer has undefined behavior [-Werror,-Wnull-pointer-subtraction]

Change all the instances to use the normal offsetof() provided
by the kernel that does not have this problem.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
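
For illustration, the standard offsetof() from <stddef.h> is used as below.
The structure is a hypothetical stand-in, not one of the ATOM BIOS tables
touched by this change.

```c
#include <stddef.h>

/* Hypothetical table layout used only for this example. */
struct example_table {
	unsigned short header;
	unsigned char rev;
	unsigned int entries[4];
};

/*
 * offsetof() is evaluated by the compiler, so no pointer arithmetic
 * on a null pointer is involved, unlike an open-coded variant.
 */
static size_t example_entries_offset(void)
{
	return offsetof(struct example_table, entries);
}
```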
3 years agodrm/amdkfd: fix resource_size.cocci warnings
Yang Li [Sun, 26 Sep 2021 07:16:20 +0000 (15:16 +0800)]
drm/amdkfd: fix resource_size.cocci warnings

Use resource_size function on resource object
instead of explicit computation.

Clean up coccicheck warning:
./drivers/gpu/drm/amd/amdkfd/kfd_migrate.c:905:10-13: ERROR: Missing
resource_size with res

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Reviewed-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
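
The explicit computation that resource_size() replaces is "end - start + 1",
since a resource range is inclusive at both ends. A user-space sketch under
that assumption, using a hypothetical struct rather than the kernel's
struct resource:

```c
#include <stdint.h>

/* Hypothetical stand-in for a resource range; both bounds are inclusive. */
struct example_resource {
	uint64_t start;
	uint64_t end;
};

/* Size in bytes of an inclusive [start, end] range. */
static uint64_t example_resource_size(const struct example_resource *res)
{
	return res->end - res->start + 1;
}
```

Using one helper avoids the easy-to-miss "+ 1" at each call site.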
3 years agodrm/amdgpu: fix warning for overflow check
Arnd Bergmann [Mon, 27 Sep 2021 12:58:10 +0000 (14:58 +0200)]
drm/amdgpu: fix warning for overflow check

The overflow check in amdgpu_bo_list_create() causes a warning with
clang-14 on 64-bit architectures, since the limit can never be
exceeded.

drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:74:18: error: result of comparison of constant 256204778801521549 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
        if (num_entries > (SIZE_MAX - sizeof(struct amdgpu_bo_list))
            ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The check remains useful for 32-bit architectures, so just avoid the
warning by using size_t as the type for the count.

Fixes: 920990cb080a ("drm/amdgpu: allocate the bo_list array after the list")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
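
A sketch of the approach described above: taking the count as size_t keeps
the comparison meaningful on both 32-bit and 64-bit builds, so the compiler
no longer sees a tautology. The struct layouts and the function name are
hypothetical; only the shape of the check follows the quoted warning.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the list header and its entries. */
struct example_list {
	unsigned int num_entries;
};

struct example_entry {
	void *bo;
	unsigned int priority;
};

/*
 * Compute the allocation size for a header followed by num_entries
 * entries. Returns 0 on success, -1 if the size would overflow size_t.
 */
static int example_list_size(size_t num_entries, size_t *size_out)
{
	if (num_entries > (SIZE_MAX - sizeof(struct example_list)) /
			  sizeof(struct example_entry))
		return -1;

	*size_out = sizeof(struct example_list) +
		    num_entries * sizeof(struct example_entry);
	return 0;
}
```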
3 years agodrm/amdgpu: check tiling flags when creating FB on GFX8-
Simon Ser [Mon, 27 Sep 2021 15:08:44 +0000 (15:08 +0000)]
drm/amdgpu: check tiling flags when creating FB on GFX8-

On GFX9+, format modifiers are always enabled and ensure the
frame-buffers can be scanned out at ADDFB2 time.

On GFX8-, format modifiers are not supported and no other check
is performed. This means ADDFB2 IOCTLs will succeed even if the
tiling isn't supported for scan-out, and will result in garbage
displayed on screen [1].

Fix this by adding a check for tiling flags for GFX8 and older.
The check is taken from radeonsi in Mesa (see how is_displayable
is populated in gfx6_compute_surface).

Changes in v2: use drm_WARN_ONCE instead of drm_WARN (Michel)

[1]: https://github.com/swaywm/wlroots/issues/3185

Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/amdgpu: Add missing mp_11_0_8_sh_mask.h header
Tom St Denis [Fri, 24 Sep 2021 14:28:31 +0000 (10:28 -0400)]
drm/amd/amdgpu: Add missing mp_11_0_8_sh_mask.h header

The commit 2766534b766e1b12e0fa0a4e2e26929e808fde71 added the offset
header but didn't add the masks.  This adds the masks based on what
was selected for the offsets.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Pass PCI deviceid into DC
Charlene Liu [Mon, 20 Sep 2021 18:30:02 +0000 (14:30 -0400)]
drm/amd/display: Pass PCI deviceid into DC

[why]
The PCI device ID is not passed to DAL/DC, and without a proper break
dcn2.x falls through into the dcn3.x code path.

[how]
Pass in the PCI device ID, and break once dal_version is initialized.

Reviewed-by: Zhan Liu <Zhan.Liu@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Update VCP X.Y logging to improve usefulness
George Shen [Thu, 16 Sep 2021 23:59:34 +0000 (19:59 -0400)]
drm/amd/display: Update VCP X.Y logging to improve usefulness

[Why]
Recent debugging efforts have involved setting/checking the
X.Y value used during payload allocation. The current output for
Y was calculated with an incorrect bit shift. The Y value is also
not human readable.

[How]
Refactor logging into separate function. Fix Y calculation error
and format output to be human readable.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Handle Y carry-over in VCP X.Y calculation
George Shen [Thu, 16 Sep 2021 23:55:39 +0000 (19:55 -0400)]
drm/amd/display: Handle Y carry-over in VCP X.Y calculation

[Why/How]
In a theoretically rare corner case, ceil(Y) rounds up to a whole
integer. If this happens, the 1 should be carried over to
the X value.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
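
The carry-over can be sketched with plain integer fixed-point arithmetic. The
input format (32 fractional bits), the 26-bit width of Y, and all names below
are assumptions chosen for illustration; the driver's own fixed-point helpers
are not reproduced here.

```c
#include <stdint.h>

#define EXAMPLE_IN_FRAC_BITS 32	/* assumed fractional bits of the input */
#define EXAMPLE_Y_BITS 26	/* assumed fractional bits kept in Y */

struct example_vcp {
	uint32_t x;	/* integer part */
	uint32_t y;	/* fractional part, in units of 2^-26 */
};

/* Split a fixed-point slot count into X.Y, rounding Y up. */
static struct example_vcp example_vcp_xy(uint64_t slots_fixed)
{
	const unsigned int shift = EXAMPLE_IN_FRAC_BITS - EXAMPLE_Y_BITS;
	uint64_t frac = slots_fixed & ((1ULL << EXAMPLE_IN_FRAC_BITS) - 1);
	struct example_vcp v;

	v.x = (uint32_t)(slots_fixed >> EXAMPLE_IN_FRAC_BITS);
	/* ceil(frac / 2^shift) */
	v.y = (uint32_t)((frac + (1ULL << shift) - 1) >> shift);

	/* If rounding up made Y reach 1.0, carry the 1 into X. */
	if (v.y >> EXAMPLE_Y_BITS) {
		v.x += 1;
		v.y = 0;
	}
	return v;
}
```

Without the carry, an input just below the next integer would produce a Y
value that no longer fits in its field.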
3 years agodrm/amd/display: make verified link cap not exceeding max link cap
Wenjing Liu [Fri, 17 Sep 2021 21:03:02 +0000 (17:03 -0400)]
drm/amd/display: make verified link cap not exceeding max link cap

[why]
There is a chance that the verified link cap can be greater than the max link cap.
This causes a software hang because we cannot power up the PHY with a link rate
that it cannot handle.
The change guards the verified link cap from becoming larger than the max link cap
our PHY can support.

Reviewed-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
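
One way to express such a guard is a per-field clamp, sketched below. The
struct, its fields, and the choice to clamp lane count and link rate
independently are assumptions for illustration and may differ from what the
driver does.

```c
/* Hypothetical link capability: larger values mean more capability. */
struct example_link_cap {
	unsigned int lane_count;
	unsigned int link_rate;
};

/* Return the verified cap, limited so it never exceeds the max cap. */
static struct example_link_cap
example_clamp_link_cap(struct example_link_cap verified,
		       struct example_link_cap max)
{
	if (verified.lane_count > max.lane_count)
		verified.lane_count = max.lane_count;
	if (verified.link_rate > max.link_rate)
		verified.link_rate = max.link_rate;
	return verified;
}
```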
3 years agodrm/amd/display: initialize backlight_ramping_override to false
Josip Pavic [Fri, 17 Sep 2021 15:01:47 +0000 (11:01 -0400)]
drm/amd/display: initialize backlight_ramping_override to false

[Why]
Stack variable params.backlight_ramping_override is uninitialized, so it
contains junk data

[How]
Initialize the variable to false

Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Josip Pavic <Josip.Pavic@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Defer LUT memory powerdown until LUT bypass latches
Michael Strauss [Thu, 9 Sep 2021 20:33:52 +0000 (16:33 -0400)]
drm/amd/display: Defer LUT memory powerdown until LUT bypass latches

[WHY]
Blnd, 3dlut, and shaper LUT select registers are double buffered, however
their accompanying LUT memory shutdown registers are not. As a result,
shutting down LUT memory immediately after setting a block to bypass causes
corruption as bypass only happens at next Vupdate.

[HOW]
Re-enable mem low power for CM block
Force optimization on next flip and disable LUT memory during optimization
sequence if LUT select field is then set to bypass

v2: squash in CONFIG_DRM_AMD_DC_DCN fix (Alex)

Reviewed-by: Eric Yang <Eric.Yang2@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Replace referral of dal with dc
Qingqing Zhuo [Fri, 17 Sep 2021 06:36:24 +0000 (14:36 +0800)]
drm/amd/display: Replace referral of dal with dc

[Why]
DC should be used in place of DAL in
upstream.

[How]
Replace dal with dc in function names.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: 3.2.155
Aric Cyr [Mon, 20 Sep 2021 02:28:17 +0000 (22:28 -0400)]
drm/amd/display: 3.2.155

This version brings along the following fixes:
- Fixes to backlight, LUT, PPS, MST
- Use correct vpg for 128b/132b encoding
- Improved logging for VCP
- Replace referral of dal with dc

Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: [FW Promotion] Release 0.0.86
Anthony Koo [Sun, 19 Sep 2021 15:37:16 +0000 (11:37 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.86

Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Add an extra check for dcn10 OPTC data format
Oliver Logush [Tue, 14 Sep 2021 14:05:00 +0000 (10:05 -0400)]
drm/amd/display: Add an extra check for dcn10 OPTC data format

Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Oliver Logush <oliver.logush@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Add PPS immediate update flag for DCN2
Ilya [Wed, 15 Sep 2021 21:37:59 +0000 (17:37 -0400)]
drm/amd/display: Add PPS immediate update flag for DCN2

[Why]
This change is needed for DCN2 to make use of the immediate_update
flag. With this flag, update to PPS will be immediate, rather than
always taking place on dig_update signal.

[How]
Set AFMT_GENERIC7_FRAME/IMMEDIATE_UPDATE bits depending on flag
value.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Ilya <Ilya.Bakoulin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix MST link encoder availability check.
Jimmy Kizito [Wed, 15 Sep 2021 19:24:45 +0000 (15:24 -0400)]
drm/amd/display: Fix MST link encoder availability check.

[Why]
MST streams share the same link and should share the same encoder.
The current availability check may erroneously determine that an
encoder is unavailable for MST streams.

[How]
When checking for link encoder availability, check if an encoder
in use shares a link with the stream for which the availability
check is being conducted. If the link is shared, then the link
encoder should be shared too and will be deemed available.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
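
The availability rule in [How] can be sketched as a small predicate: an
encoder counts as available if it is unused, or if it is already serving the
same link as the stream being checked. The struct and the integer link
identifier are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for one link encoder. */
struct example_encoder_slot {
	bool in_use;
	int link_id;	/* link currently served, valid when in_use */
};

/* MST streams on the same link share the encoder, so it stays available. */
static bool example_encoder_available(const struct example_encoder_slot *slot,
				      int stream_link_id)
{
	if (!slot->in_use)
		return true;
	return slot->link_id == stream_link_id;
}
```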
3 years agodrm/amd/display: Fix for link encoder access for MST.
Meenakshikumar Somasundaram [Thu, 2 Sep 2021 18:09:30 +0000 (14:09 -0400)]
drm/amd/display: Fix for link encoder access for MST.

[Why]
Link encoder in the link could be null for certain links.

[How]
If link encoder in the link is null then get the link encoder
from the stream.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: add function to convert hw to dpcd lane settings
Wenjing Liu [Fri, 10 Sep 2021 23:18:29 +0000 (19:18 -0400)]
drm/amd/display: add function to convert hw to dpcd lane settings

[why]
Unify the code which handles the conversion between hw lane setting
and dpcd lane setting.

v2: squash in unused variable fixes (Alex)

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: update cur_lane_setting to an array one for each lane
Wenjing Liu [Fri, 10 Sep 2021 22:00:52 +0000 (18:00 -0400)]
drm/amd/display: update cur_lane_setting to an array one for each lane

[why]
To support per-lane lane setting adjustment, we need to change cur_lane_setting
to an array with one entry for each lane as the first step.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Add debug support to override the Minimum DRAM Clock
David Galiffi [Mon, 13 Sep 2021 22:05:24 +0000 (18:05 -0400)]
drm/amd/display: Add debug support to override the Minimum DRAM Clock

[Why]
Requested feature to assist with Thermal, Acoustic, Power, and
Performance tuning.

[How]
Add a debug field that will override calculated minimum DRAM clock,
if the debug value is larger than the calculated value.

Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
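
The override rule in [How] amounts to taking the larger of the two values. A
minimal sketch, with a hypothetical function name and kHz units assumed:

```c
/*
 * Return the minimum DRAM clock to use: the debug override wins only
 * when it is larger than the calculated value. An override of 0
 * therefore leaves the calculated value in place.
 */
static unsigned int example_min_dram_clock_khz(unsigned int calculated_khz,
					       unsigned int debug_override_khz)
{
	return debug_override_khz > calculated_khz ? debug_override_khz
						    : calculated_khz;
}
```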
3 years agodrm/amd/display: add vsync notify to dmub for abm pause
Eric Yang [Fri, 10 Sep 2021 17:43:49 +0000 (13:43 -0400)]
drm/amd/display: add vsync notify to dmub for abm pause

[Why]
To prevent unnecessary wake up of DMCUB when ABM is enabled without PSR
enabled, driver will notify DMCUB to stop ABM's vertical interrupts
if vsync is disabled and steady state is reached.

[How]
Send inbox message to notify ABM pause based on vsync on/off

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Don't enable AFMT for DP audio stream
Michael Strauss [Mon, 13 Sep 2021 17:47:13 +0000 (13:47 -0400)]
drm/amd/display: Don't enable AFMT for DP audio stream

[WHY]
AFMT is unused for DP audio, so powering it on for DP is unnecessary.

[HOW]
APG block should be powered down instead, however HW defaults to shutdown
state when not enabled so no further work is required.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: [FW Promotion] Release 0.0.85
Anthony Koo [Tue, 14 Sep 2021 03:12:35 +0000 (23:12 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.85

Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: use correct vpg instance for 128b/132b encoding
Wenjing Liu [Mon, 13 Sep 2021 15:25:56 +0000 (11:25 -0400)]
drm/amd/display: use correct vpg instance for 128b/132b encoding

[why]
128b/132b uses the vpg instance assigned to hpo dp stream encoder.
The current vpg used is assigned to dio stream encoder.
This is incorrect and causes a black screen because the
actual vpg is powered off.

Reviewed-by: Michael Strauss <michael.strauss@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: correct initial cp_hqd_quantum for gfx9
Hawking Zhang [Sun, 26 Sep 2021 14:19:35 +0000 (22:19 +0800)]
drm/amdgpu: correct initial cp_hqd_quantum for gfx9

The value of mmCP_HQD_QUANTUM was not read from the correct
register offset.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: skip umc ras irq handling in poison mode (v2)
Tao Zhou [Fri, 17 Sep 2021 10:40:57 +0000 (18:40 +0800)]
drm/amdgpu: skip umc ras irq handling in poison mode (v2)

In ras poison mode, umc uncorrectable errors will be ignored until
the corrupted data is consumed by another ras module (such as gfx, sdma).

v2: update the debug message and replace dev_warn with dev_info.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: set poison supported flag for RAS (v2)
Tao Zhou [Fri, 17 Sep 2021 10:24:09 +0000 (18:24 +0800)]
drm/amdgpu: set poison supported flag for RAS (v2)

Add RAS poison supported flag and tell PSP RAS TA about the info.

v2: rename poison mode to poison supported, since poison mode can still be
disabled even when it is supported.
    print value of poison supported if ras feature enablement fails.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add poison mode query for UMC
Tao Zhou [Fri, 17 Sep 2021 10:18:43 +0000 (18:18 +0800)]
drm/amdgpu: add poison mode query for UMC

Add ras poison mode query interface for UMC.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add poison mode query for DF (v2)
Tao Zhou [Fri, 17 Sep 2021 10:15:23 +0000 (18:15 +0800)]
drm/amdgpu: add poison mode query for DF (v2)

Add ras poison mode query interface for DF.

v2: replace RREG32_PCIE with RREG32_SOC15.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Update PSP TA Invoke to use common TA context as input
Candice Li [Thu, 23 Sep 2021 11:37:52 +0000 (19:37 +0800)]
drm/amdgpu: Update PSP TA Invoke to use common TA context as input

Updated invoke to use new common TA structure similarly to load/unload.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix Display Flicker on embedded panels
Praful Swarnakar [Wed, 22 Sep 2021 17:31:29 +0000 (23:01 +0530)]
drm/amd/display: Fix Display Flicker on embedded panels

[Why]
ASSR is dependent on Signed PSP Verstage to enable Content
Protection for eDP panels. Unsigned PSP verstage is used
during development phase causing ASSR to FAIL.
As a result, link training is performed with
DP_PANEL_MODE_DEFAULT instead of DP_PANEL_MODE_EDP for
eDP panels that causes display flicker on some panels.

[How]
- Do not change the panel mode if ASSR is disabled
- Just report it and continue to perform eDP link training
with the right settings.

Signed-off-by: Praful Swarnakar <Praful.Swarnakar@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: fix gart.bo pin_count leak
Leslie Shi [Thu, 23 Sep 2021 08:05:31 +0000 (16:05 +0800)]
drm/amdgpu: fix gart.bo pin_count leak

gmc_v{9,10}_0_gart_disable() is not called to match the
corresponding gart_enable function in the SRIOV case. This
leads to a gart.bo pin_count leak on driver unload.
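
The pairing rule can be illustrated with a small userspace sketch. The
struct and function names below are hypothetical stand-ins, not the amdgpu
implementation; the only point is that every enable that pins the buffer
needs a matching disable that unpins it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pinned buffer object. */
struct fake_bo {
	int pin_count;
};

/* Enabling the table pins the buffer. */
static void fake_gart_enable(struct fake_bo *bo)
{
	bo->pin_count++;
}

/* Disabling the table drops the pin taken by the enable. */
static void fake_gart_disable(struct fake_bo *bo)
{
	assert(bo->pin_count > 0);
	bo->pin_count--;
}

/*
 * Simulate one load/unload cycle and return the pin count left behind.
 * A non-zero result means the unload path skipped the disable call.
 */
static int fake_load_unload(bool unload_calls_disable)
{
	struct fake_bo bo = { .pin_count = 0 };

	fake_gart_enable(&bo);
	if (unload_calls_disable)
		fake_gart_disable(&bo);
	return bo.pin_count;
}
```

A leftover count of 1 in the unmatched case corresponds to the leak
described above.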

Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agoMerge tag 'amd-drm-next-5.16-2021-09-27' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Tue, 28 Sep 2021 07:08:21 +0000 (17:08 +1000)]
Merge tag 'amd-drm-next-5.16-2021-09-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-5.16-2021-09-27:

amdgpu:
- RAS improvements
- BACO fixes
- Yellow Carp updates
- Misc code cleanups
- Initial DP 2.0 support
- VCN priority handling
- Cyan Skillfish updates
- Rework IB handling for multimedia engine tests
- Backlight fixes
- DCN 3.1 power saving improvements
- Runtime PM fixes
- Modifier support for DCC image stores for gfx 10.3
- Hotplug fixes
- Clean up stack related warnings in display code
- DP alt mode fixes
- Display rework for better handling FP code
- Debugfs fixes

amdkfd:
- SVM fixes
- DMA map fixes

radeon:
- AGP fix

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210927212653.4575-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
3 years agoMerge tag 'drm-misc-next-2021-09-23' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 24 Sep 2021 03:48:48 +0000 (13:48 +1000)]
Merge tag 'drm-misc-next-2021-09-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.15:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:

Driver Changes:
  - Conversions to dev_err_probe() helper
  - rockchip: Various build improvements, Use
    DRM_BRIDGE_ATTACH_NO_CONNECTOR for LVDS and RGB
  - panel: New panel-edp driver

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210923074522.zaja7mzxeimxf6g3@gilmour
3 years agodrm/amdgpu: make soc15_common_ip_funcs static
Alex Deucher [Fri, 30 Jul 2021 18:46:53 +0000 (14:46 -0400)]
drm/amdgpu: make soc15_common_ip_funcs static

It's not used outside of soc15.c

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: Update intermediate power state for SI
Lijo Lazar [Thu, 23 Sep 2021 03:58:43 +0000 (11:58 +0800)]
drm/amd/pm: Update intermediate power state for SI

Update the current state as boot state during dpm initialization.
During the subsequent initialization, set_power_state gets called to
transition to the final power state. set_power_state refers to values
from the current state and without current state populated, it could
result in NULL pointer dereference.

For example: on platforms where PCI speed change is supported through ACPI
ATCS method, the link speed of current state needs to be queried before
deciding on changing to final power state's link speed. The logic to query
ATCS-support was broken on certain platforms. The issue became visible
when broken ATCS-support logic got fixed with commit
f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)").

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1698

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Remove all code paths under the EAGAIN path in RAS late init
Candice Li [Wed, 15 Sep 2021 07:14:18 +0000 (15:14 +0800)]
drm/amdgpu: Remove all code paths under the EAGAIN path in RAS late init

All code paths under the EAGAIN path in RAS late init are unused.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Consolidate RAS cmd warning messages
John Clements [Thu, 23 Sep 2021 08:36:09 +0000 (16:36 +0800)]
drm/amdgpu: Consolidate RAS cmd warning messages

Explicitly post a warning if a cmd is issued against an unsupported IP

Update to latest RAS TA interface

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdkfd: fix svm_migrate_fini warning
Philip Yang [Mon, 20 Sep 2021 21:25:52 +0000 (17:25 -0400)]
drm/amdkfd: fix svm_migrate_fini warning

Device manager releases device-specific resources when a driver
disconnects from a device, so the devm_memunmap_pages and
devm_release_mem_region calls in svm_migrate_fini are redundant.

They cause the warning trace below after patch "drm/amdgpu: Split
amdgpu_device_fini into early and late", so remove function
svm_migrate_fini.

BUG: https://gitlab.freedesktop.org/drm/amd/-/issues/1718

WARNING: CPU: 1 PID: 3646 at drivers/base/devres.c:795
devm_release_action+0x51/0x60
Call Trace:
    ? memunmap_pages+0x360/0x360
    svm_migrate_fini+0x2d/0x60 [amdgpu]
    kgd2kfd_device_exit+0x23/0xa0 [amdgpu]
    amdgpu_amdkfd_device_fini_sw+0x1d/0x30 [amdgpu]
    amdgpu_device_fini_sw+0x45/0x290 [amdgpu]
    amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
    drm_dev_release+0x20/0x40 [drm]
    release_nodes+0x196/0x1e0
    device_release_driver_internal+0x104/0x1d0
    driver_detach+0x47/0x90
    bus_remove_driver+0x7a/0xd0
    pci_unregister_driver+0x3d/0x90
    amdgpu_exit+0x11/0x20 [amdgpu]

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdkfd: handle svm migrate init error
Philip Yang [Fri, 17 Sep 2021 18:32:14 +0000 (14:32 -0400)]
drm/amdkfd: handle svm migrate init error

If svm migration init failed to create pgmap for device memory, set
pgmap type to 0 to disable device SVM support capability.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Updated RAS infrastructure
John Clements [Wed, 22 Sep 2021 06:04:52 +0000 (14:04 +0800)]
drm/amdgpu: Updated RAS infrastructure

Update RAS infrastructure to support RAS query for MCA subblocks

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage
Guchun Chen [Sat, 18 Sep 2021 05:43:41 +0000 (13:43 +0800)]
drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage

adev->rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent
access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu
will still use adev->rmmio for access after amdgpu_device_unmap_mmio.
This patch moves the SRIOV call earlier, to the fini_early stage.

Fixes: 07775fc13878 ("drm/amdgpu: Unmap all MMIO mappings")
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix wrong format specifier in amdgpu_dm.c
Hayden Goodfellow [Mon, 13 Sep 2021 01:32:09 +0000 (21:32 -0400)]
drm/amd/display: Fix wrong format specifier in amdgpu_dm.c

[Why]
Currently, the 32bit kernel build fails due to an incorrect string
format specifier. ARRAY_SIZE() returns size_t type as it uses sizeof().
However, we specify it in a string as %ld. This causes a compiler error
and causes the 32bit build to fail.

[How]
Change the %ld to %zu as size_t (which sizeof() returns) is an unsigned
integer data type. We use 'z' to ensure it also works with 64bit build.
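
As a minimal userspace sketch of the same idea (format_count is a
hypothetical helper, and ARRAY_SIZE here is a simplified version of the
kernel macro without its type checking), %zu prints a size_t correctly
regardless of how wide size_t is on the target:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified ARRAY_SIZE; the result has type size_t because of sizeof. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: writes "count=<n>" into buf and returns the length. */
static int format_count(char *buf, size_t len, size_t count)
{
	assert(buf != NULL && len > 0);
	/* %zu is the conversion for size_t, whatever its width on the target. */
	return snprintf(buf, len, "count=%zu", count);
}
```

With %ld the argument would have to be a long, which size_t is not on
32-bit targets, hence the compiler diagnostic.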

Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Hayden Goodfellow <Hayden.Goodfellow@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: 3.2.154
Aric Cyr [Mon, 13 Sep 2021 04:05:54 +0000 (00:05 -0400)]
drm/amd/display: 3.2.154

This new DC version brings improvements in the following areas:
- New firmware version
- Fix HPD problems on DCN2
- Fix generic encoder problems and null dereferences
- Adjust DCN301 watermark
- Rework dynamic bpp for DCN3x
- Improve link training fallback logic

Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: [FW Promotion] Release 0.0.84
Anthony Koo [Mon, 13 Sep 2021 00:04:13 +0000 (20:04 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.84

Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix null pointer dereference for encoders
Jimmy Kizito [Sun, 12 Sep 2021 15:21:52 +0000 (11:21 -0400)]
drm/amd/display: Fix null pointer dereference for encoders

[Why]
Links which are dynamically assigned link encoders have their link
encoder set to NULL.

[How]
Check that a pointer to a link_encoder object is non-NULL before using
it.
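
A rough sketch of the guard, using hypothetical simplified types rather
than the real display core structures (the field and function names are
assumptions made for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the display core types. */
struct fake_link_encoder {
	int preferred_engine;
};

struct fake_link {
	/* NULL when the encoder is assigned dynamically. */
	struct fake_link_encoder *link_enc;
};

/* Returns the engine id, or -1 when the link has no static encoder. */
static int fake_link_engine(const struct fake_link *link)
{
	assert(link != NULL);
	if (link->link_enc == NULL)
		return -1;
	return link->link_enc->preferred_engine;
}
```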

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Creating a fw boot options bit for an upcoming feature
Meenakshikumar Somasundaram [Fri, 10 Sep 2021 15:18:41 +0000 (11:18 -0400)]
drm/amd/display: Creating a fw boot options bit for an upcoming feature

[Why]
Need a bit for x86 driver to enable a FW boot option for an upcoming
feature.

[How]
Added a bit in dmub_fw_boot_options for an upcoming feature.

Reviewed-by: Jimmy Kizito <jimmy.kizito@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: DIG mapping change is causing a blocker
Liu, Zhan [Fri, 10 Sep 2021 18:50:08 +0000 (14:50 -0400)]
drm/amd/display: DIG mapping change is causing a blocker

[Why]
DIG mapping change is causing a blocker

[How]
Revert the change for now. We will re-implement it later.

Reviewed-by: Jimmy Kizito <jimmy.kizito@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Zhan Liu <Zhan.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix B0 USB-C DP Alt mode
Liu, Zhan [Thu, 9 Sep 2021 17:26:37 +0000 (13:26 -0400)]
drm/amd/display: Fix B0 USB-C DP Alt mode

[Why]
Starting from B0, along with RDPCSTX, RDPCSPIPE registers are also used.

[How]
Make sure RDPCSPIPE registers are programmed correctly.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Zhan Liu <Zhan.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Disable mem low power for CM HW block on DCN3.1
Michael Strauss [Wed, 8 Sep 2021 18:39:09 +0000 (14:39 -0400)]
drm/amd/display: Disable mem low power for CM HW block on DCN3.1

[WHY]
Currently causes visible flicker in some scenarios on OLED eDPs

Reviewed-by: Haonan Wang <haonan.wang2@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix issue with dynamic bpp change for DCN3x
Guo, Bing [Tue, 24 Aug 2021 16:08:22 +0000 (12:08 -0400)]
drm/amd/display: Fix issue with dynamic bpp change for DCN3x

Why:
Screen sometimes would have artifacts or blink once at the time when bpp
is dynamically changed.

How:
1. Changed to update PPS infopacket in frame mode instead of immediate mode
   since other updates for bpp change are double-buffered.
2. Changed double-buffering enablement programming for DCN30 as advised by
ASIC team

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Bing Guo <Bing.Guo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Use adjusted DCN301 watermarks
Nikola Cornij [Wed, 8 Sep 2021 02:09:01 +0000 (22:09 -0400)]
drm/amd/display: Use adjusted DCN301 watermarks

[why]
If DCN30 watermark calc is used for DCN301, the calculated values are
wrong due to the data structure mismatch between DCN30 and DCN301.
However, using the original DCN301 watermark values causes underflow.

[how]
- Add DCN21-style watermark calculations
- Adjust DCN301 watermark values to remove the underflow

Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Added power down on boot for DCN3
Lai, Derek [Fri, 3 Sep 2021 04:31:17 +0000 (12:31 +0800)]
drm/amd/display: Added power down on boot for DCN3

[Why]
The change that sets a 10 second timer callback on boot still works,
but power down on boot and power down were lost for DCN3.

[How]
Added power down on boot and power down for DCN3.

Reviewed-by: Anthony Koo <anthony.koo@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Derek Lai <Derek.Lai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix dynamic encoder reassignment
Jimmy Kizito [Thu, 2 Sep 2021 14:01:02 +0000 (10:01 -0400)]
drm/amd/display: Fix dynamic encoder reassignment

[Why]
Incorrect encoder assignments were being used while applying a new state
to hardware.

(1) When committing a new state to hardware requires resetting the
back-end, the encoder assignments of the current or old state should be
used when disabling the back-end; and the encoder assignments for the
next or new state should be used when re-enabling the back-end.

(2) Link training on hot plug could take over an encoder already in use
by another stream without first disabling it.

[How]

(1) Introduce a resource context 'link_enc_cfg_context' which includes:
- a mode to indicate when transitioning from current to next state.
- transient encoder assignments to use during this state transition.

Update the encoder configuration interface to respond to queries about
encoder assignment based on the mode of operation.

(2) Check if an encoder is already in use before attempting to perform
link training on hot plug.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix concurrent dynamic encoder assignment
Jimmy Kizito [Sat, 28 Aug 2021 16:07:11 +0000 (12:07 -0400)]
drm/amd/display: Fix concurrent dynamic encoder assignment

[Why]
Trying to enable multiple displays simultaneously exposed shortcomings
with the algorithm for dynamic link encoder assignment.

The main problems were:
- Assuming stream order remained constant across states would sometimes
lead to invalid DIG encoder assignment.
- Incorrect logic for deciding whether or not a DIG could support a
stream would also sometimes lead to invalid DIG encoder assignment.
- Changes in encoder assignment were wholesale while updating of the
pipe backend is incremental. This would lead to the hardware state not
matching the software state even with valid encoder assignments.

[How]

The following changes fix the identified problems.
- Use stream pointer rather than stream index to track streams across
states.
- Fix DIG compatibility check by examining the link signal type rather
than the stream signal type.
- Modify assignment algorithm to make incremental updates so software
and hardware states remain coherent.

Additionally:
- Add assertions and an encoder assignment validation function
link_enc_cfg_validate() to detect potential problems with encoder
assignment closer to their root cause.
- Reduce the frequency with which the assignment algorithm is executed.
It should not be necessary for fast state validation.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix link training fallback logic
Jimmy Kizito [Wed, 25 Aug 2021 23:12:08 +0000 (19:12 -0400)]
drm/amd/display: Fix link training fallback logic

[Why]
Link training should fail if stream bandwidth exceeds link bandwidth.

[How]
Correct fallback logic and use named variables to make intention clear.
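
The condition stated above can be sketched as follows. This is not the
driver's fallback code; the function names are hypothetical, and the
sketch assumes 8b/10b channel coding, where 8 of every 10 line bits carry
payload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Payload bandwidth of a link in kbps, assuming 8b/10b channel coding.
 * lane_rate_kbps is the raw per-lane line rate.
 */
static uint64_t link_payload_kbps(uint64_t lane_rate_kbps, unsigned int lanes)
{
	assert(lanes == 1 || lanes == 2 || lanes == 4);
	return lane_rate_kbps * lanes * 8 / 10;
}

/* A link setting is only acceptable if the stream fits within it. */
static bool link_fits_stream(uint64_t stream_kbps, uint64_t lane_rate_kbps,
			     unsigned int lanes)
{
	return stream_kbps <= link_payload_kbps(lane_rate_kbps, lanes);
}
```

For instance, a 14,256,000 kbps stream fits four lanes at 5.4 Gbps per
lane (17,280,000 kbps payload) but not four lanes at 2.7 Gbps per lane
(8,640,000 kbps), so falling back to the lower rate should count as a
failure for that stream.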

Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix DCN3 B0 DP Alt Mapping
Liu, Zhan [Thu, 2 Sep 2021 19:08:29 +0000 (15:08 -0400)]
drm/amd/display: Fix DCN3 B0 DP Alt Mapping

[Why]
DCN3 B0 has a mux, which redirects PHYC and PHYD to PHYF and PHYG.

[How]
Fix DIG mapping.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Zhan Liu <Zhan.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: 3.2.153
Aric Cyr [Tue, 7 Sep 2021 00:42:16 +0000 (20:42 -0400)]
drm/amd/display: 3.2.153

Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: [FW Promotion] Release 0.0.83
Anthony Koo [Sat, 4 Sep 2021 15:58:21 +0000 (11:58 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.83

Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Extend w/a for hard hang on HPD to dcn20
Qingqing Zhuo [Wed, 25 Aug 2021 16:29:28 +0000 (12:29 -0400)]
drm/amd/display: Extend w/a for hard hang on HPD to dcn20

[Why]
HPD disable and enable sequences are not mutually exclusive on Linux.
For HPDs that span under 1s (i.e. HPD low = 1s), part of the disable
sequence (specifically, a request to SMU to lower refclk) could come
right before the call to PHY enablement, causing DMUB to access an
unresponsive PHY and thus a hard hang on the system.

[How]
Disable 48mhz refclk off when there is any HPD status in connected state
for dcn20.

Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Reduce stack size for dml21_ModeSupportAndSystemConfigurationFull
Harry Wentland [Mon, 13 Sep 2021 17:32:50 +0000 (13:32 -0400)]
drm/amd/display: Reduce stack size for dml21_ModeSupportAndSystemConfigurationFull

[Why & How]
With Werror enabled in the kernel we were failing the clang build since
dml21_ModeSupportAndSystemConfigurationFull's stack frame is 1064 when
building with clang, and exceeding the default 1024 stack frame limit.

The culprit seems to be the Pipe struct, so pull the relevant block
out into its own sub-function.

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Allocate structs needed by dcn_bw_calc_rq_dlg_ttu in pipe_ctx
Harry Wentland [Wed, 8 Sep 2021 00:22:13 +0000 (20:22 -0400)]
drm/amd/display: Allocate structs needed by dcn_bw_calc_rq_dlg_ttu in pipe_ctx

[Why & How]
dcn_bw_calc_rq_dlg_ttu uses a stack frame greater than 1024. To solve this
we could allocate the rq_param, dlg_sys_param, and input structs
dynamically. Since this function is inside a kernel_fpu_begin()/end()
call we want to avoid memory allocation. Instead it's much
safer to pre-allocate these on the pipe_ctx.

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Fixes: 3fe617ccafd6 ("Enable '-Werror' by default for all kernel builds")
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Xinhui Pan <Xinhui.Pan@amd.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: llvm@lists.linux.dev
Acked-by: Christian König <christian.koenig@amd.com>
Build-tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix rest of pass-by-value structs in DML
Harry Wentland [Wed, 8 Sep 2021 17:40:28 +0000 (13:40 -0400)]
drm/amd/display: Fix rest of pass-by-value structs in DML

Passing structs adds a lot of overhead. We don't ever want to pass
anything bigger than primitives by value.

This patch fixes these Coverity IDs:
Addresses-Coverity-ID: 1424031: ("Big parameter passed by value")
Addresses-Coverity-ID: 1424055: ("Big parameter passed by value")
Addresses-Coverity-ID: 1424072: ("Big parameter passed by value")
Addresses-Coverity-ID: 1423779: ("Big parameter passed by value")

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Xinhui Pan <Xinhui.Pan@amd.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: llvm@lists.linux.dev
Acked-by: Christian König <christian.koenig@amd.com>
Build-tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Pass all structs in display_rq_dlg_helpers by pointer
Harry Wentland [Wed, 8 Sep 2021 14:21:37 +0000 (10:21 -0400)]
drm/amd/display: Pass all structs in display_rq_dlg_helpers by pointer

Passing structs adds a lot of overhead. We don't ever want to pass
anything bigger than primitives by value.

This patch fixes these Coverity IDs:
Addresses-Coverity-ID: 1423868: ("Big parameter passed by value")
Addresses-Coverity-ID: 1423870: ("Big parameter passed by value")

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Xinhui Pan <Xinhui.Pan@amd.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: llvm@lists.linux.dev
Acked-by: Christian König <christian.koenig@amd.com>
Build-tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Pass display_pipe_params_st as const in DML
Harry Wentland [Tue, 7 Sep 2021 23:40:06 +0000 (19:40 -0400)]
drm/amd/display: Pass display_pipe_params_st as const in DML

[Why]
This neither needs to be on the stack nor passed by value
to each function call. In fact, when building with clang
it seems to break Linux's default 1024 byte stack
frame limit.

[How]
We can simply pass this as a const pointer.
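
A small illustration of the difference, with a hypothetical struct that
is much smaller than the real DML parameter blocks:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical large parameter block, for illustration only. */
struct big_params {
	double values[128];
	size_t count;
};

/* By value: the whole struct is copied for every call. */
static double sum_by_value(struct big_params p)
{
	double sum = 0.0;

	assert(p.count <= 128);
	for (size_t i = 0; i < p.count; i++)
		sum += p.values[i];
	return sum;
}

/* By const pointer: only a pointer is passed and the callee cannot modify it. */
static double sum_by_pointer(const struct big_params *p)
{
	double sum = 0.0;

	assert(p != NULL && p->count <= 128);
	for (size_t i = 0; i < p->count; i++)
		sum += p->values[i];
	return sum;
}
```

Both functions return the same result; the pointer version avoids placing
a copy of the struct in each callee's frame.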

This patch fixes these Coverity IDs
Addresses-Coverity-ID: 1424031: ("Big parameter passed by value")
Addresses-Coverity-ID: 1423970: ("Big parameter passed by value")
Addresses-Coverity-ID: 1423941: ("Big parameter passed by value")
Addresses-Coverity-ID: 1451742: ("Big parameter passed by value")
Addresses-Coverity-ID: 1451887: ("Big parameter passed by value")
Addresses-Coverity-ID: 1454146: ("Big parameter passed by value")
Addresses-Coverity-ID: 1454152: ("Big parameter passed by value")
Addresses-Coverity-ID: 1454413: ("Big parameter passed by value")
Addresses-Coverity-ID: 1466144: ("Big parameter passed by value")
Addresses-Coverity-ID: 1487237: ("Big parameter passed by value")

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Fixes: 3fe617ccafd6 ("Enable '-Werror' by default for all kernel builds")
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Xinhui Pan <Xinhui.Pan@amd.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: llvm@lists.linux.dev
Acked-by: Christian König <christian.koenig@amd.com>
Build-tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdkfd: fix dma mapping leaking warning
Philip Yang [Tue, 14 Sep 2021 20:33:40 +0000 (16:33 -0400)]
drm/amdkfd: fix dma mapping leaking warning

With xnack off, the restore work dma unmaps the previous system memory page
and dma maps the updated system memory page to update the GPU mapping. This
is not a dma mapping leak, so remove the WARN_ONCE for dma mapping leaking.

prange->dma_addr stores the VRAM page pfn after the range has migrated to
VRAM, so a VRAM page must not be dma unmapped when updating the GPU mapping
or removing the prange. Add helper svm_is_valid_dma_mapping_addr to check
for VRAM pages and error cases.

Mask out SVM_RANGE_VRAM_DOMAIN flag in dma_addr before calling amdgpu vm
update to avoid BUG_ON(*addr & 0xFFFF00000000003FULL), and set it again
immediately after. This flag is used later to tell the page type, so that
only system memory pages are dma unmapped.
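
The mask-then-restore step can be sketched in userspace as below. All
FAKE_ and fake_ names are hypothetical, and the flag position is an
assumption made for the sketch; only the 0xFFFF00000000003F mask comes
from the BUG_ON quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag position for this sketch: a low bit of the address. */
#define FAKE_VRAM_DOMAIN	(1ULL << 0)
/* Bits that must be clear in an address handed to the mapping update. */
#define FAKE_INVALID_BITS	0xFFFF00000000003FULL

/* Hypothetical consumer that rejects addresses with reserved bits set. */
static void fake_vm_update(const uint64_t *addr, size_t n)
{
	for (size_t i = 0; i < n; i++)
		assert((addr[i] & FAKE_INVALID_BITS) == 0);
}

/* Strip the flag for the call, then restore it so the page type is kept. */
static void fake_map_range(uint64_t *addr, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t flag = addr[i] & FAKE_VRAM_DOMAIN;

		addr[i] &= ~FAKE_VRAM_DOMAIN;
		fake_vm_update(&addr[i], 1);
		addr[i] |= flag;
	}
}
```

After the call each entry holds its original value, flag included, so a
later unmap pass can still tell VRAM entries from system memory entries.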

Fixes: 1d5dbfe6c06a ("drm/amdkfd: classify and map mixed svm range pages in GPU")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>