review.tizen.org Git - platform/kernel/linux-starfive.git/log

drm/amdkfd: Add migration SMI event

For migration start and end event, output timestamp when migration
starts, ends, svm range address and size, GPU id of migration source and
destination and svm range attributes,

Migration trigger could be prefetch, CPU or GPU page fault and TTM
eviction.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Add GPU recoverable fault SMI event

Use ktime_get_boottime_ns() as timestamp to correlate with other
APIs. Output timestamp when GPU recoverable fault starts and ends to
recover the fault, if migration happened or only GPU page table is
updated to recover, fault address, if read or write fault.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Enable per process SMI event

Process receive event from same process by default. Add a flag to be
able to receive event from all processes, this requires super user
permission.

Event using pid 0 to send the event to all processes, to keep the
default behavior of existing SMI events.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Add KFD SMI event IDs and triggers

Define new system management interface event IDs for migration, GPU
recoverable page fault, user queues eviction, restore and unmap from
GPU events and corresponding event triggers, those will be implemented
in the following patches.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amdgpu/gmc11: avoid cpu accessing registers to flush VM"

This reverts commit 8748de873fedf4d55bdd99bbb738ee7ddf329792
since drv enabled mes to access registers.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: enable mes to access registers v2

Enable mes to access registers.

v2: squash mes sched ring enablement flag

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/mes: add mes register access interface

Add mes register access routines:
1. read register
2. write register
3. wait register
4. write and wait register

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/mes11: add mes11 misc op

Add misc op commands in mes11.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: fix cu mask for asics with wgps

GFX10 and up have work group processors (WGP) and WGP mode is the native
compile mode.

KFD and ROCr have no visibility into whether a dispatch is operating
in CU or WGP mode.

Enforce CU masking to be pairwise continguous in enablement and
round robin distribute CUs across the SEs in a pairwise manner to
assume WGP mode at all times.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add common interface for mes misc op

Add common interface for mes misc op, including accessing register
interface.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/mes11: update mes interface for acessing registers

Update MES firmware api for accessing registers.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix documentation warning

Fixes this issue:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:5094: warning: expecting prototype for amdgpu_device_gpu_recover_imp(). Prototype was for amdgpu_device_gpu_recover() instead

Fixes: cf727044144d ("drm/amdgpu: Rename amdgpu_device_gpu_recover_imp back to amdgpu_device_gpu_recover")
Reviewed-by: Kent Russell <kent.russell@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Re-org and cleanup the redundant code

[Why]
Redundant if-else cases for repeater and non-repeater checks

[How]
Without changing the core logic, rearranged the code by removing
redundant checks

Signed-off-by: Chandan Vurdigere Nataraj <chandan.vurdigerenataraj@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: expose additional modifier for DCN32/321

[Why&How]
Some userspace expect a backwards compatible modifier on DCN32/321. For
hardware with num_pipes more than 16, we expose the most efficient
modifier first. As a fall back method, we need to expose slightly inefficient
modifier AMD_FMT_MOD_TILE_GFX9_64K_R_X after the best option.

Also set the number of packers to fixed value as required per hardware
documentation. This value is cached during hardware initialization and
can be read through the base driver.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Load TA firmware for DCN321/DCN32

[Why&How]
TA firmware is needed to enable HDCP.

Changes in v2:

Load separate firmware for PSP 13.0.0

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amd_shared.h: Add missing doc for PP_GFX_DCS_MASK

This symbol is missing documentation:

drivers/gpu/drm/amd/include/amd_shared.h:224: warning: Enum value 'PP_GFX_DCS_MASK' not described in enum 'PP_FEATURE_MASK'

Document it.

Fixes: 680602d6c2d6 ("drm/amd/pm: enable DCS")
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/amdgpu_dm: fix kernel-doc markups

There are 4 undocumented fields at struct amdgpu_display_manager.

Add documentation for them, fixing those warnings:

drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: Function parameter or member 'dmub_outbox_params' not described in 'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: Function parameter or member 'num_of_edps' not described in 'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: Function parameter or member 'disable_hpd_irq' not described in 'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: Function parameter or member 'dmub_aux_transfer_done' not described in 'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: Function parameter or member 'delayed_hpd_wq' not described in 'amdgpu_display_manager'

Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: change to_dal_irq_source_dnc32() storage class specifier to static

sparse reports
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn32/irq_service_dcn32.c:39:20: warning: symbol 'to_dal_irq_source_dcn32' was not declared. Should it be static?

to_dal_irq_source_dnc32() is only referenced in irq_service_dnc32.c, so change its
storage class specifier to static.

Fixes: 0efd4374f6b4 ("drm/amd/display: add dcn32 IRQ changes")
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Remove unused globals FORCE_RATE and FORCE_LANE_COUNT

sparse reports
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:3885:6: warning: symbol 'FORCE_RATE' was not declared. Should it be static?
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:3886:10: warning: symbol 'FORCE_LANE_COUNT' was not declared. Should it be static?

Neither of thse variables is used in dc_link_dp.c. Reviewing the commit listed in
the fixes tag shows neither was used in the original patch. So remove them.

Fixes: 265280b99822 ("drm/amd/display: add CLKMGR changes for DCN32/321")
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/display: drop set but unused variable

No longer used so drop it.

Fixes: ec457f837890 ("drm/amd/display: Drop unnecessary detect link code")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix typos in amdgpu_stop_pending_resets

Change amdggpu to amdgpu and pedning to pending

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Removed unused variable ret

Kernel test robot throws below warning ->

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:
In function 'dc_link_reduce_mst_payload':
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:3782:32:
warning: variable 'ret' set but not used [-Wunused-but-set-variable]
3782 | enum act_return_status ret;

Removed the unused ret variable.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Souptick Joarder (HPE) <jrdr.linux@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amdkfd: Free queue after unmap queue success"

This reverts commit ab8529b0cdb271d9b222cbbddb2641f3fca5df8f.

This causes KFDTest KFDMemoryTest.MemoryRegister test failed on gfx9.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/display/dc: Fix null pointer exception

We observed hard hang due to NULL derefrence This issue is seen after
running system all the time after two or three days

struct dc *dc = plane_state->ctx->dc; Randomly in long run we found
plane_state or plane_state->ctx is found NULL which causes exception.

BUG: kernel NULL pointer dereference, address: 0000000000000000
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 1dc7f2067 P4D 1dc7f2067 PUD 222c75067 PMD 0
Oops: 0000 [#1] SMP NOPTI
CPU: 5 PID: 29855 Comm: kworker/u16:4 ...
...
Workqueue: events_unbound commit_work [drm_kms_helper]
RIP: 0010:dcn10_update_pending_status+0x1f/0xee [amdgpu]
Code: 41 5f c3 0f 1f 44 00 00 b0 01 c3 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 1f 4c 8b af f8 00 00 00 48 8b 83 88 03 00 00 48 85 db <4c> 8b 20 0f 84 bf 00 00 00 48 89 fd 48 8b bf b8 00 00 00 48 8b 07
RSP: 0018:ffff942941997ab8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8d7fd98d2000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8d7e3e87c708 RDI: ffff8d7f2d8c0690
RBP: ffff8d7f2d8c0000 R08: ffff942941997a34 R09: 00000000ffffffff
R10: 0000000000005000 R11: 00000000000000f0 R12: ffff8d7f2d8c0690
R13: ffff8d8035a41680 R14: 00000000000186a0 R15: ffff8d7f2d8c1dd8
FS: 0000000000000000(0000) GS:ffff8d8037340000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000148030000 CR4: 00000000003406e0
Call Trace:
dc_commit_state+0x6a2/0x7f0 [amdgpu]
amdgpu_dm_atomic_commit_tail+0x460/0x19bb [amdgpu]

Tested-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rahul Kumar <rahul.kumar1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Follow up change to previous drm scheduler change.

Align refcount behaviour for amdgpu_job embedded HW fence with
classic pointer style HW fences by increasing refcount each
time emit is called so amdgpu code doesn't need to make workarounds
using amdgpu_job.job_run_counter to keep the HW fence refcount balanced.

Also since in the previous patch we resumed setting s_fence->parent to NULL
in drm_sched_stop switch to directly checking if job->hw_fence is
signaled to short circuit reset if already signed.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Tested-by: Yiqing Yao <yiqing.yao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/sched: Partial revert of 'drm/sched: Keep s_fence->parent pointer'

Problem:
This patch caused negative refcount as described in [1] because
for that case parent fence did not signal by the time of drm_sched_stop and hence
kept in pending list the assumption was they will not signal and
so fence was put to account for the s_fence->parent refcount but for
amdgpu which has embedded HW fence (always same parent fence)
drm_sched_fence_release_scheduled was always called and would
still drop the count for parent fence once more. For jobs that
never signaled this imbalance was masked by refcount bug in
amdgpu_fence_driver_clear_job_fences that would not drop
refcount on the fences that were removed from fence drive
fences array (against prevois insertion into the array in
get in amdgpu_fence_emit).

Fix:
Revert this patch and by setting s_job->s_fence->parent to NULL
as before prevent the extra refcount drop in amdgpu when
drm_sched_fence_release_scheduled is called on job release.

Also - align behaviour in drm_sched_resubmit_jobs_ext with that of
drm_sched_main when submitting jobs - take a refcount for the
new parent fence pointer and drop refcount for original kref_init
for new HW fence creation (or fake new HW fence in amdgpu - see next patch).

[1] - https://lore.kernel.org/all/731b7ff1-3cc9-e314-df2a-7c51b76d4db0@amd.com/t/#r00c728fcc069b1276642c325bfa9d82bf8fa21a3

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Tested-by: Yiqing Yao <yiqing.yao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Prevent race between late signaled fences and GPU reset.

Problem:
After we start handling timed out jobs we assume there fences won't be
signaled but we cannot be sure and sometimes they fire late. We need
to prevent concurrent accesses to fence array from
amdgpu_fence_driver_clear_job_fences during GPU reset and amdgpu_fence_process
from a late EOP interrupt.

Fix:
Before accessing fence array in GPU disable EOP interrupt and flush
all pending interrupt handlers for amdgpu device's interrupt line.

v2: Switch from irq_get/put to full enable/disable_irq for amdgpu

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add put fence in amdgpu_fence_driver_clear_job_fences

This function should drop the fence refcount when it extracts the
fence from the fence array, just as it's done in amdgpu_fence_process.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Remove useless amdgpu_display_freesync_ioctl() declaration

Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add mc wptr addr support for mes

MES requires mc wptr address for usermode queues.
Export bo gart address for mc wptr address.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display : Log DP link training failure reason

[Why]
Existing logs doesn't print DP LT failure reason

[How]
Update the existing log with DP LT failure reason

Signed-off-by: Chandan Vurdigere Nataraj <chandan.vurdigerenataraj@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: enable VR0 HOT support for SMU 13.0.0

Enable VR0 Hot support for SMU 13.0.0.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update GFX11 cs settings

Update GFX11 cs related settings.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/display: reduce stack size in dml32_ModeSupportAndSystemConfigurationFull()

Move more stack variable in to dummy vars structure on the heap.

Fixes stack frame size errors:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c: In function 'dml32_ModeSupportAndSystemConfigurationFull':
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:3833:1: error: the frame size of 2720 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
3833 | } // ModeSupportAndSystemConfigurationFull
| ^

Fixes: dda4fb85e433 ("drm/amd/display: DML changes for DCN32/321")
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Rodrigo Siqueira Jordao <Rodrigo.Siqueira@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amd/display: ignore modifiers when checking for format support"

This reverts commit 5089c4a8ebea3c3ad9eedf038dad7098ebc06131.

This breaks validation and enumeration of display capable modifiers.

The early return true means the rest of the validation code never gets
executed, and we need that to enumerate the right modifiers to userspace
for the format.

The modifiers that are in the initial list generated for a plane are the
superset for all formats and we need the proper checks in this function
to filter some of them out for formats with which they're invalid to be
used.

Furthermore, the safety contract here is that we validate the incoming
modifiers to ensure the kernel can handle them and the display hardware
can handle them. This includes e.g. rejecting multi-plane images with DCC.

Note that the legacy swizzle mechanism allows encoding more swizzles, and
at fb creation time we convert them to modifiers and reject those with
no corresponding modifiers. If we are seeing rejections I'm happy to
help define modifiers that correspond to those, or if absolutely needed
implement a fallback path to allow for less strict validation of the
legacy path.

However, I'd like to revert this patch, since any of these is going to
be a significant rework of the patch, and I'd rather not the regression
gets into a release or forgotten in the meantime.

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/display: Fix spelling mistake "supporing" -> "supporting"

There is a spelling mistake in a dml_print message. Fix it.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Fix spelling mistake "mechanim" -> "mechanism"

There is a spelling mistake in a pr_debug message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amdgpu/display: set vblank_disable_immediate for DC"

This reverts commit 92020e81ddbeac351ea4a19bcf01743f32b9c800.

This causes stuttering and timeouts with DMCUB for some users
so revert it until we understand why and safely enable it
to save power.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1887
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>

drm/amdgpu: drop unexpected word 'for' in comments

there is an unexpected word 'for' in the comments that need to be dropped

file - drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
line - 245

* position and also advance the position for for Vega10

changed to:

* position and also advance the position for Vega10

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix indentation in dcn32_get_vco_frequency_from_reg()

Clang warns:

  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c:549:4: warning: misleading indentation; statement is not part of the previous 'else' [-Wmisleading-indentation]
                          pll_req = dc_fixpt_from_int(pll_req_reg & clk_mgr->clk_mgr_mask->FbMult_int);
                          ^
  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c:542:3: note: previous statement is here
                  else
                  ^
  1 warning generated.

Indent this statement to the left, as it was clearly intended to be
called unconditionally, which will fix the warning.

Link: https://github.com/ClangBuiltLinux/linux/issues/1655
Fixes: 3e838f7ccf64 ("drm/amd/display: Get VCO frequency from registers")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Update mes_v11_api_def.h

Update MES API to support oversubscription without aggregated doorbell
for usermode queues.

v2: Change oversubscription_no_aggregated_en to is_kfd_process (align
with MES)

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Enable GFX11 usermode queue oversubscription

Starting with GFX11, MES requires wptr BOs to be GTT allocated/mapped to
GART for usermode queues in order to support oversubscription. In the
case that work is submitted to an unmapped queue, MES must have a GART
wptr address to determine whether the queue should be mapped.

This change is accompanied with changes in MES and is applicable for
MES_API_VERSION >= 2.

v3:
- Use amdgpu_vm_bo_lookup_mapping for wptr_bo mapping lookup
- Move wptr_bo refcount increment to amdgpu_amdkfd_map_gtt_bo_to_gart
- Remove list_del_init from amdgpu_amdkfd_map_gtt_bo_to_gart
- Cleanup/fix create_queue wptr_bo error handling
v4:
- Add MES version shift/mask defines to amdgpu_mes.h
- Change version check from MES_VERSION to MES_API_VERSION
- Add check in kfd_ioctl_create_queue before wptr bo pin/GART map to
ensure bo is a single page.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fetch MES scheduler/KIQ versions

Store MES scheduler and MES KIQ version numbers in amdgpu_mes for GFX11.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: To flush tlb for MMHUB of RAVEN series

amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:8 pasid:32769, for process test_basic pid 3305 thread test_basic pid 3305)
amdgpu: in page starting at address 0x00007ff990003000 from IH client 0x12 (VMC)
amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00840051
amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
amdgpu: MORE_FAULTS: 0x1
amdgpu: WALKER_ERROR: 0x0
amdgpu: PERMISSION_FAULTS: 0x5
amdgpu: MAPPING_ERROR: 0x0
amdgpu: RW: 0x1

When memory is allocated by kfd, no one triggers the tlb flush for MMHUB0.
There is page fault from MMHUB0.

v2:fix indentation
v3:change subject and fix indentation

Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

gpu/drm/radeon: Fix typo in comments

Remove the repeated word 'and' from comments

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/fourcc: fix integer type usage in uapi header

Kernel uapi headers are supposed to use __[us]{8,16,32,64} types defined
by <linux/types.h> as opposed to 'uint32_t' and similar. See [1] for the
relevant discussion about this topic. In this particular case, the usage
of 'uint64_t' escaped headers_check as these macros are not being called
here. However, the following program triggers a compilation error:

  #include <drm/drm_fourcc.h>

  int main()
  {
   unsigned long x = AMD_FMT_MOD_CLEAR(RB);
   return 0;
  }

gcc error:
  drm.c:5:27: error: ‘uint64_t’ undeclared (first use in this function)
      5 |         unsigned long x = AMD_FMT_MOD_CLEAR(RB);
        |                           ^~~~~~~~~~~~~~~~~

This patch changes AMD_FMT_MOD_{SET,CLEAR} macros to use the correct
integer types, which fixes the above issue.

  [1] https://lkml.org/lkml/2019/6/5/18

Fixes: 8ba16d599374 ("drm/fourcc: Add AMD DRM modifiers.")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: fix no previous prototype warning

Declare 'static', as the function is not intended to be used
outside of this translation unit.

Fixes: 4ed49c954e35 ("drm/amdgpu/vcn: add unified queue ib test")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

amdgpu/pm: Fix possible array out-of-bounds if SCLK levels != 2

[v2]
simplified fix after Lijo's feedback
removed clocks.num_levels from calculation of loop count
   removed unsafe accesses to shim table freq_values
retained corner case output only min,now if
   clocks.num_levels == 1 && now > min

[v1]
added a check to populate and use SCLK shim table freq_values only
   if using dpm_level == AMD_DPM_FORCED_LEVEL_MANUAL or
                         AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM
removed clocks.num_levels from calculation of shim table size
removed unsafe accesses to shim table freq_values
   output gfx_table values if using other dpm levels
added check for freq_match when using freq_values for when now == min_clk

== Test ==
LOGFILE=aldebaran-sclk.test.log
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print $9}'`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}

lspci -nn | grep "VGA\|Display"  > $LOGFILE
FILES="pp_od_clk_voltage
pp_dpm_sclk"

for f in $FILES
do
  echo === $f === >> $LOGFILE
  cat $HWMON_DIR/device/$f >> $LOGFILE
done
cat $LOGFILE

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

amdgpu/pm: Fix incorrect variable for size of clocks array

[v2]
No Changes, added RB
[v1]
Size of pp_clock_levels_with_latency is PP_MAX_CLOCK_LEVELS, not MAX_NUM_CLOCKS.
Both are currently defined as 16, modifying in case one value is modified in future
Changed code in both arcturus and aldabaran.

Also removed unneeded var count, and used min_t function

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Free queue after unmap queue success

After queue unmap or remove from MES successfully, free queue sysfs
entries, doorbell and remove from queue list. Otherwise, application may
destroy queue again, cause below kernel warning or crash backtrace.

For outstanding queues, either application forget to destroy or failed
to destroy, kfd_process_notifier_release will remove queue sysfs
entries, kfd_process_wq_release will free queue doorbell.

v2: decrement_queue_count for MES queue

refcount_t: underflow; use-after-free.
WARNING: CPU: 7 PID: 3053 at lib/refcount.c:28
  Call Trace:
   kobject_put+0xd6/0x1a0
   kfd_procfs_del_queue+0x27/0x30 [amdgpu]
   pqm_destroy_queue+0xeb/0x240 [amdgpu]
   kfd_ioctl_destroy_queue+0x32/0x70 [amdgpu]
   kfd_ioctl+0x27d/0x500 [amdgpu]
   do_syscall_64+0x35/0x80

WARNING: CPU: 2 PID: 3053 at drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:400
  Call Trace:
   deallocate_doorbell.isra.0+0x39/0x40 [amdgpu]
   destroy_queue_cpsch+0xb3/0x270 [amdgpu]
   pqm_destroy_queue+0x108/0x240 [amdgpu]
   kfd_ioctl_destroy_queue+0x32/0x70 [amdgpu]
   kfd_ioctl+0x27d/0x500 [amdgpu]

general protection fault, probably for non-canonical address
0xdead000000000108:
Call Trace:
  pqm_destroy_queue+0xf0/0x200 [amdgpu]
  kfd_ioctl_destroy_queue+0x2f/0x60 [amdgpu]
  kfd_ioctl+0x19b/0x600 [amdgpu]

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Graham Sider <Graham.Sider@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Add queue to MES if it becomes active

We remove the user queue from MES scheduler to update queue properties.
If the queue becomes active after updating, add the user queue to MES
scheduler, to be able to handle command packet submission.

v2: don't break pqm_set_gws

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Graham Sider <Graham.Sider@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: fix incorrect comparison in DML

[Why&How]
GCC 12 catches the following incorrect comparison in the if arm

drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml/dcn32/display_mode_vba_32.c: In function ‘dml32_ModeSupportAndSystemConfigurationFull’:
drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml/dcn32/display_mode_vba_32.c:3740:33: error: the comparison will always evaluate as ‘true’ for the address of ‘USRRetrainingSupport’ will never be NULL [-Werror=address]
3740 | || &mode_lib->vba.USRRetrainingSupport[i][j])) {
| ^~
In file included from ./drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml/display_mode_lib.h:32,
from ./drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dc.h:45,
from drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml/dcn32/display_mode_vba_32.c:30:
./drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml/display_mode_vba.h:1175:14: note: ‘USRRetrainingSupport’ declared here
1175 | bool USRRetrainingSupport[DC__VOLTAGE_STATES][2];
|

Fix this by remove preceding & so that value is compared instead of
address

Fixes: dda4fb85e433 ("drm/amd/display: DML changes for DCN32/321")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: fix array index in DML

[Why&How]
When the a 3d array is used by indexing with only one dimension in an if
condition, the addresses get compared instead of the intended value stored in the
array. GCC 12.1 caught this error:

drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml/dcn32/display_mode_vba_32.c: In function ‘DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation’:
drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml/dcn32/display_mode_vba_32.c:1007:45: error: the comparison will always evaluate as ‘true’ for the address of ‘use_one_row_for_frame_flip’ will never be NULL [-Werror=address]
1007 | if (v->use_one_row_for_frame_flip[k]) {
| ^
In file included from ./drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml/display_mode_lib.h:32,
from ./drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dc.h:45,
from drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml/dcn32/display_mode_vba_32.c:30:
./drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml/display_mode_vba.h:605:14: note: ‘use_one_row_for_frame_flip’ declared here
605 | bool use_one_row_for_frame_flip[DC__VOLTAGE_STATES][2][DC__NUM_DPP__MAX];
|

Fix this by explicitly specifying the last two indices.

Fixes: dda4fb85e433 ("drm/amd/display: DML changes for DCN32/321")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: vm - drop unexpected word "the" in the comments

there is an unexpected word "the" in the comments that need to be dropped

file: drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
line: 57
* the kernel tells the the ring what VMID to use for that command
changed to
* the kernel tells the ring what VMID to use for that command

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix adev variable used in amdgpu_device_gpu_recover()

Use the correct adev variable for the drm_fb_helper in
amdgpu_device_gpu_recover(). Noticed by inspection.

Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: Drop CONFIG_BACKLIGHT_CLASS_DEVICE ifdefs

The DRM_RADEON Kconfig code contains:

select BACKLIGHT_CLASS_DEVICE

So the condition these ifdefs test for is always true, drop them.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: correct sdma queue number of sdma 6.0.1

sdma 6.0.1 has 8 queues instead of 2.

Fixes: 26776a7031c423 ("drm/amdkfd: add GC 11.0.1 KFD support")
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Drop CONFIG_BACKLIGHT_CLASS_DEVICE ifdefs

The DRM_AMDGPU Kconfig code contains:

select BACKLIGHT_CLASS_DEVICE

So the condition these ifdefs test for is always true, drop them.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

amd/display/dc: Fix COLOR_ENCODING and COLOR_RANGE doing nothing for DCN20+

For DCN20 and above, the code that actually hooks up the provided
input_color_space got lost at some point.

Fixes COLOR_ENCODING and COLOR_RANGE doing nothing on DCN20+.
Tested using Steam Remote Play Together + gamescope.

Update other DCNs the same wasy DCN1.x was updates in
commit a1e07ba89d49 ("drm/amd/display: Use plane->color_space for dpp if specified")

Fixes: a1e07ba89d49 ("drm/amd/display: Use plane->color_space for dpp if specified")
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: 3.2.191

This DC patchset brings improvements in multiple areas. In summary, we
highlight:

- Remove unnecessary code;
- Small fixes (compilation warnings, typos, etc);
- Improvements in the DPMS code;
- Fix eDP issues
- Improvements in the MST code

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Drop duplicate define

We already have DALSMC_MSG_TransferTableDram2Smu in the file dalsmc.h;
for this reason, we don't need this definition in the smu msg file.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Update hook dcn32_funcs

In DCN32 clk hook functions, we are using the wrong reference for
get_dp_ref_clk_frequency and missing the get_dtb_ref_clk_frequency
reference. This commit adds those references.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Implement a pme workaround function

[Why]
For DCN32 we do not have a pme workaround function defined that sends a
BacoAudio message. Default code had uses the DCN30 function for pme
workaround. PMFW headers are inconsistent with their message ID
definitions which cause ID's to clash leading to inconsistent system
behaviour. There is a clash with FCLK message due to inconsitent PMFW
headers.

[How]
Implement a new BacoAudio function to workaround the problem of
inconsistent PMFW headers in order to avoid BacoAudio message clasing
with FCLK Enable message.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Get VCO frequency from registers

Add support to get VCO frequency from registers.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Update SW state correctly for FCLK

FCLK not supported for DCN321, but still need to update the software
state accordingly to prevent unneeded full updates in driver

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix divide-by-zero in DPPCLK and DISPCLK calculation

[Why]
Certain use cases will pass in zero in the new_clocks parameter for all
clocks. This results in a divide-by-zero error when attempting to round
up the new clock.

When new_clocks are zero, no rounding is required, so we can skip it.

[How]
Guard the division calculation with a check to make sure clocks are not
zero.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Update DPPCLK programming sequence

[Description]
- When lowering DPPCLK, we want to program the DPP DTO before updating
the DPP refclk.
- Also update DPPCLK to the exact frequency that will be set after clock
divider has been programmed. This will prevent rounding errors when
making the request to PMFW (we need DPP DTO to match exactly with the
exact DPP refclk).

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Check minimum disp_clk and dpp_clk debug option

Our debug struct has the min_disp_clk_khz and min_dpp_clk_khz options,
which we ignore in the DCN32. This commit introduces those checks and
the necessary calculation.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix in dp link-training when updating payload allocation table

[Why & How]
Check if aux is not accessible before updating payload allocation table.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: extract update stream allocation to link_hwss

[Why & How]
Extract update stream allocation table into link hwss as part of the
link hwss refactor work.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: George Shen <George.Shen@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Remove unused vendor specific w/a

[Why & How]
Old vendor specific w/a are no longer needed and unused. Clean up
codebase by removing them.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Handle downstream LTTPR with fixed VS sequence

[Why]
Several issues were discovered that caused link
training to fail when an LTTPR device is
connected downstream for the fixed VS sequence.

[How]
The following were added:
- workaround to configure AUX timeout
for fixed VS sequence
- additional delay before disabling
fixed VS intercept
- detection of fixed VS deadlock state and
performing DPCD sequence to recover

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Meenakshikumar Somasundaram <Meenakshikumar.Somasundaram@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: George Shen <George.Shen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix typo in override_lane_settings

[Why]
The function currently skips overriding the drive
settings of the first lane.

[How]
Change for loop to start at 0 instead of 1.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Change initializer to single brace

[Why & How]
Change struct initializer from multiple brace to single brace.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: rename lane_settings to hw_lane_settings

[why]
This is one of the major steps to decouple hw lane settings
from dpcd lane settings.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix in overriding DP drive settings

[Why & How]
Check always_match_dpcd_with_hw_lane_settings bit before
overriding the DP drive settings

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Enrich the log in MST payload update

[Why & How]
Enrich the log to provide more informatio in MST payload update.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Ariel Bernstein <Eric.Bernstein@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Change HDMI judgement condition.

[Why & How]
Use dc_is_hdmi_signal to determine signal type.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: JinZe.Xu <JinZe.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix DC warning at driver load

[Why]
Wrong index was checked for dcfclk_mhz, causing false warning.

[How]
Fix the assertion index.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add SMU logging code

[WHY]
Logging for SMU response value after the wait allows us to know
immediately what the response value was. Makes it easier to debug should
the value be anything other than OK.

[HOW]
Using the the already available DC SMU logging functions.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Saaem Rizvi <SyedSaaem.Rizvi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Turn off internal backlight when plugging external monitor

[why]
For VG, we want to turn off power/backlight of the intenral panel when
plugging in external monitor and going to "external monitor only" mode.

[how]
For turning off power of the internal panel, ignore the config flag whic
bypasses power sequencing for eDP panels.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Sung Joon Kim <sungkim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix eDP not light up on resume

[why]
Only on VG, if external display is disconnected during S3 suspend, the
internal panel doesn't light up on resume because we set the power state
using an unsupported DPCD register SET_POWER. To check the register is
supported, we need to check SET_POWER_CAPABLE first which is
eDP-specific DPCD register field.

[how]
Check the SET_POWER_CAPABLE register field and decide the control of the
eDP power state based on the read register value.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Agustin Gutierrez <Agustin.Gutierrez@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Sung Joon Kim <sungkim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: add mst port output bw check

[Why]
when connect one 4k@144hz dp to dsc mst hub, 4k@144hz mode is in valid
mode list. but some mst hub port output bandwidth does not support
4k@144hz.

[How]
add mst port output bandwidth checks, include full_pbn, branch max
throughput mps.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: hersen wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Drop unnecessary detect link code

Delete unnecessary codes in detect_link_and_local_sink. We already have
correct stop logic in dc_link_detect.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Ian Chen <ian.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Take emulated dc_sink into account for HDCP

[Why]
While updating the config of hdcp, we use the sink_singal type of the
dc_sink to decide the HDCP operation mode. However, it doesn't consider
the case when the sink is a emulated one.

[How]
Take dc_em_sink into account while updating HDCP config.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Release remote dc_sink under mst scenario

[Why]
Observe that we have several problems while releasing remote dc_sink
under mst cases.

- When unplug mst branch device from the source, we now try to free all
  remote dc_sinks in dm_helpers_dp_mst_stop_top_mgr(). However, there are
  bugs while we're releasing dc_sinks here. First of all,
  link->remote_sinks[] array get shuffled within
  dc_link_remove_remote_sink(). As the result, increasing the array index
  within the releasing loop is wrong. Secondly, it tries to call
  dc_sink_release() to release the dc_sink of the same aconnector every
  time in the loop. Which can't release dc_sink of all aconnector in the
  mst topology.
- There is no code path for us to release remote dc_sink for disconnected
  sst monitor which unplug event is notified by CSN sideband message. Which
  means we'll use stale dc_sink data to represent later on connected
  monitor. Also, has chance to break the maximum remote dc_sink number
  constraint.

[How]
Distinguish unplug event of mst scenario into 2 cases.

* Unplug sst/legacy stream sink off the mst topology
- Release related remote dc_sink in detec_ctx().

* Unplug mst branch device off the mst topology
- Release related remote dc_sink in early_unregister()

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Revert "drm/amd/display: turn DPMS off on connector unplug"

This reverts commit 3c4d55c9b9becedd8d31a7c96783a364533713ab.

Revert the commit because:
- It's incomplete of the function dm_set_dpms_off() for mst case.  For
  stream sinks whithin the same mst topology, they share the same dc_link.
  dm_set_dpms_off() tries to update one mst stream only which is
  incomplete.
- Setting dpms off should be triggered by usermode. Besdies, it seems
  usermode does release relevant resource for mst & non-mst case when
  unplug connecotr now.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Revert "drm/amd/display: Add flag to detect dpms force off during HPD"

This reverts commit 035f54969bb2c1a5ced52f43e4ef393e0c0f6bfa.

The reverted commit was trying to fix side effect brought by
commit 3c4d55c9b9be ("drm/amd/display: turn DPMS off on connector unplug")

However,
* This reverted commit will have mst case never call dm_set_dpms_off()
  which conflicts the idea of original commit 3c4d55c9b9be ("drm/amd/display: turn DPMS off on connector unplug")
  That's due to dm_crtc_state is always null since the input parameter
  aconnector is the root device (source) of mst topology.  It's not an
  end stream sink within the mst topology.
* Setting dpms off should be triggered by usermode. Besdies, it seems
  usermode does release relevant resource for mst & non-mst case when
  unplug connecotr now. Which means we no longer need both commits now:
  commit 3c4d55c9b9be ("drm/amd/display: turn DPMS off on connector unplug")
  commit 035f54969bb2 ("drm/amd/display: Add flag to detect dpms force off during HPD")

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Revert "drm/amd/display: keep eDP Vdd on when eDP stream is already enabled"

A variety of Lenovo machines with Rembrandt APUs and OLED panels have
stopped showing the display at login. This behavior clears up after
leaving it idle and moving the mouse or touching keyboard.

It was bisected to be caused by commit 559e2655220d ("drm/amd/display:
keep eDP Vdd on when eDP stream is already enabled"). Revert this commit
to fix the issue.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2047
Reported-by: Aaron Ma <aaron.ma@canonical.com>
Fixes: 559e2655220d ("drm/amd/display: keep eDP Vdd on when eDP stream is already enabled")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mark Pearson <markpearson@lenovo.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Remove compiler warning

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Cruise Hung <Cruise.Hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add LSDMA block for LSDMA v6.0.1

This patch adds LSDMA ip block for LSDMA v6.0.1.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: add missing reg defs for DCN3x HUBBUB

[Why&How]
The omitted register definition caused call traces like:

[    3.811215] WARNING: CPU: 7 PID: 794 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:120 set_reg_field_values.constprop.0+0xc7/0xe0 [amdgpu]
[    3.811406] Modules linked in: amdgpu(+) drm_ttm_helper ttm iommu_v2 gpu_sched drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm i2c_piix4 drm_panel_orientation_quirks
[    3.811419] CPU: 7 PID: 794 Comm: systemd-udevd Not tainted 5.16.0-kfd+ #132
[    3.811422] Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 3003 12/09/2019
[    3.811425] RIP: 0010:set_reg_field_values.constprop.0+0xc7/0xe0 [amdgpu]
[    3.811615] Code: 08 49 89 51 08 8b 08 48 8d 42 08 49 89 41 08 44 8b 02 48 8d 50 08 0f b6 c9 49 89 51 08 8b 00 45 85 c0 75 b3 0f 0b eb af 5d c3 <0f> 0b e9 48 ff ff ff 49 8b 51 08 eb d0 49 8b 41 08 eb d5 66 0f 1f
[    3.811619] RSP: 0018:ffffb8c1c04cf640 EFLAGS: 00010246
[    3.811621] RAX: 0000000000000000 RBX: ffff96f2100d8800 RCX: 0000000000000000
[    3.811623] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffb8c1c04cf650
[    3.811625] RBP: ffffb8c1c04cf640 R08: 000000000000047f R09: ffffb8c1c04cf658
[    3.811627] R10: ffff96f5161ff000 R11: ffff96f5161ff000 R12: ffff96f204afb9c0
[    3.811629] R13: 0000000000000000 R14: ffff96f202b94c00 R15: ffffb8c1c04cf718
[    3.811631] FS:  00007fe07c2e2880(0000) GS:ffff96f5059c0000(0000) knlGS:0000000000000000
[    3.811634] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.811636] CR2: 0000559634ab57b8 CR3: 0000000120674000 CR4: 00000000003506e0
[    3.811637] Call Trace:
[    3.811640]  <TASK>
[    3.811642]  generic_reg_update_ex+0x69/0x200 [amdgpu]
[    3.811831]  ? _printk+0x58/0x6f
[    3.811836]  dcn32_init_crb+0x18f/0x1b0 [amdgpu]
[    3.812031]  dcn32_init_hw+0x379/0x6a0 [amdgpu]
[    3.812223]  dc_hardware_init+0xba/0x100 [amdgpu]
[    3.812415]  amdgpu_dm_init.isra.0.cold+0x166/0x1867 [amdgpu]
[    3.812616]  ? dev_vprintk_emit+0x139/0x15d
[    3.812621]  ? dev_printk_emit+0x4e/0x65
[    3.812624]  dm_hw_init+0x12/0x30 [amdgpu]
[    3.812820]  amdgpu_device_init.cold+0x130d/0x178c [amdgpu]
[    3.813017]  ? pci_read_config_word+0x25/0x40
[    3.813021]  amdgpu_driver_load_kms+0x1a/0x130 [amdgpu]
[    3.813178]  amdgpu_pci_probe+0x130/0x330 [amdgpu]

Fixes: 4f29f9cf092b ("drm/amd: add register headers for DCN32/321")
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Don't show warning on reading vbios values for SMU13 3.1

Some APUs with SMU13 are showing the following message:
`amdgpu 0000:63:00.0: amdgpu: Unexpected and unhandled version: 3.1`

This warning isn't relevant for smu info 3.1, as no bootup information
is present in the table.

Fixes: 593a54f18031 ("drm/amd/pm: correct the way for retrieving bootup clocks")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: skip to set mp1 unload state in special case

set mp1 unload state will cause the SMC FW can't accept any SMU message,
skip to set mp1 unload state to avoid following case fail:
- runtime pm case.
- gpu reset case.

Fixes: 72aeb6ee0c78 ("drm/amd/pm: fix driver reload SMC firmware fail issue for smu13")
Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc11: avoid cpu accessing registers to flush VM

Due to gfxoff on, cpu accessing registers is not expected.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/pm: adjust EccInfo_t struct

The EccInfo_t struct in driver_if.h is as below in official release
verion 68.55.0
typedef struct {
   uint64_t mca_umc_status;
   uint64_t mca_umc_addr;

   uint16_t ce_count_lo_chip;
   uint16_t ce_count_hi_chip;

   uint32_t eccPadding;

   uint64_t mca_ceumc_addr;
} EccInfo_t;
It's different from the debug version druing develop print correctable
error address, so adjust EccInfo_t struct.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Adjust logic around GTT size (v3)

Certain GL unit tests for large textures can cause problems
with the OOM killer since there is no way to link this memory
to a process. This was originally mitigated (but not necessarily
eliminated) by limiting the GTT size. The problem is this limit
is often too low for many modern games so just make the limit 1/2
of system memory. The OOM accounting needs to be addressed, but
we shouldn't prevent common 3D applications from being usable
just to potentially mitigate that corner case.

Set default GTT size to max(3G, 1/2 of system ram) by default.

v2: drop previous logic and default to 3/4 of ram
v3: default to half of ram to align with ttm
v4: fix spelling in comment (Kent)

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1942
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: fix incorrrect SPDX-License-Identifiers

radeon is MIT. This were incorrectly changed in
commit b24413180f56 ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license")
and
commit d198b34f3855 (".gitignore: add SPDX License Identifier")
and:
commit ec8f24b7faaf ("treewide: Add SPDX license identifier - Makefile/Kconfig")

Fixes: d198b34f3855 (".gitignore: add SPDX License Identifier")
Fixes: ec8f24b7faaf ("treewide: Add SPDX license identifier - Makefile/Kconfig")
Fixes: b24413180f56 ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2053
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Remove break for VMID loop TLB flush on MES

Loop through all VMIDs for gmc_v10_0_flush_gpu_tlb_pasid and
gmc_v11_0_flush_gpu_tlb_pasid (only if using MES for gmc_v10). This is
required for MES due to use_different_vmid_compute causing SDMA queues
to be assigned different VMIDs than compute for the same PASID.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: adjust unified queue code format

Fixed some errors and warnings found by checkpatch.pl.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>