Evan Quan [Fri, 14 Aug 2020 08:34:15 +0000 (16:34 +0800)]
drm/amd/pm: drop redundant MEM_TYPE_* macros
These macros are already defined in amdgpu_atombios.h. Keeping the
duplicates may trigger a "redefined" compile warning.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 14 Aug 2020 04:15:44 +0000 (12:15 +0800)]
drm/amd/powerplay: suppress the kernel test robot warning
Suppress the warning below:
In file included from drivers/gpu/drm/amd/amdgpu/../powerplay/smu_cmn.c:
>> drivers/gpu/drm/amd/powerplay/smu_cmn.c:485:9: warning: Identical condition 'ret', second condition is always false [identicalConditionAfterEarlyExit]
return ret;
^
drivers/gpu/drm/amd/powerplay/smu_cmn.c:477:6: note: first condition
if (ret)
^
drivers/gpu/drm/amd/powerplay/smu_cmn.c:485:9: note: second condition
return ret;
^
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
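The warning quoted in the commit above is typically triggered by code that tests a value, exits early when it is non-zero, and later returns the same value. The following is a minimal sketch of that pattern and one possible cleanup. The helper do_step() and both function names are hypothetical and do not come from smu_cmn.c; the actual patch may differ.

```c
/* Hypothetical helper standing in for a call that can fail. */
static int do_step(int input)
{
    return input < 0 ? -1 : 0;
}

/* Pattern a static checker may flag: after the early exit,
 * 'ret' is known to be 0, so 'return ret' always returns 0. */
static int flagged(int input)
{
    int ret = do_step(input);

    if (ret)
        return ret;

    return ret;
}

/* Equivalent behaviour, with the success path made explicit. */
static int cleaned(int input)
{
    int ret = do_step(input);

    if (ret)
        return ret;

    return 0;
}
```

Both functions return the same values; the second form only states the success path explicitly.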
Guchun Chen [Thu, 13 Aug 2020 07:00:56 +0000 (15:00 +0800)]
drm/amdgpu: guard ras debugfs creation/removal based on CONFIG_DEBUG_FS
This avoids potential build warnings/errors when
CONFIG_DEBUG_FS is not set.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
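One common way to guard optional debugfs code is to compile the helpers and their call sites only when the config symbol is defined. The sketch below assumes that approach; the struct, the helper names, and the counter are hypothetical, and CONFIG_DEBUG_FS is defined locally only so the example builds outside a kernel tree.

```c
/* Assumption: in a real kernel build CONFIG_DEBUG_FS comes from Kconfig;
 * it is defined here only so the sketch compiles on its own. */
#define CONFIG_DEBUG_FS 1

struct ras_ctx {
    int debugfs_nodes; /* hypothetical counter standing in for real entries */
};

#if defined(CONFIG_DEBUG_FS)
static void ras_debugfs_create_all(struct ras_ctx *ctx)
{
    ctx->debugfs_nodes = 3;
}

static void ras_debugfs_remove_all(struct ras_ctx *ctx)
{
    ctx->debugfs_nodes = 0;
}
#endif

static void ras_init(struct ras_ctx *ctx)
{
#if defined(CONFIG_DEBUG_FS)
    ras_debugfs_create_all(ctx);
#else
    (void)ctx; /* nothing to create without debugfs */
#endif
}

static void ras_fini(struct ras_ctx *ctx)
{
#if defined(CONFIG_DEBUG_FS)
    ras_debugfs_remove_all(ctx);
#else
    (void)ctx; /* nothing to remove without debugfs */
#endif
}
```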
Guchun Chen [Thu, 13 Aug 2020 06:35:35 +0000 (14:35 +0800)]
drm/amdgpu: fix NULL pointer access issue when unloading driver
When unloading the driver with "modprobe -r amdgpu", a NULL pointer
dereference occurs while releasing the ras debugfs entries. The cause is a
duplicated debugfs_remove, as the drm debugfs_root dir has already been
cleaned up by drm_minor_unregister.
BUG: kernel NULL pointer dereference, address: 00000000000000a0
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 11 PID: 1526 Comm: modprobe Tainted: G OE 5.6.0-guchchen #1
Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018
RIP: 0010:down_write+0x15/0x40
Code: eb de e8 7e 17 72 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 53 48 89 fb e8 92
d8 ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 43 08 5b c3
RSP: 0018:ffffb1590386fcd0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff85b2fcc2 RDI: 00000000000000a0
RBP: ffffb1590386fd30 R08: ffffffff85b2fcc2 R09: 000000000002b3c0
R10: ffff97a330618c40 R11: 00000000000005f6 R12: ffff97a3481beb40
R13: 00000000000000a0 R14: ffff97a3481beb40 R15: 0000000000000000
FS: 00007fb11a717540(0000) GS:ffff97a376cc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a0 CR3: 00000004066d6006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
simple_recursive_removal+0x63/0x370
? debugfs_remove+0x60/0x60
debugfs_remove+0x40/0x60
amdgpu_ras_fini+0x82/0x230 [amdgpu]
? __kernfs_remove.part.17+0x101/0x1f0
? kernfs_name_hash+0x12/0x80
amdgpu_device_fini+0x1c0/0x580 [amdgpu]
amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu]
amdgpu_pci_remove+0x36/0x60 [amdgpu]
pci_device_remove+0x3b/0xb0
device_release_driver_internal+0xe5/0x1c0
driver_detach+0x46/0x90
bus_remove_driver+0x58/0xd0
pci_unregister_driver+0x29/0x90
amdgpu_exit+0x11/0x25 [amdgpu]
__x64_sys_delete_module+0x13d/0x210
do_syscall_64+0x5f/0x250
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
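The oops above comes from removing an entry that a parent teardown had already released. The commit text does not show the actual change, so the following is only a user-space sketch of one defensive pattern against double removal: clear the stored handle after the first removal so a second call becomes a no-op. All names here are hypothetical stand-ins, not the debugfs API.

```c
#include <stdlib.h>

struct node {
    int id;              /* hypothetical stand-in for a debugfs entry */
};

struct owner {
    struct node *dir;    /* handle kept by the owning component */
};

static int removals;     /* counts real removals, for illustration only */

static void node_remove(struct node *n)
{
    if (!n)
        return;          /* already gone: nothing to do */
    free(n);
    removals++;
}

static void owner_fini(struct owner *o)
{
    node_remove(o->dir);
    o->dir = NULL;       /* a repeated call now hits the NULL check above */
}
```

Calling owner_fini() twice on the same owner releases the node only once.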
Christian König [Wed, 12 Aug 2020 15:48:26 +0000 (17:48 +0200)]
drm/amdgpu: revert "fix system hang issue during GPU reset"
The whole approach wasn't thought through to the end.
We already had a reset lock like this in the past and it caused the same problems as this one.
Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.
This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Wed, 12 Aug 2020 04:37:03 +0000 (12:37 +0800)]
drm/amd/powerplay: enable Sienna Cichlid mgpu fan boost feature
Support Sienna Cichlid mgpu fan boost enablement.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Wed, 12 Aug 2020 04:29:16 +0000 (12:29 +0800)]
drm/amd/powerplay: enable Navi1X mgpu fan boost feature(V2)
Support Navi1X mgpu fan boost enablement.
V2: enrich the comment and correct the revision id check
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Wed, 12 Aug 2020 04:08:56 +0000 (12:08 +0800)]
drm/amd/powerplay: enable swSMU mgpu fan boost support
Enable mgpu fan boost feature on swSMU routines.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Wed, 12 Aug 2020 03:53:47 +0000 (11:53 +0800)]
drm/amd/powerplay: optimize the interface for mgpu fan boost enablement
Hide the implementation details from callers outside the power code. This
also prepares for expanding the interface to swSMU.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kevin Wang [Thu, 6 Aug 2020 15:41:47 +0000 (23:41 +0800)]
drm/amdgpu: fix uninit-value in arcturus_log_thermal_throttling_event()
When arcturus_get_smu_metrics_data() fails, the variable
"throttler_status" is used without being initialized.
warning:
powerplay/arcturus_ppt.c:2268:24: warning: ‘throttler_status’ may be used uninitialized in this function [-Wmaybe-uninitialized]
2268 | if (throttler_status & logging_label[throttler_idx].feature_mask) {
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
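A "may be used uninitialized" warning of this kind usually goes away once the variable is given an initial value and the failing call's return code is checked before the value is read. The sketch below illustrates that idea in plain C; get_metrics() and count_throttlers() are hypothetical and are not the functions in arcturus_ppt.c.

```c
#include <stdint.h>

/* Hypothetical stand-in for a metrics query that can fail
 * without writing to *value. */
static int get_metrics(int fail, uint32_t *value)
{
    if (fail)
        return -1;
    *value = 0x5;    /* bits 0 and 2 set */
    return 0;
}

/* Returns the number of set throttler bits, or 0 if the query failed. */
static int count_throttlers(int fail)
{
    uint32_t status = 0;   /* initialised so a failed query cannot leak garbage */
    int count = 0;

    if (get_metrics(fail, &status))
        return 0;          /* bail out instead of reading 'status' */

    for (int bit = 0; bit < 32; bit++)
        if (status & (1u << bit))
            count++;

    return count;
}
```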
Jiansong Chen [Wed, 12 Aug 2020 07:57:32 +0000 (15:57 +0800)]
drm/amdgpu: disable gfxoff for navy_flounder
gfxoff is temporarily disabled for navy_flounder,
since at present the feature breaks some basic
amdgpu tests.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Mon, 10 Aug 2020 05:27:56 +0000 (13:27 +0800)]
drm/amd/powerplay: bump NAVI12 driver if version
To fit the latest SMU firmware.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 6 Aug 2020 08:49:19 +0000 (16:49 +0800)]
drm/amd/powerplay: maximum the code sharing around metrics table retrieving
Instead of having one copy in each ASIC.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 6 Aug 2020 07:38:25 +0000 (15:38 +0800)]
drm/amd/powerplay: update the metrics table cache interval as 1ms
This makes the setting the same as for Arcturus/Navi1x/Sienna_Cichlid.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Oak Zeng [Fri, 7 Aug 2020 03:17:35 +0000 (22:17 -0500)]
drm/amdgpu: Use function pointer for some mmhub functions
Add more function pointers to amdgpu_mmhub_funcs. The ASIC-specific
implementations of most mmhub functions are now called through a general
function pointer, instead of calling a different function for
each ASIC. Simplify the code by deleting duplicate functions.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
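The function-pointer approach described in the commit above can be sketched as a per-ASIC table of callbacks that generic code calls through. The table members, the two "versions", and the wrapper functions below are hypothetical; only the general idea of an ops table such as amdgpu_mmhub_funcs is taken from the commit message.

```c
/* Hypothetical ops table; member names are illustrative only. */
struct mmhub_funcs_sketch {
    int (*gart_enable)(void);
    unsigned int (*get_fb_location)(void);
};

struct gpu_dev {
    const struct mmhub_funcs_sketch *mmhub;  /* selected once per ASIC */
};

static int v1_gart_enable(void) { return 1; }
static unsigned int v1_fb_location(void) { return 0x1000; }

static int v2_gart_enable(void) { return 2; }
static unsigned int v2_fb_location(void) { return 0x2000; }

static const struct mmhub_funcs_sketch mmhub_v1_funcs = {
    .gart_enable = v1_gart_enable,
    .get_fb_location = v1_fb_location,
};

static const struct mmhub_funcs_sketch mmhub_v2_funcs = {
    .gart_enable = v2_gart_enable,
    .get_fb_location = v2_fb_location,
};

/* Generic callers no longer need an if/else chain per ASIC. */
static int dev_gart_enable(const struct gpu_dev *dev)
{
    return dev->mmhub->gart_enable();
}

static unsigned int dev_fb_location(const struct gpu_dev *dev)
{
    return dev->mmhub->get_fb_location();
}
```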
Nirmoy Das [Tue, 11 Aug 2020 14:10:19 +0000 (16:10 +0200)]
drm/amdgpu: pass NULL pointer instead of 0
Fixes: c030f2e4166c3f55 ("drm/amdgpu: add amdgpu_ras.c to support ras (v2)")
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Thu, 6 Aug 2020 06:48:15 +0000 (14:48 +0800)]
drm/amdgpu: annotate a false positive recursive locking
[ 584.110304] ============================================
[ 584.110590] WARNING: possible recursive locking detected
[ 584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G OE
[ 584.111164] --------------------------------------------
[ 584.111456] kworker/38:1/553 is trying to acquire lock:
[ 584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[ 584.112112]
but task is already holding lock:
[ 584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[ 584.113068]
other info that might help us debug this:
[ 584.113689] Possible unsafe locking scenario:
[ 584.114350] CPU0
[ 584.114685] ----
[ 584.115014] lock(&adev->reset_sem);
[ 584.115349] lock(&adev->reset_sem);
[ 584.115678]
*** DEADLOCK ***
[ 584.116624] May be due to missing lock nesting notation
[ 584.117284] 4 locks held by kworker/38:1/553:
[ 584.117616] #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630
[ 584.117967] #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630
[ 584.118358] #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
[ 584.118786] #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[ 584.119222]
stack backtrace:
[ 584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G OE 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
[ 584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[ 584.121638] Call Trace:
[ 584.122050] dump_stack+0x98/0xd5
[ 584.122499] __lock_acquire+0x1139/0x16e0
[ 584.122931] ? trace_hardirqs_on+0x3b/0xf0
[ 584.123358] ? cancel_delayed_work+0xa6/0xc0
[ 584.123771] lock_acquire+0xb8/0x1c0
[ 584.124197] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[ 584.124599] down_write+0x49/0x120
[ 584.125032] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[ 584.125472] amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[ 584.125910] ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
[ 584.126367] amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
[ 584.126789] process_one_work+0x29e/0x630
[ 584.127208] worker_thread+0x3c/0x3f0
[ 584.127621] ? __kthread_parkme+0x61/0x90
[ 584.128014] kthread+0x12f/0x150
[ 584.128402] ? process_one_work+0x630/0x630
[ 584.128790] ? kthread_park+0x90/0x90
[ 584.129174] ret_from_fork+0x3a/0x50
Each adev has its own lock_class_key to avoid the false positive
recursive locking report.
v2:
1. register adev->lock_key into lockdep, otherwise lockdep will
report the below warning
[ 1216.705820] BUG: key ffff890183b647d0 has not been registered!
[ 1216.705924] ------------[ cut here ]------------
[ 1216.705972] DEBUG_LOCKS_WARN_ON(1)
[ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 lockdep_init_map+0x150/0x210
v3:
change to use down_write_nest_lock to annotate the false dead-lock
warning.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wenhui Sheng [Tue, 11 Aug 2020 03:02:03 +0000 (11:02 +0800)]
drm/amdgpu: add debugfs interface for RAP test
After the amdgpu driver has loaded successfully, the
RAP debugfs interface <debugfs_dir>/dri/xxx/rap_test
can be used to trigger the RAP test.
Currently only the L0 validate test is supported.
v2: refine amdgpu_rap.h
Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: Guchun Chen <Guchun.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wenhui Sheng [Fri, 17 Jul 2020 08:55:20 +0000 (16:55 +0800)]
drm/amdgpu: enable RAP TA load
Enable the RAP TA loading path and add RAP test
trigger interface.
v2: fix potential mem leak issue
Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: Guchun Chen <Guchun.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wenhui Sheng [Thu, 23 Jul 2020 02:57:06 +0000 (10:57 +0800)]
drm/amdgpu: add RAP TA header file
The RAP TA contains tests used to verify that
RAP (Register Access Policy), also known
as Security Policy, is applied correctly
by PSP BL & TOS.
The RAP test is a measure to reduce
complexity and mistakes when dealing
with RAP in post-si execution, where debugging failures
related to RAP is quite difficult and expensive.
v2: add introduction for RAP TA
Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: Guchun Chen <Guchun.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tianci.Yin [Mon, 20 Jul 2020 07:47:37 +0000 (15:47 +0800)]
drm/amdgpu: reconfigure spm golden settings on Navi1x after GFXOFF exit(v3)
On Navi1x, the SPM golden settings are lost after GFXOFF
enter/exit, so reconfigure the golden settings after GFXOFF
exit.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tianci.Yin [Fri, 19 Jun 2020 08:01:11 +0000 (16:01 +0800)]
drm/amdgpu: add interface amdgpu_gfx_init_spm_golden for Navi1x
On Navi1x, the SPM golden settings are lost after GFXOFF
enter/exit, so reconfiguration is needed. Expose the
configuration code as an interface for future use.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Tue, 4 Aug 2020 07:05:01 +0000 (15:05 +0800)]
drm/amdgpu: add debugfs node to toggle ras error cnt harvest
Before ras recovery is issued, the user can use this debugfs
node to enable/disable the harvest of all RAS IPs' ras error
count registers. Disabling the harvest preserves the state of
the hardware registers instead of clearing them.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Tue, 4 Aug 2020 07:00:53 +0000 (15:00 +0800)]
drm/amdgpu: bypass querying ras error count registers
Once ras recovery is issued by ras sync flood interrupt or
ras controller interrupt, add this guard to bypass or execute
ras error count register harvest of all IPs.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin [Thu, 6 Aug 2020 09:04:33 +0000 (14:34 +0530)]
drm/amdgpu: Enable P2P dmabuf over XGMI
Access the exported P2P dmabuf over XGMI, if available.
Otherwise, fall back to the existing PCIe method.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin <apaneers@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Qinglang Miao [Mon, 10 Aug 2020 12:59:19 +0000 (20:59 +0800)]
drm/amd/display: convert to use le16_add_cpu()
Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu().
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
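The conversion named in the commit above replaces an open-coded "convert to CPU order, add, convert back" sequence with a single helper. The sketch below mirrors that behaviour in portable user-space C so it can be compiled outside the kernel. The type le16_t and the three function names are hypothetical stand-ins for __le16, le16_to_cpu(), cpu_to_le16(), and le16_add_cpu().

```c
#include <stdint.h>
#include <string.h>

typedef uint16_t le16_t;   /* value stored in little-endian byte order */

/* Read a little-endian 16-bit value regardless of host endianness. */
static uint16_t le16_to_host(le16_t v)
{
    uint8_t b[2];

    memcpy(b, &v, sizeof(b));
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Store a host-order value as little-endian bytes. */
static le16_t host_to_le16(uint16_t v)
{
    uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
    le16_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Same effect as *var = host_to_le16(le16_to_host(*var) + val). */
static void le16_add_host(le16_t *var, uint16_t val)
{
    *var = host_to_le16((uint16_t)(le16_to_host(*var) + val));
}
```

The helper keeps the stored bytes little-endian while the arithmetic is done in host order, which is the point of the conversion.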
Alex Deucher [Mon, 10 Aug 2020 15:55:32 +0000 (11:55 -0400)]
drm/amdgpu/display: drop unused function
This is no longer used as of the latest rework of this
code so drop it to avoid an unused function warning.
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Daniel Kolesa [Sat, 8 Aug 2020 20:44:58 +0000 (22:44 +0200)]
drm/amd/display: add DCN support for aarch64
This adds ARM64 support into the DCN. This mainly enables support
for Navi graphics cards. The dcn10 changes haven't been tested,
since I don't have the relevant hardware available, but there
is no way to conditionally disable them, so I've done them anyway.
Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Daniel Kolesa [Sat, 8 Aug 2020 20:42:35 +0000 (22:42 +0200)]
drm/amdgpu/display: use GFP_ATOMIC in dcn20_validate_bandwidth_internal
GFP_KERNEL may and will sleep, and this is being executed in
a non-preemptible context; this will mess things up since it's
called in between DC_FP_START/END, and rescheduling will result
in the DC_FP_END later being called in a different context (or
just crashing if any floating point/vector registers/instructions
are used after the call is resumed in a different context).
Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jaehyun Chung [Thu, 30 Jul 2020 20:31:29 +0000 (16:31 -0400)]
drm/amd/display: Blank stream before destroying HDCP session
[Why]
Stream disable sequence incorrectly destroys HDCP session while stream is
not blanked and while audio is not muted. This sequence causes a flash
of corruption during mode change and an audio click.
[How]
Change sequence to blank stream before destroying HDCP session. Audio will
also be muted by blanking the stream.
Cc: stable@vger.kernel.org
Signed-off-by: Jaehyun Chung <jaehyun.chung@amd.com>
Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stylon Wang [Tue, 28 Jul 2020 07:10:35 +0000 (15:10 +0800)]
drm/amd/display: Fix EDID parsing after resume from suspend
[Why]
After resuming from suspend, CEA blocks from the EDID are not parsed and no
video modes can support YUV420. When this happens, output bpc cannot go over
8-bit with 4K modes on HDMI.
[How]
In amdgpu_dm_update_connector_after_detect(), drm_add_edid_modes() is
called after drm_connector_update_edid_property() to fully parse EDID
and update display info.
Cc: stable@vger.kernel.org
Signed-off-by: Stylon Wang <stylon.wang@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alvin Lee [Thu, 30 Jul 2020 03:08:59 +0000 (23:08 -0400)]
drm/amd/display: Disconnect pipe separetely when disable pipe split
[Why]
When changing pixel formats for HDR (e.g. ARGB -> FP16)
there are configurations that change from 2 pipes to 1 pipe.
In these cases, it seems that disconnecting MPCC and doing
a surface update at the same time(after unlocking) causes
some registers to be updated slightly faster than others
after unlocking (e.g. if the pixel format is updated to FP16
before the new surface address is programmed, we get
corruption on the screen because the pixel formats aren't
matching). We separate disconnecting MPCC from the rest
of the pipe programming sequence to prevent this.
[How]
Move the MPCC disconnect into an operation separate from the
rest of the pipe programming.
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Koo [Wed, 29 Jul 2020 21:43:10 +0000 (17:43 -0400)]
drm/amd/display: Switch to immediate mode for updating infopackets
[Why]
Using FRAME_UPDATE can result in the infopacket being updated
one frame late.
In commit stream scenarios for previously active stream, some stale
infopacket data from previous config might be erroneously sent out on
initial frame after stream is re-enabled.
[How]
Switch to using IMMEDIATE_UPDATE mode
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Reviewed-by: Ashley Thomas <Ashley.Thomas2@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Koo [Wed, 29 Jul 2020 21:33:27 +0000 (17:33 -0400)]
drm/amd/display: Fix LFC multiplier changing erratically
[Why]
1. There is a calculation that is using frame_time_in_us instead of
last_render_time_in_us to calculate whether choosing an LFC multiplier
would cause the inserted frame duration to be outside of range.
2. We do not handle unsigned integer subtraction correctly and it underflows
to a really large value, which causes some logic errors.
[How]
1. Fix logic to calculate 'within range' using last_render_time_in_us
2. Split out delta_from_mid_point_delta_in_us calculation to ensure
we don't underflow and wrap around
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
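Point 2 of the commit above concerns unsigned subtraction wrapping around to a very large value. The following sketch shows the effect on 32-bit unsigned values and one way to compute a difference that cannot wrap. The function names are hypothetical and this is not the freesync code itself.

```c
#include <stdint.h>

/* Naive difference: wraps modulo 2^32 when a < b. */
static uint32_t delta_naive(uint32_t a, uint32_t b)
{
    return a - b;
}

/* Absolute difference: compare first, then subtract the smaller value. */
static uint32_t delta_abs(uint32_t a, uint32_t b)
{
    return a > b ? a - b : b - a;
}
```

With a = 10 and b = 20, the naive version yields 4294967286 while the guarded version yields 10.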
Xiaodong Yan [Tue, 28 Jul 2020 10:12:45 +0000 (18:12 +0800)]
drm/amd/display: mpcc black color should not be impacted by pixel encoding format
[Why]
The format in MPCC should be 444
[How]
do not modify the mpcc black color according to pixel encoding format
Signed-off-by: Xiaodong Yan <Xiaodong.Yan@amd.com>
Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alvin Lee [Wed, 29 Jul 2020 21:49:14 +0000 (17:49 -0400)]
drm/amd/display: Revert regression
[Why]
Caused pipe split regression
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Tue, 28 Jul 2020 01:21:16 +0000 (21:21 -0400)]
drm/amd/display: Fix incorrect backlight register offset for DCN
[Why]
Typo in backlight refactor introduced wrong register offset.
[How]
Change DCE to DCN register map for PWRSEQ_REF_DIV
Cc: stable@vger.kernel.org
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Ashley Thomas <Ashley.Thomas2@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Joshua Aberback [Thu, 16 Jul 2020 16:39:27 +0000 (12:39 -0400)]
drm/amd/display: Adjust static-ness of resource functions
[Why]
Register definitions are asic-specific, so functions that use registers of
a particular asic should be static, to be exposed in asic-specific function
pointer structures.
[How]
- make register-definition-using functions static
- make some functions non-static, for future use
- remove duplicate function definition
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Monk Liu [Mon, 10 Aug 2020 06:12:06 +0000 (14:12 +0800)]
drm/amdgpu: fix reload KMD hang on GFX10 KIQ
GFX10 KIQ will hang if we try below steps:
modprobe amdgpu
rmmod amdgpu
modprobe amdgpu sched_hw_submission=4
Because KIQ stays alive even after the KMD is unloaded,
KIQ crashes on reload when its registers are
programmed with values that differ from the previous load
(the config, such as the HQD addr and ring size, changes easily if we alter
sched_hw_submission).
The fix is to deactivate KIQ first before touching any
of its registers.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
shiwu.zhang [Fri, 7 Aug 2020 08:43:59 +0000 (16:43 +0800)]
drm/amdgpu: update gc golden register for arcturus
Update golden setting to improve performance on HPC
and ML apps
Signed-off-by: shiwu.zhang <shiwu.zhang@amd.com>
Tested-by: gang.long <gang.long@amd.com>
Reviewed-by: guchun.chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 7 Aug 2020 09:01:47 +0000 (17:01 +0800)]
drm/amd/powerplay: correct UVD/VCE PG state on custom pptable uploading
The UVD/VCE PG state is managed by UVD and VCE IP. It's error-prone to
assume the bootup state in SMU based on the dpm status.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 7 Aug 2020 07:03:40 +0000 (15:03 +0800)]
drm/amd/powerplay: correct Vega20 cached smu feature state
Correct the cached smu feature state on pp_features sysfs
setting.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Liu ChengZhe [Thu, 6 Aug 2020 06:54:08 +0000 (14:54 +0800)]
drm/amdgpu: Skip some registers config for SRIOV
Some registers are not accessible to virtual function setup, so
skip their initialization when in VF-SRIOV mode.
v2: move SRIOV VF check into specific functions;
modify commit description and comment.
Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jay Cornwall [Fri, 24 Jul 2020 23:58:48 +0000 (16:58 -0700)]
drm/amdkfd: Fix spurious debug exception on gfx10
s_barrier triggers a debug exception when issued with PRIV=1,
DEBUG_EN=1. This causes spurious notifications to rocm-gdb.
Clear MODE before issuing s_barrier and restore MODE afterwards
in the context restore handler.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Tested-by: Laurent Morichetti <laurent.morichetti@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Fri, 7 Aug 2020 22:23:56 +0000 (18:23 -0400)]
Revert "drm/amdkfd: Unify gfx9/gfx10 context save area layouts"
This reverts commit 0a5baee415000a3e18730ac98e19d046c3cebbe6.
The change introduced a regression on some chips. Reverting until
a proper solution can be found.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Fri, 7 Aug 2020 22:22:27 +0000 (18:22 -0400)]
Revert "drm/amdkfd: Fix spurious debug exception on gfx10"
This reverts commit ea368183ae900e376b66d3f23da22acde48e385a.
Needed due to conflicts when reverting "drm/amdkfd: Unify gfx9/gfx10
context save area layouts".
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christophe JAILLET [Sun, 9 Aug 2020 20:34:06 +0000 (22:34 +0200)]
drm: amdgpu: Use the correct size when allocating memory
When '*sgt' is allocated, we must allocate 'sizeof(**sgt)' bytes instead
of 'sizeof(*sg)'.
The sizeof(*sg) is bigger than sizeof(**sgt) so this wastes memory but
it won't lead to corruption.
Fixes: f44ffd677fb3 ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
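The sizing mistake fixed above is easy to avoid by deriving the allocation size from the pointer being assigned rather than from an unrelated variable. The sketch below shows that idiom with calloc() in user space; the two struct types and table_alloc() are hypothetical and only loosely modelled on a scatter-gather table.

```c
#include <stdlib.h>

struct sg_entry_sketch {          /* hypothetical element type */
    void *page;
    unsigned int length;
    unsigned int offset;
};

struct sg_table_sketch {          /* hypothetical table type */
    struct sg_entry_sketch *sgl;
    unsigned int nents;
};

/* Allocates one zeroed table and stores it in *sgt.
 * sizeof(**sgt) always matches the object that *sgt points to,
 * even if the type of the table changes later. */
static int table_alloc(struct sg_table_sketch **sgt)
{
    *sgt = calloc(1, sizeof(**sgt));
    return *sgt ? 0 : -1;
}
```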
Sandeep Raghuraman [Thu, 6 Aug 2020 17:22:20 +0000 (22:52 +0530)]
drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume
Reproducing bug report here:
After hibernating and resuming, DPM is not enabled. This remains the case
even if you test hibernate using the steps here:
https://www.kernel.org/doc/html/latest/power/basic-pm-debugging.html
I debugged the problem, and figured out that in the file hardwaremanager.c,
in the function, phm_enable_dynamic_state_management(), the check
'if (!hwmgr->pp_one_vf && smum_is_dpm_running(hwmgr) && !amdgpu_passthrough(adev) && adev->in_suspend)'
returns true for the hibernate case, and false for the suspend case.
This means that for the hibernate case, the AMDGPU driver doesn't enable DPM
(even though it should) and simply returns from that function.
In the suspend case, it goes ahead and enables DPM, even though it doesn't need to.
I debugged further, and found out that in the case of suspend, for the
CIK/Hawaii GPUs, smum_is_dpm_running(hwmgr) returns false, while in the case of
hibernate, smum_is_dpm_running(hwmgr) returns true.
For CIK, the ci_is_dpm_running() function calls the ci_is_smc_ram_running() function,
which is ultimately used to determine if DPM is currently enabled or not,
and this seems to provide the wrong answer.
I've changed the ci_is_dpm_running() function to instead use the same method that
some other AMD GPU chips do (e.g. Fiji), which seems to read the voltage controller.
I've tested on my R9 390 and it seems to work correctly for both suspend and
hibernate use cases, and has been stable so far.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208839
Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Tue, 4 Aug 2020 04:32:13 +0000 (12:32 +0800)]
drm/amdgpu: unlock mutex on error
Make sure to unlock the mutex when an error happens.
v2:
1. correct syntax error in the commit comments
2. remove change-Id
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
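A common way to guarantee that a mutex is released on every error path is to route all exits through a single unlock label. The sketch below shows that pattern with a POSIX mutex in user space; step(), guarded_op(), and dev_lock are hypothetical names and the kernel patch itself may be structured differently.

```c
#include <pthread.h>

static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical operation that can fail. */
static int step(int fail)
{
    return fail ? -1 : 0;
}

static int guarded_op(int fail_first)
{
    int ret;

    pthread_mutex_lock(&dev_lock);

    ret = step(fail_first);
    if (ret)
        goto out_unlock;   /* error path still releases the lock */

    ret = step(0);

out_unlock:
    pthread_mutex_unlock(&dev_lock);
    return ret;
}
```

After a failing call the mutex can be taken again immediately, which would not be the case if the error path returned without unlocking.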
Evan Quan [Wed, 5 Aug 2020 09:24:41 +0000 (17:24 +0800)]
drm/amd/powerplay: put VCN/JPEG into PG ungate state before dpm table setup(V3)
VCN-related dpm table setup needs VCN to be in the PG ungate state. The same
logic applies to JPEG.
V2: fix paste typo
V3: code cosmetic
Signed-off-by: Evan Quan <evan.quan@amd.com>
Tested-by: Matt Coffin <mcoffin13@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Mon, 3 Aug 2020 03:15:14 +0000 (11:15 +0800)]
drm/amd/powerplay: update swSMU VCN/JPEG PG logics
Add lock protections and avoid unnecessary actions
if the PG state is already the same as required.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Tested-by: Matt Coffin <mcoffin13@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
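The two ideas in the commit above, taking a lock around the state change and skipping the work when the requested state is already in effect, can be sketched as follows. The struct, its fields, and pg_set() are hypothetical; the hw_writes counter stands in for the real hardware programming.

```c
#include <pthread.h>
#include <stdbool.h>

struct pg_block {
    pthread_mutex_t lock;
    bool gated;      /* cached power-gate state */
    int hw_writes;   /* counts simulated hardware programming calls */
};

static void pg_set(struct pg_block *b, bool gate)
{
    pthread_mutex_lock(&b->lock);

    /* Only touch the (simulated) hardware when the state changes. */
    if (b->gated != gate) {
        b->hw_writes++;
        b->gated = gate;
    }

    pthread_mutex_unlock(&b->lock);
}
```

Repeated requests for the same state leave hw_writes unchanged.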
Likun Gao [Thu, 6 Aug 2020 09:37:28 +0000 (17:37 +0800)]
drm/amdgpu: use mode1 reset by default for sienna_cichlid
Switch default gpu reset method for sienna_cichlid to MODE1 reset.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 28 Jul 2020 15:08:02 +0000 (11:08 -0400)]
drm/amd/display: Drop dm_determine_update_type_for_commit
[Why]
This was added in the past to solve the issue of not knowing when
to stall for medium and full updates in DM.
Since DC ultimately decides what requires bandwidth changes we
wanted to make use of it directly to determine this.
The problem is that we can't actually pass any of the stream or surface
updates into DC global validation, so we don't actually check if the new
configuration is valid - we just validate the old existing config
instead and stall for outstanding commits to finish.
There's also the problem of grabbing the DRM private object for
pageflips which can lead to page faults in the case where commits
execute out of order and free a DRM private object state that was
still required for commit tail.
[How]
Now that we reset the plane in DM with the same conditions DC checks
we can have planes go through DC validation and we know when we need
to check and stall based on whether the stream or planes changed.
We mark lock_and_validation_needed whenever we've done this, so just
go back to using that instead of dm_determine_update_type_for_commit.
Since we'll skip resetting the plane for a pageflip we will no longer
grab the DRM private object for pageflips as well, avoiding the
page fault issue caused by pageflipping under load with commits
executing out of order.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 28 Jul 2020 14:48:21 +0000 (10:48 -0400)]
drm/amd/display: Reset plane for anything that's not a FAST update
[Why]
MEDIUM or FULL updates can require global validation or affect
bandwidth. By treating these all simply as surface updates we aren't
actually passing this through DC global validation.
[How]
There's currently no way to pass surface updates through DC global
validation, nor do I think it's a good idea to change the interface
to accept these.
DC global validation itself is currently stateless, and we can move
our update type checking to be stateless as well by duplicating DC
surface checks in DM based on DRM properties.
We wanted to rely on DC automatically determining this since DC knows
best, but DM is ultimately what fills in everything into DC plane
state so it does need to know as well.
There are basically only three paths that we exercise in DM today:
1) Cursor (async update)
2) Pageflip (fast update)
3) Full pipe programming (medium/full updates)
Which means that anything that's more than a pageflip really needs to
go down path #3.
So this change instead duplicates all the surface update checks, based on
DRM state, inside of should_reset_plane().
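
The three paths can be sketched roughly as below. This is only an
illustration of the routing rule described here; the enums and the helpers
pick_dm_path() and plane_needs_reset() are hypothetical and are not the
actual should_reset_plane() implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical update classes, mirroring the FAST/MEDIUM/FULL wording. */
enum update_type { UPDATE_FAST, UPDATE_MEDIUM, UPDATE_FULL };

/* The three paths listed in the commit message. */
enum dm_path { PATH_CURSOR, PATH_PAGEFLIP, PATH_FULL_PROGRAMMING };

/* Anything that is more than a pageflip goes down the full path. */
static enum dm_path pick_dm_path(bool is_cursor, enum update_type type)
{
    assert(type >= UPDATE_FAST && type <= UPDATE_FULL);
    if (is_cursor)
        return PATH_CURSOR;
    if (type == UPDATE_FAST)
        return PATH_PAGEFLIP;
    return PATH_FULL_PROGRAMMING;
}

/* A plane is reset for anything that is not a FAST update. */
static bool plane_needs_reset(enum update_type type)
{
    return type != UPDATE_FAST;
}
```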
Next step is dropping dm_determine_update_type_for_commit and we no
longer require the old DC state at all for global validation.
Optimization can come later so we don't reset DC planes at all for
MEDIUM updates and avoid validation, but we might require some extra
checks in DM to achieve this.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Thu, 6 Aug 2020 19:48:10 +0000 (15:48 -0400)]
drm/amd/display: Use validated tiling_flags and tmz_surface in commit_tail
[Why]
So we're not racing with userspace or deadlocking DM.
[How]
These flags are now stored on dm_plane_state itself and acquired and
validated during commit_check, so just use those instead.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 28 Jul 2020 14:03:10 +0000 (10:03 -0400)]
drm/amd/display: Avoid using unvalidated tiling_flags and tmz_surface in prepare_planes
[Why]
We're racing with userspace as the flags could potentially change
from when we acquired and validated them in commit_check.
[How]
We unfortunately can't drop this function in its entirety from
prepare_planes since we don't know the afb->address at commit_check
time yet.
So instead of querying new tiling_flags and tmz_surface use the ones
from the plane_state directly.
While we're at it, also update the force_disable_dcc option based
on the state from atomic check.
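
A rough sketch of why using the stored values avoids the race, under the
assumption that the check phase snapshots the framebuffer metadata once.
fake_fb, fake_plane_state, check_plane() and prepare_plane() are
hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical framebuffer whose metadata userspace may change at any time. */
struct fake_fb {
    uint64_t tiling_flags;
    bool tmz_surface;
};

/* Hypothetical plane state that carries the validated snapshot. */
struct fake_plane_state {
    uint64_t tiling_flags;
    bool tmz_surface;
};

/* Check phase: read the metadata once and keep what was validated. */
static void check_plane(struct fake_plane_state *state, const struct fake_fb *fb)
{
    assert(state != NULL && fb != NULL);
    state->tiling_flags = fb->tiling_flags;
    state->tmz_surface = fb->tmz_surface;
}

/* Prepare phase: use the snapshot rather than querying the fb again. */
static uint64_t prepare_plane(const struct fake_plane_state *state)
{
    assert(state != NULL);
    return state->tiling_flags;
}
```

If the framebuffer's flags change between check_plane() and
prepare_plane(), the prepare step still sees the validated values.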
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 28 Jul 2020 13:59:53 +0000 (09:59 -0400)]
drm/amd/display: Reset plane when tiling flags change
[Why]
Enabling or disabling DCC or switching between tiled and linear formats
can require bandwidth updates.
They're currently skipping all DC validation by being treated as purely
surface updates.
[How]
Treat tiling_flag changes (which encode DCC state) as a condition for
resetting the plane.
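
As a sketch, the condition amounts to comparing the old and new flags. The
struct and the EXAMPLE_DCC_BIT position are hypothetical; the real
tiling_flags layout is defined by the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit used here to stand for "DCC enabled". */
#define EXAMPLE_DCC_BIT (1ULL << 0)

/* Hypothetical per-plane snapshot of the tiling flags. */
struct plane_snapshot {
    uint64_t tiling_flags;
};

/* Any change in the flags (tiling mode or DCC) asks for a plane reset. */
static bool tiling_change_needs_reset(const struct plane_snapshot *old_state,
                                      const struct plane_snapshot *new_state)
{
    assert(old_state != NULL && new_state != NULL);
    return old_state->tiling_flags != new_state->tiling_flags;
}
```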
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 28 Jul 2020 13:44:26 +0000 (09:44 -0400)]
drm/amd/display: Store tiling_flags and tmz_surface on dm_plane_state
[Why]
Store these in advance so we can reuse them later in commit_tail without
having to reserve the fbo again.
These will also be used for checking for tiling changes when deciding
to reset the plane or not.
[How]
This change should mostly be a refactor. Only commit check is affected
for now and I'll drop the get_fb_info calls in prepare_planes and
commit_tail after.
This runs a prepass loop once we think that all planes have been added
to the context and replaces the get_fb_info calls with accessing the
dm_plane_state instead.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Likun Gao [Thu, 6 Aug 2020 06:41:06 +0000 (14:41 +0800)]
drm/amd/powerplay: update driver if file for sienna_cichlid
Update the driver if (interface) file for sienna_cichlid.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pierre-Eric Pelloux-Prayer [Thu, 30 Jul 2020 13:54:59 +0000 (15:54 +0200)]
drm/amdgpu: new ids flag for tmz (v2)
Allows UMD to know if TMZ is supported and enabled.
This commit also bumps KMS_DRIVER_MINOR because if we don't
UMD can't tell if "(ids_flags & AMDGPU_IDS_FLAGS_TMZ) == 0" means
"tmz is not enabled" or "tmz may be enabled but the kernel doesn't
report it".
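
From the UMD side the resulting three-way answer could look like the sketch
below. The constant values and the query_tmz() helper are assumptions for
illustration; the real flag value and minimum minor version should be taken
from the kernel's uapi header and driver version history.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; use the kernel's definitions. */
#define EXAMPLE_IDS_FLAGS_TMZ      0x4u
#define EXAMPLE_MIN_MINOR_FOR_TMZ  40

enum tmz_status { TMZ_UNKNOWN, TMZ_DISABLED, TMZ_ENABLED };

/*
 * On an older kernel a cleared bit carries no information, so the answer
 * is "unknown"; on a new enough kernel the bit can be trusted.
 */
static enum tmz_status query_tmz(int drm_minor, uint64_t ids_flags)
{
    assert(drm_minor >= 0);
    if (drm_minor < EXAMPLE_MIN_MINOR_FOR_TMZ)
        return TMZ_UNKNOWN;
    return (ids_flags & EXAMPLE_IDS_FLAGS_TMZ) ? TMZ_ENABLED : TMZ_DISABLED;
}
```

Note the parentheses around the mask test: in C, == binds tighter than &.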
v2: use amdgpu_is_tmz() and reworded commit message.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 07:28:40 +0000 (15:28 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Vega12
For the gpu metrics export, the metrics cache makes no sense. It's up to
the user to decide how often the metrics should be retrieved.
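
A minimal sketch of such a control, assuming a time-based cache in front of
the firmware read. The struct, the lifetime constant and get_metrics() are
hypothetical; the caller passes the current time so the example stays
deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_CACHE_TTL_MS 1UL   /* hypothetical cache lifetime */

/* Hypothetical cached metrics table, reduced to a single value. */
struct metrics_cache {
    bool valid;
    unsigned long fetched_at_ms;
    int value;
    int firmware_reads;            /* counts simulated firmware queries */
};

/* Placeholder for the real (expensive) firmware query. */
static int read_metrics_from_firmware(struct metrics_cache *c)
{
    c->firmware_reads++;
    return 1000 + c->firmware_reads;
}

/* Serve from the cache unless it is stale or the caller bypasses it. */
static int get_metrics(struct metrics_cache *c, unsigned long now_ms,
                       bool bypass_cache)
{
    assert(c != NULL);
    if (bypass_cache || !c->valid ||
        now_ms - c->fetched_at_ms >= EXAMPLE_CACHE_TTL_MS) {
        c->value = read_metrics_from_firmware(c);
        c->fetched_at_ms = now_ms;
        c->valid = true;
    }
    return c->value;
}
```

With bypass_cache set, every call reaches the firmware, so the caller alone
decides the sampling rate.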
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 07:24:08 +0000 (15:24 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Vega20
For the gpu metrics export, the metrics cache makes no sense. It's up to
the user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 07:02:11 +0000 (15:02 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Renoir
For the gpu metrics export, the metrics cache makes no sense. It's up to
the user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 07:09:57 +0000 (15:09 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Sienna Cichlid
For the gpu metrics export, the metrics cache makes no sense. It's up to
the user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 06:55:32 +0000 (14:55 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Navi10
For the gpu metrics export, the metrics cache makes no sense. It's up to
the user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 06:31:21 +0000 (14:31 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Arcturus
For the gpu metrics export, the metrics cache makes no sense. It's up to
the user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 04:39:58 +0000 (12:39 +0800)]
drm/amd/powerplay: add Vega12 support for gpu metrics export
Add Vega12 gpu metrics export interface.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 04:23:42 +0000 (12:23 +0800)]
drm/amd/powerplay: add Vega20 support for gpu metrics export
Add Vega20 gpu metrics export interface.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 03:40:07 +0000 (11:40 +0800)]
drm/amd/powerplay: enable gpu_metrics export on legacy powerplay routines
Enable gpu_metrics support on legacy powerplay routines.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Mon, 27 Jul 2020 08:24:46 +0000 (16:24 +0800)]
drm/amd/powerplay: add Renoir support for gpu metrics export(V2)
Add Renoir gpu metrics export interface.
V2: use memcpy to make code more compact
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Mon, 27 Jul 2020 02:00:47 +0000 (10:00 +0800)]
drm/amd/powerplay: add Sienna Cichlid support for gpu metrics export
Add Sienna Cichlid gpu metrics export interface.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 24 Jul 2020 09:24:34 +0000 (17:24 +0800)]
drm/amd/powerplay: add Navi1x support for gpu metrics export
Add Navi1x gpu metrics export interface.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 24 Jul 2020 09:47:03 +0000 (17:47 +0800)]
drm/amd/powerplay: update the data structure for NV12 SmuMetrics
Although it does not bring any problem for now, the coming gpu
metrics interface needs to handle them differently based on the
asic type.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 24 Jul 2020 02:42:39 +0000 (10:42 +0800)]
drm/amd/powerplay: add Arcturus support for gpu metrics export
Add Arcturus gpu metrics export interface.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 24 Jul 2020 10:39:33 +0000 (18:39 +0800)]
drm/amd/powerplay: implement SMU V11 common APIs for retrieving link speed/width
This will be shared across all SMU V11 asics.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 23 Jul 2020 10:03:35 +0000 (18:03 +0800)]
drm/amd/powerplay: add new sysfs interface for retrieving gpu metrics(V2)
A new interface for UMD to retrieve gpu metrics data.
V2: enrich the documentation
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 23 Jul 2020 08:07:01 +0000 (16:07 +0800)]
drm/amd/powerplay: define an universal data structure for gpu metrics (V4)
Thus we can provide an interface for UMD to retrieve gpu metrics data.
V2: better naming and comments
V3: two structures created for dGPU and APU separately
V4: add driver attached timestamp
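
One way a consumer could tell the dGPU and APU layouts apart is a small
versioned header at the start of the blob. The sketch below is an assumption
for illustration: the struct names, fields and the revision-to-layout
mapping are hypothetical and not the layout defined by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical common header placed at the start of every metrics blob. */
struct example_metrics_header {
    uint16_t structure_size;
    uint8_t format_revision;    /* assumed: 1 = dGPU layout, 2 = APU layout */
    uint8_t content_revision;
};

struct example_dgpu_metrics {
    struct example_metrics_header header;
    uint16_t temperature_edge;
    uint16_t average_socket_power;
    uint64_t timestamp;         /* driver attached timestamp */
};

struct example_apu_metrics {
    struct example_metrics_header header;
    uint16_t temperature_gfx;
    uint16_t average_cpu_power;
    uint64_t timestamp;         /* driver attached timestamp */
};

enum metrics_kind { METRICS_INVALID, METRICS_DGPU, METRICS_APU };

/* Validate the header against the buffer and report which layout follows. */
static enum metrics_kind classify_metrics(const void *buf, size_t len)
{
    struct example_metrics_header hdr;

    assert(buf != NULL);
    if (len < sizeof(hdr))
        return METRICS_INVALID;
    memcpy(&hdr, buf, sizeof(hdr));
    if (hdr.structure_size > len)
        return METRICS_INVALID;
    if (hdr.format_revision == 1 &&
        hdr.structure_size == sizeof(struct example_dgpu_metrics))
        return METRICS_DGPU;
    if (hdr.format_revision == 2 &&
        hdr.structure_size == sizeof(struct example_apu_metrics))
        return METRICS_APU;
    return METRICS_INVALID;
}
```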
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Colin Ian King [Wed, 5 Aug 2020 12:15:27 +0000 (13:15 +0100)]
drm/amdgpu: fix spelling mistake "paramter" -> "parameter"
There is a spelling mistake in a dev_warn message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Tue, 4 Aug 2020 08:58:30 +0000 (16:58 +0800)]
drm/amd/powerplay: grant Arcturus softmin/max setting on latest PM firmware
For Arcturus, the softmin/max settings from the driver are permitted on the
latest (54.26 and later) SMU firmware, so enable them in the driver.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Mon, 27 Jul 2020 13:06:18 +0000 (09:06 -0400)]
drm/amdkfd: option to disable system mem limit
If multiple processes share system memory through /dev/shm, a KFD memory
allocation should not fail when it reaches the system memory limit, because
one copy of physical system memory is shared by multiple processes.
Add the module parameter no_system_mem_limit to give the user the option to
disable the system memory limit check at runtime using sysfs or during
driver module init using a kernel boot argument. By default the system
memory limit is on.
Print a debug message to warn the user if a KFD memory allocation failed
because system memory reached the limit.
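
The effect of the option can be sketched as below. Only the parameter name
no_system_mem_limit comes from this commit; the accounting struct and
reserve_sysmem() are hypothetical and simplified.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical system memory accounting, in arbitrary units. */
struct sysmem_acct {
    uint64_t used;
    uint64_t limit;
};

/*
 * Reserve 'size' units. The limit check is skipped entirely when
 * no_system_mem_limit is set; otherwise exceeding it yields -ENOMEM.
 */
static int reserve_sysmem(struct sysmem_acct *acct, uint64_t size,
                          bool no_system_mem_limit)
{
    assert(acct != NULL);
    if (!no_system_mem_limit &&
        (size > acct->limit || acct->used > acct->limit - size))
        return -ENOMEM;
    acct->used += size;
    return 0;
}
```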
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tianjia Zhang [Sun, 2 Aug 2020 11:15:36 +0000 (19:15 +0800)]
drm/amd/display: Fix wrong return value in dm_update_plane_state()
On an error exit path, a negative error code should be returned
instead of a positive return value.
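
A short sketch of why the sign matters, assuming a caller that treats only
negative values as failure. update_plane_buggy(), update_plane_fixed() and
commit_failed() are hypothetical and not the code touched by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Buggy variant: the error code is returned with the wrong sign. */
static int update_plane_buggy(bool valid)
{
    return valid ? 0 : EINVAL;
}

/* Fixed variant: errors are negative, following kernel convention. */
static int update_plane_fixed(bool valid)
{
    return valid ? 0 : -EINVAL;
}

/* A caller that only looks for negative values misses positive codes. */
static bool commit_failed(int ret)
{
    return ret < 0;
}
```

With the buggy variant, commit_failed() reports success for an invalid
plane; with the fixed one the failure is seen.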
Fixes: 9e869063b0021 ("drm/amd/display: Move iteration out of dm_update_planes")
Cc: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rikard Falkeborn [Tue, 4 Aug 2020 20:06:55 +0000 (22:06 +0200)]
drm/amd/display: Constify dcn30_res_pool_funcs
The only usage of dcn30_res_pool_funcs is to assign its address to a
const pointer. Make it const to allow the compiler to put it in
read-only memory.
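
The pattern, sketched with hypothetical names (example_pool_funcs is not the
dcn30 table): a function-pointer table that is only ever read through a
const pointer can itself be declared const.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table, standing in for a resource pool funcs struct. */
struct example_pool_funcs {
    int (*validate_bandwidth)(int requested);
};

static int example_validate_bandwidth(int requested)
{
    return requested <= 100;    /* arbitrary limit for the sketch */
}

/* const lets the compiler place the table in read-only memory. */
static const struct example_pool_funcs example_pool_funcs = {
    .validate_bandwidth = example_validate_bandwidth,
};

struct example_pool {
    const struct example_pool_funcs *funcs; /* only ever read through */
};

static void example_pool_init(struct example_pool *pool)
{
    assert(pool != NULL);
    pool->funcs = &example_pool_funcs;
}
```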
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rikard Falkeborn [Tue, 4 Aug 2020 20:06:54 +0000 (22:06 +0200)]
drm/amd/display: Constify dcn21_res_pool_funcs
The only usage of dcn21_res_pool_funcs is to assign its address to a
const pointer. Make it const to allow the compiler to put it in
read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rikard Falkeborn [Tue, 4 Aug 2020 20:06:53 +0000 (22:06 +0200)]
drm/amd/display: Constify dcn20_res_pool_funcs
The only usage of dcn20_res_pool_funcs is to assign its address to a
const pointer. Make it const to allow the compiler to put it in
read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dan Carpenter [Mon, 3 Aug 2020 14:35:19 +0000 (17:35 +0300)]
drm/amd/display: Indent an if statement
The if statement wasn't indented so it's confusing.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 29 Jul 2020 17:14:17 +0000 (13:14 -0400)]
drm/amdgpu: move vram usage by vbios to mman (v2)
It's related to the memory manager so move it there.
v2: inline the structure
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 29 Jul 2020 17:02:25 +0000 (13:02 -0400)]
drm/amdgpu: move IP discovery data to mman
It's related to the memory manager so move it there.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 29 Jul 2020 16:53:56 +0000 (12:53 -0400)]
drm/amdgpu: move stolen memory from gmc to mman
It's more related to memory management than memory
controller.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 19:35:56 +0000 (15:35 -0400)]
drm/amdgpu/gmc: disable keep_stolen_vga_memory on arcturus
I suspect the only reason this was set was to avoid touching
the display related registers on arcturus. Someone should
double check this on arcturus with S3.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:34:50 +0000 (18:34 -0400)]
drm/amdgpu: drop the CPU pointers for the stolen vga bos
We never use them.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:30:14 +0000 (18:30 -0400)]
drm/amdgpu/gmc10: switch to using amdgpu_gmc_get_vbios_allocations
The new helper centralizes the logic in one place.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:29:55 +0000 (18:29 -0400)]
drm/amdgpu/gmc9: switch to using amdgpu_gmc_get_vbios_allocations
The new helper centralizes the logic in one place.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:29:39 +0000 (18:29 -0400)]
drm/amdgpu/gmc8: switch to using amdgpu_gmc_get_vbios_allocations
The new helper centralizes the logic in one place.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:29:20 +0000 (18:29 -0400)]
drm/amdgpu/gmc7: switch to using amdgpu_gmc_get_vbios_allocations
The new helper centralizes the logic in one place.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:27:46 +0000 (18:27 -0400)]
drm/amdgpu/gmc6: switch to using amdgpu_gmc_get_vbios_allocations
The new helper centralizes the logic in one place.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 19:04:52 +0000 (15:04 -0400)]
drm/amdgpu/gmc: add new helper to get the FB size used by pre-OS console
This adds a new gmc callback to get the size reserved by the pre-OS
console and provides a helper function for use by gmc IP drivers.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:05:11 +0000 (18:05 -0400)]
drm/amdgpu: add support for extended stolen vga memory
This will allow us to split the allocation for systems
where we have to keep the stolen memory around to avoid
S3 issues. This way we don't waste as much memory and
still avoid any screen artifacts during the bios to
driver transition.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 21:55:30 +0000 (17:55 -0400)]
drm/amdgpu: move keep stolen memory check into gmc core
Rather than leaving this as a gmc v9 specific hack.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 21:46:00 +0000 (17:46 -0400)]
drm/amdgpu: move stolen vga bo from amdgpu to amdgpu.gmc
Since that is where we store the other data related to
the stolen vga memory.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 18:10:46 +0000 (14:10 -0400)]
drm/amdgpu: use a define for the memory size of the vga emulator
Rather than open coding it everywhere.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>