platform/kernel/linux-rpi.git
4 years agodrm/amdgpu: clear err_event_athub flag after reset exit
Le Ma [Fri, 25 Oct 2019 09:19:38 +0000 (17:19 +0800)]
drm/amdgpu: clear err_event_athub flag after reset exit

Otherwise next err_event_athub error cannot call gpu reset. And following
resume sequence will not be affected by this flag.

v2: create function to clear amdgpu_ras_in_intr for modularity of ras driver

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: support full gpu reset workflow when ras err_event_athub occurs
Le Ma [Wed, 27 Nov 2019 05:17:17 +0000 (13:17 +0800)]
drm/amdgpu: support full gpu reset workflow when ras err_event_athub occurs

This athub fatal error can be recovered by baco without system-level reboot,
so add a mode to use baco for the recovery. Not affect the default psp reset
situations for now.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add concurrent baco reset support for XGMI
Le Ma [Tue, 26 Nov 2019 14:12:31 +0000 (22:12 +0800)]
drm/amdgpu: add concurrent baco reset support for XGMI

Currently each XGMI node reset wq does not run in parrallel if bound to same
cpu. Make change to bound the xgmi_reset_work item to different cpus.

XGMI requires all nodes enter into baco within very close proximity before
any node exit baco. So schedule the xgmi_reset_work wq twice for enter/exit
baco respectively.

To use baco for XGMI, PMFW supported for baco on XGMI needs to be involved.

The case that PSP reset and baco reset coexist within an XGMI hive never exist
and is not in the consideration.

v2: define use_baco flag to simplify the code for xgmi baco sequence

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper
Le Ma [Tue, 26 Nov 2019 09:24:56 +0000 (17:24 +0800)]
drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper

This operation is needed when baco entry/exit for ras recovery

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: clear uncorrectable parity error status bit
Le Ma [Fri, 22 Nov 2019 10:39:11 +0000 (18:39 +0800)]
drm/amdgpu: clear uncorrectable parity error status bit

This should be cleared during every nbif uncorrectable error cleanup work.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: clear ras controller status registers when interrupt occurs
Le Ma [Fri, 22 Nov 2019 09:56:47 +0000 (17:56 +0800)]
drm/amdgpu: clear ras controller status registers when interrupt occurs

To fix issue that ras controller interrupt cannot be triggered anymore after
one time nbif uncorrectable error. And error count is stored in nbif ras object
for query.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: export amdgpu_ras_find_obj to use externally
Le Ma [Mon, 25 Nov 2019 04:26:09 +0000 (12:26 +0800)]
drm/amdgpu: export amdgpu_ras_find_obj to use externally

Change it to external interface.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: remove ras global recovery handling from ras_controller_int handler
Le Ma [Mon, 21 Oct 2019 18:41:26 +0000 (02:41 +0800)]
drm/amdgpu: remove ras global recovery handling from ras_controller_int handler

v2: add notification when ras controller interrupt generates

Signed-off-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add cache flush workaround to gfx8 emit_fence
Pierre-Eric Pelloux-Prayer [Thu, 28 Nov 2019 11:08:58 +0000 (12:08 +0100)]
drm/amdgpu: add cache flush workaround to gfx8 emit_fence

The same workaround is used for gfx7.
Both PAL and Mesa use it for gfx8 too, so port this commit to
gfx_v8_0_ring_emit_fence_gfx.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add header line for power profile on Arcturus
Alex Deucher [Thu, 5 Dec 2019 03:07:49 +0000 (22:07 -0500)]
drm/amdgpu: add header line for power profile on Arcturus

So the output is consistent with other asics.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Eliminate unnecessary kernel queue function pointers
Yong Zhao [Fri, 8 Nov 2019 05:30:49 +0000 (00:30 -0500)]
drm/amdkfd: Eliminate unnecessary kernel queue function pointers

Up to this point, those functions are all the same for all ASICs, so
no need to call them by functions pointers. Removing the function
pointers will greatly increase the code readablity. If there is ever
need for those function pointers, we can add it back then.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx: Improvement on EDC GPR workarounds
James Zhu [Tue, 3 Dec 2019 20:40:10 +0000 (15:40 -0500)]
drm/amdgpu/gfx: Improvement on EDC GPR workarounds

SPI limits total CS waves in flight per SE to no more than 32 * num_cu and
we need to stuff 40 waves on a CU to completely clean the SGPR. This is
accomplished in the WR by cleaning the SE in two steps, half of the CU per
step.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add check before enabling/disabling broadcast mode
Guchun Chen [Wed, 4 Dec 2019 07:51:16 +0000 (15:51 +0800)]
drm/amdgpu: add check before enabling/disabling broadcast mode

When security violation from new vbios happens, data fabric is
risky to stop working. So prevent the direct access to DF
mmFabricConfigAccessControl from the new vbios and onwards.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: fix r1xx/r2xx register checker for POT textures
Alex Deucher [Tue, 26 Nov 2019 14:41:46 +0000 (09:41 -0500)]
drm/radeon: fix r1xx/r2xx register checker for POT textures

Shift and mask were reversed.  Noticed by chance.

Tested-by: Meelis Roos <mroos@linux.ee>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
4 years agodrm/amd/display: Loading NV10/14 Bounding Box Data Directly From Code
Zhan Liu [Tue, 3 Dec 2019 17:46:01 +0000 (12:46 -0500)]
drm/amd/display: Loading NV10/14 Bounding Box Data Directly From Code

[Why]
NV10/14 has released. Its time to get NV10/14 bounding box
directly from code.

[How]
Retrieve NV10/14 bounding box data directly from code.

Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Contain MMHUB number in mmhub_v9_4_setup_vm_pt_regs()
Yong Zhao [Tue, 3 Dec 2019 04:12:10 +0000 (23:12 -0500)]
drm/amdkfd: Contain MMHUB number in mmhub_v9_4_setup_vm_pt_regs()

Adjust the exposed function prototype so that the caller does not need
to know the MMHUB number.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/powerplay: unify smu send message function
Likun Gao [Mon, 2 Dec 2019 07:04:35 +0000 (15:04 +0800)]
drm/amdgpu/powerplay: unify smu send message function

Drop smu_send_smc_msg function from ASIC specify structure.
Reuse smu_send_smc_msg_with_param function for smu_send_smc_msg.
Set paramer to 0 for smu_send_msg function, otherwise it will send
with previous paramer value (Not a certain value).
Materialize msg type for smu send message function definition.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: load np fw prior before loading the TAs
Hawking Zhang [Mon, 2 Dec 2019 05:44:38 +0000 (13:44 +0800)]
drm/amdgpu: load np fw prior before loading the TAs

Platform TAs will independently toggle DF Cstate.
for instance, get/set topology from xgmi ta. do error
injection from ras ta. In such case, PMFW needs to be
loaded before TAs so that all the subsequent Cstate
calls recieved by PSP FW can be routed to PMFW.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: unload asd in psp hw de-init phase
Hawking Zhang [Mon, 2 Dec 2019 05:37:42 +0000 (13:37 +0800)]
drm/amdgpu: unload asd in psp hw de-init phase

issue unload_ta_cmd to tOS to unload asd driver

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: drop asd shared memory
Hawking Zhang [Mon, 2 Dec 2019 05:16:09 +0000 (13:16 +0800)]
drm/amdgpu: drop asd shared memory

asd shared memory is not needed since drivers doesn't
invoke any further cmd to asd directly after the asd
loading. trust application is the one who needs
to talk to asd after the initialization

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoamd/amdgpu/sriov swSMU disable for sriov
Jack Zhang [Mon, 2 Dec 2019 10:41:36 +0000 (18:41 +0800)]
amd/amdgpu/sriov swSMU disable for sriov

For boards greater than ARCTURUS, and under sriov platform,
swSMU is not supported because smu ip block is commented at
guest driver.

Generally for sriov, initialization of smu is moved to host driver.
Thus, smu sw_init and hw_init will not be executed at guest driver.

Without sw structure being initialized in guest driver, swSMU cannot
declare to be supported.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: remove redundant assignment to variable v_total
Colin Ian King [Mon, 2 Dec 2019 15:47:38 +0000 (15:47 +0000)]
drm/amd/display: remove redundant assignment to variable v_total

The variable v_total is being initialized with a value that is never
read and it is being updated later with a new value.  The initialization
is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Remove unneeded semicolon in display_rq_dlg_calc_21.c
zhengbin [Thu, 28 Nov 2019 02:31:40 +0000 (10:31 +0800)]
drm/amd/display: Remove unneeded semicolon in display_rq_dlg_calc_21.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:1525:144-145: Unneeded semicolon
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:1526:142-143: Unneeded semicolon

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Remove unneeded semicolon in hdcp.c
zhengbin [Thu, 28 Nov 2019 02:31:39 +0000 (10:31 +0800)]
drm/amd/display: Remove unneeded semicolon in hdcp.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/modules/hdcp/hdcp.c:506:2-3: Unneeded semicolon

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Remove unneeded semicolon in bios_parser2.c
zhengbin [Thu, 28 Nov 2019 02:31:38 +0000 (10:31 +0800)]
drm/amd/display: Remove unneeded semicolon in bios_parser2.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c:995:2-3: Unneeded semicolon

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Remove unneeded semicolon in bios_parser.c
zhengbin [Thu, 28 Nov 2019 02:31:37 +0000 (10:31 +0800)]
drm/amd/display: Remove unneeded semicolon in bios_parser.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/bios/bios_parser.c:2192:2-3: Unneeded semicolon

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: Remove unneeded variable 'ret' in amdgpu_smu.c
zhengbin [Wed, 27 Nov 2019 09:33:42 +0000 (17:33 +0800)]
drm/amd/powerplay: Remove unneeded variable 'ret' in amdgpu_smu.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/powerplay/amdgpu_smu.c:1192:5-8: Unneeded variable: "ret". Return "0" on line 1195
drivers/gpu/drm/amd/powerplay/amdgpu_smu.c:1945:5-8: Unneeded variable: "ret". Return "0" on line 1961

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: Remove unneeded variable 'result' in vega12_hwmgr.c
zhengbin [Wed, 27 Nov 2019 09:33:41 +0000 (17:33 +0800)]
drm/amd/powerplay: Remove unneeded variable 'result' in vega12_hwmgr.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c:502:5-11: Unneeded variable: "result". Return "0" on line 515

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: Remove unneeded variable 'ret' in smu7_hwmgr.c
zhengbin [Wed, 27 Nov 2019 09:33:40 +0000 (17:33 +0800)]
drm/amd/powerplay: Remove unneeded variable 'ret' in smu7_hwmgr.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:5188:5-8: Unneeded variable: "ret". Return "0" on line 5196

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: Remove unneeded variable 'result' in vega10_hwmgr.c
zhengbin [Wed, 27 Nov 2019 09:33:39 +0000 (17:33 +0800)]
drm/amd/powerplay: Remove unneeded variable 'result' in vega10_hwmgr.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c:4363:5-11: Unneeded variable: "result". Return "0" on line 4370

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: Remove unneeded variable 'result' in smu10_hwmgr.c
zhengbin [Wed, 27 Nov 2019 09:33:38 +0000 (17:33 +0800)]
drm/amd/powerplay: Remove unneeded variable 'result' in smu10_hwmgr.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c:1154:5-11: Unneeded variable: "result". Return "0" on line 1159

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: fix double assignment to msg_id field
Colin Ian King [Wed, 20 Nov 2019 17:22:42 +0000 (17:22 +0000)]
drm/amd/display: fix double assignment to msg_id field

The msg_id field is being assigned twice. Fix this by replacing the second
assignment with an assignment to msg_size.

Addresses-Coverity: ("Unused value")
Fixes: 11a00965d261 ("drm/amd/display: Add PSP block to verify HDCP2.2 steps")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: re-enable wait in pipelock, but add timeout
Alex Deucher [Fri, 15 Nov 2019 15:02:44 +0000 (10:02 -0500)]
drm/amd/display: re-enable wait in pipelock, but add timeout

Removing this causes hangs in some games, so re-add it, but add
a timeout so we don't hang while switching flip types.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205169
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=112266
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Get NV14 specific ip params as needed
Zhan liu [Mon, 2 Dec 2019 20:12:27 +0000 (15:12 -0500)]
drm/amd/display: Get NV14 specific ip params as needed

[Why]
NV14 is using its own ip params that's different from other
DCN2.0 ASICs.

[How]
Add ASIC revision check to make sure NV14 gets correct
ip params.

Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Adding NV14 IP Parameters
Zhan liu [Mon, 2 Dec 2019 19:54:16 +0000 (14:54 -0500)]
drm/amd/display: Adding NV14 IP Parameters

[Why]
NV14 IP Parameters are missing.

[How]
Add IP Parameters in.

Signed-off-by: Zhan liu <zhan.liu@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/sriov: No need the event 3 and 4 now
Emily Deng [Mon, 2 Dec 2019 17:53:10 +0000 (01:53 +0800)]
drm/amdgpu/sriov: No need the event 3 and 4 now

As will call unload kms when initialize fail, and the unload kms will
send event 3 and 4, so don't need event 3 and 4 in device init.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Added ASIC specific checks in gfxhub V1.1 get XGMI info
John Clements [Mon, 2 Dec 2019 09:57:25 +0000 (17:57 +0800)]
drm/amdgpu: Added ASIC specific checks in gfxhub V1.1 get XGMI info

Added max hive/node info checks per supported ASIC

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Load TA firmware for navi10/12/14
Bhawanpreet Lakha [Fri, 8 Nov 2019 21:57:21 +0000 (16:57 -0500)]
drm/amd/display: Load TA firmware for navi10/12/14

load the ta firmware for navi10/12/14.
This is already being done for raven/picasso and
is needed for supporting hdcp on navi

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: not remove sysfs if not create sysfs
Yintian Tao [Fri, 29 Nov 2019 08:05:55 +0000 (16:05 +0800)]
drm/amdgpu: not remove sysfs if not create sysfs

When load amdgpu failed before create pm_sysfs and ucode_sysfs,
the pm_sysfs and ucode_sysfs should not be removed.
Otherwise, there will be warning call trace just like below.
[   24.836386] [drm] VCE initialized successfully.
[   24.841352] amdgpu 0000:00:07.0: amdgpu_device_ip_init failed
[   25.370383] amdgpu 0000:00:07.0: Fatal error during GPU init
[   25.889575] [drm] amdgpu: finishing device.
[   26.069128] amdgpu 0000:00:07.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[   26.070110] [drm:gfx_v9_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
[   26.200309] [TTM] Finalizing pool allocator
[   26.200314] [TTM] Finalizing DMA pool allocator
[   26.200349] [TTM] Zone  kernel: Used memory at exit: 0 KiB
[   26.200351] [TTM] Zone   dma32: Used memory at exit: 0 KiB
[   26.200353] [drm] amdgpu: ttm finalized
[   26.205329] ------------[ cut here ]------------
[   26.205330] sysfs group 'fw_version' not found for kobject '0000:00:07.0'
[   26.205347] WARNING: CPU: 0 PID: 1228 at fs/sysfs/group.c:256 sysfs_remove_group+0x80/0x90
[   26.205348] Modules linked in: amdgpu(OE+) gpu_sched(OE) ttm(OE) drm_kms_helper(OE) drm(OE) i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache binfmt_misc snd_hda_codec_generic ledtrig_audio crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hda_core snd_hwdep snd_pcm snd_timer input_leds snd joydev soundcore serio_raw pcspkr evbug aesni_intel aes_x86_64 crypto_simd cryptd mac_hid glue_helper sunrpc ip_tables x_tables autofs4 8139too psmouse 8139cp mii i2c_piix4 pata_acpi floppy
[   26.205369] CPU: 0 PID: 1228 Comm: modprobe Tainted: G           OE     5.2.0-rc1 #1
[   26.205370] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   26.205372] RIP: 0010:sysfs_remove_group+0x80/0x90
[   26.205374] Code: e8 35 b9 ff ff 5b 41 5c 41 5d 5d c3 48 89 df e8 f6 b5 ff ff eb c6 49 8b 55 00 49 8b 34 24 48 c7 c7 48 7a 70 98 e8 60 63 d3 ff <0f> 0b eb d7 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55
[   26.205375] RSP: 0018:ffffbee242b0b908 EFLAGS: 00010282
[   26.205376] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
[   26.205377] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff97ad6f817380
[   26.205377] RBP: ffffbee242b0b920 R08: ffffffff98f520c4 R09: 00000000000002b3
[   26.205378] R10: ffffbee242b0b8f8 R11: 00000000000002b3 R12: ffffffffc0e58240
[   26.205379] R13: ffff97ad6d1fe0b0 R14: ffff97ad4db954c8 R15: ffff97ad4db7fff0
[   26.205380] FS:  00007ff3d8a1c4c0(0000) GS:ffff97ad6f800000(0000) knlGS:0000000000000000
[   26.205381] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   26.205381] CR2: 00007f9b2ef1df04 CR3: 000000042aab8001 CR4: 00000000003606f0
[   26.205384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   26.205385] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   26.205385] Call Trace:
[   26.205461]  amdgpu_ucode_sysfs_fini+0x18/0x20 [amdgpu]
[   26.205518]  amdgpu_device_fini+0x3b4/0x560 [amdgpu]
[   26.205573]  amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu]
[   26.205623]  amdgpu_driver_load_kms+0xcd/0x250 [amdgpu]
[   26.205637]  drm_dev_register+0x12b/0x1c0 [drm]
[   26.205695]  amdgpu_pci_probe+0x12a/0x1e0 [amdgpu]
[   26.205699]  local_pci_probe+0x47/0xa0
[   26.205701]  pci_device_probe+0x106/0x1b0
[   26.205704]  really_probe+0x21a/0x3f0
[   26.205706]  driver_probe_device+0x11c/0x140
[   26.205707]  device_driver_attach+0x58/0x60
[   26.205709]  __driver_attach+0xc3/0x140

Signed-off-by: Yintian Tao <yttao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: fix GFX10 missing CSIB set(v3)
Monk Liu [Tue, 26 Nov 2019 11:33:38 +0000 (19:33 +0800)]
drm/amdgpu: fix GFX10 missing CSIB set(v3)

still need to init csb even for SRIOV

v2:
drop init_pg() for gfx10 at all since
PG and GFX off feature will be fully controled
by RLC and SMU fw for gfx10

v3:
drop the flush_gpu_tlb lines since we consider
it is only usefull in emulation

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: should stop GFX ring in hw_fini
Monk Liu [Fri, 29 Nov 2019 08:20:51 +0000 (16:20 +0800)]
drm/amdgpu: should stop GFX ring in hw_fini

To align with the scheme from gfx9

disabling GFX ring after VM shutdown could avoid
garbage data be fetched to GFX RB which may lead
to unnecessary screw up on GFX

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: do autoload right after MEC loaded for SRIOV VF
Monk Liu [Tue, 26 Nov 2019 11:38:22 +0000 (19:38 +0800)]
drm/amdgpu: do autoload right after MEC loaded for SRIOV VF

since we don't have RLCG ucode loading and no SRlist as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: skip rlc ucode loading for SRIOV gfx10
Monk Liu [Tue, 26 Nov 2019 11:36:29 +0000 (19:36 +0800)]
drm/amdgpu: skip rlc ucode loading for SRIOV gfx10

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: fix calltrace during kmd unload(v3)
Monk Liu [Tue, 26 Nov 2019 11:42:25 +0000 (19:42 +0800)]
drm/amdgpu: fix calltrace during kmd unload(v3)

issue:
kernel would report a warning from a double unpin
during the driver unloading on the CSB bo

why:
we unpin it during hw_fini, and there will be another
unpin in sw_fini on CSB bo.

fix:
actually we don't need to pin/unpin it during
hw_init/fini since it is created with kernel pinned,
we only need to fullfill the CSB again during hw_init
to prevent CSB/VRAM lost after S3

v2:
get_csb in init_rlc so hw_init() will make CSIB content
back even after reset or s3

v3:
use bo_create_kernel instead of bo_create_reserved for CSB
otherwise the bo_free_kernel() on CSB is not aligned and
would lead to its internal reserve pending there forever

take care of gfx7/8 as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Drop AMD_EDID_UTILITY defines
Harry Wentland [Thu, 28 Nov 2019 16:30:10 +0000 (11:30 -0500)]
drm/amd/display: Drop AMD_EDID_UTILITY defines

We don't use this upstream in the Linux kernel.

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Include num_vmid and num_dsc within NV14's resource caps
Zhan Liu [Thu, 28 Nov 2019 19:12:11 +0000 (14:12 -0500)]
drm/amd/display: Include num_vmid and num_dsc within NV14's resource caps

[Why]
"num_vmid" and "num_dsc" are missing within NV14's resource caps structure.

[How]
Add the missing parts.

Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: use CPU to flush vmhub if sched stopped
Monk Liu [Tue, 26 Nov 2019 11:40:08 +0000 (19:40 +0800)]
drm/amdgpu: use CPU to flush vmhub if sched stopped

otherwse the flush_gpu_tlb will hang if we unload the
KMD becuase the schedulers already stopped

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx: Increase dispatch packet number
James Zhu [Tue, 26 Nov 2019 19:27:46 +0000 (14:27 -0500)]
drm/amdgpu/gfx: Increase dispatch packet number

For Arcturus, increase dispatch packet number to stress scheduler.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx: Clear more EDC cnt
James Zhu [Tue, 26 Nov 2019 19:23:10 +0000 (14:23 -0500)]
drm/amdgpu/gfx: Clear more EDC cnt

Clear SDMA and HDP EDC counter in GPR workarounds.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx10: remove outdated comments
Xiaojie Yuan [Wed, 6 Nov 2019 13:08:06 +0000 (21:08 +0800)]
drm/amdgpu/gfx10: remove outdated comments

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx10: unlock srbm_mutex after queue programming finish
Xiaojie Yuan [Wed, 6 Nov 2019 13:10:20 +0000 (21:10 +0800)]
drm/amdgpu/gfx10: unlock srbm_mutex after queue programming finish

srbm_mutex is to guarantee atomicity for r/w of gfx indexed registers

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm: radeon: replace 0 with NULL
Jules Irenge [Tue, 26 Nov 2019 00:35:14 +0000 (00:35 +0000)]
drm: radeon: replace 0 with NULL

Replace 0 with NULL to fix sparse tool  warning
 warning: Using plain integer as NULL pointer

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Fix a bug in jpeg_v1_0_start()
Dan Carpenter [Tue, 26 Nov 2019 12:10:29 +0000 (15:10 +0300)]
drm/amdgpu: Fix a bug in jpeg_v1_0_start()

Originally the last WREG32_SOC15() was a part of the if statement block
but the curly braces are on the wrong line.

Fixes: bb0db70f3f75 ("drm/amdgpu: separate JPEG1.0 code out from VCN1.0")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: flag vram lost on baco reset for VI/CIK
Alex Deucher [Mon, 25 Nov 2019 16:11:18 +0000 (11:11 -0500)]
drm/amdgpu: flag vram lost on baco reset for VI/CIK

VI/CIK BACO was inflight when this fix landed for SOC15/NV.
Add the fix to VI/CIK as well.

Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: move pci handling out of pm ops
Alex Deucher [Wed, 20 Nov 2019 22:31:11 +0000 (17:31 -0500)]
drm/amdgpu: move pci handling out of pm ops

The documentation says the that PCI core handles this
for you unless you choose to implement it.  Just rely
on the PCI core to handle the pci specific bits.

Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Modify comments to match the code
Zhan liu [Mon, 25 Nov 2019 22:25:18 +0000 (17:25 -0500)]
drm/amd/display: Modify comments to match the code

[Why]
This line of code was modified. However, comments
remained unchanged. As a result, comments and code are
mismatching.

[How]
Modifying comments to reflect code. At the same time,
explaining why the value was changed from 200ms to
3000ms.

Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: apply gpr/gds workaround before enabling GFX EDC mode
Hawking Zhang [Wed, 20 Nov 2019 11:21:35 +0000 (19:21 +0800)]
drm/amdgpu: apply gpr/gds workaround before enabling GFX EDC mode

gfx memory should be initialized before enabling
DED and FUE field in mmGB_EDC_MODE

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/amdgpu/sriov skip jpeg ip block for ARCTURUS VF
Jack Zhang [Tue, 26 Nov 2019 06:47:29 +0000 (14:47 +0800)]
drm/amd/amdgpu/sriov skip jpeg ip block for ARCTURUS VF

Currently ARCTURUS VF doesn't support jpeg ip block.
Skip jpeg ip block in case guest driver load fail.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Zhexi Zhang <zhexi.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Optimize KFD page table reservation
Felix Kuehling [Mon, 15 Jul 2019 20:18:03 +0000 (16:18 -0400)]
drm/amdgpu: Optimize KFD page table reservation

Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.

Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU:  1GB
New page table reservation per GPU: 32MB

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Raise KFD unpinned system memory limit
Felix Kuehling [Mon, 25 Nov 2019 21:25:35 +0000 (16:25 -0500)]
drm/amdgpu: Raise KFD unpinned system memory limit

Allow KFD applications to use more unpinned system memory through
HMM.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Null check aconnector in event_property_validate
Bhawanpreet Lakha [Mon, 25 Nov 2019 15:34:13 +0000 (10:34 -0500)]
drm/amd/display: Null check aconnector in event_property_validate

[Why]
previously event_property_validate was only called after we enabled the display.
But after "Refactor HDCP to handle multiple displays per link" this function
can be called at any time. In certain cases we don't have a aconnector

[How]
Null check aconnector and exit early. This is ok because we only need to check the
ENABLED->DESIRED transition if a connector exists.

Fixes: b1abe5586ffc ("drm/amd/display: Refactor HDCP to handle multiple displays per link")
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: remove set but not used variable 'stretch_amount2'
YueHaibing [Mon, 25 Nov 2019 14:58:43 +0000 (22:58 +0800)]
drm/amd/powerplay: remove set but not used variable 'stretch_amount2'

drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/vegam_smumgr.c:
 In function vegam_populate_clock_stretcher_data_table:
drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/vegam_smumgr.c:1489:29:
 warning: variable stretch_amount2 set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: remove set but not used variable 'msg_out'
YueHaibing [Mon, 25 Nov 2019 14:54:45 +0000 (22:54 +0800)]
drm/amd/display: remove set but not used variable 'msg_out'

drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.c: In function mod_hdcp_hdcp2_enable_encryption:
drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.c:633:77: warning: variable msg_out set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.c: In function mod_hdcp_hdcp2_enable_dp_stream_encryption:
drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.c:710:77: warning: variable msg_out set but not used [-Wunused-but-set-variable]

It is never used, so remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoamdgpu: Enable KFD on POWER systems
Timothy Pearson [Sun, 24 Nov 2019 19:15:16 +0000 (13:15 -0600)]
amdgpu: Enable KFD on POWER systems

KFD has been verified to function on POWER systems (Talos II / Vega 64).
It should be available as a kernel configuration option on these systems.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: remove redundant assignment to variable ret
Colin Ian King [Fri, 22 Nov 2019 23:15:04 +0000 (23:15 +0000)]
drm/radeon: remove redundant assignment to variable ret

The variable ret is being initialized with a value that is never
read and it is being updated later with a new value. The
initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREG
Nathan Chancellor [Sat, 23 Nov 2019 19:23:36 +0000 (12:23 -0700)]
drm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREG

Commit b0f3cd3191cd ("drm/amdgpu: remove unnecessary JPEG2.0 code from
VCN2.0") introduced a new clang warning in the vcn_v2_0_stop function:

../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: warning: variable 'r'
is used uninitialized whenever 'while' loop exits because its condition
is false [-Wsometimes-uninitialized]
        SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note:
expanded from macro 'SOC15_WAIT_ON_RREG'
                while ((tmp_ & (mask)) != (expected_value)) {   \
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1083:6: note: uninitialized use
occurs here
        if (r)
            ^
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: note: remove the
condition if it is always true
        SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r);
        ^
../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note:
expanded from macro 'SOC15_WAIT_ON_RREG'
                while ((tmp_ & (mask)) != (expected_value)) {   \
                       ^
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1072:7: note: initialize the
variable 'r' to silence this warning
        int r;
             ^
              = 0
1 warning generated.

To prevent warnings like this from happening in the future, make the
SOC15_WAIT_ON_RREG macro initialize its ret variable before the while
loop that can time out. This macro's return value is always checked so
it should set ret in both the success and fail path.

Link: https://github.com/ClangBuiltLinux/linux/issues/776
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Use NULL for pointer assignment in copy_stream_update_to_stream
Nathan Chancellor [Sat, 23 Nov 2019 19:36:39 +0000 (12:36 -0700)]
drm/amd/display: Use NULL for pointer assignment in copy_stream_update_to_stream

Clang warns:

../drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:1965:26: warning:
expression which evaluates to zero treated as a null pointer constant of
type 'struct dc_dsc_config *' [-Wnon-literal-null-conversion]
                                update->dsc_config = false;
                                                     ^~~~~
../drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:1971:25: warning:
expression which evaluates to zero treated as a null pointer constant of
type 'struct dc_dsc_config *' [-Wnon-literal-null-conversion]
                        update->dsc_config = false;
                                             ^~~~~
2 warnings generated.

Fixes: f6fe4053b91f ("drm/amd/display: Use a temporary copy of the current state when updating DSC config")
Link: https://github.com/ClangBuiltLinux/linux/issues/777
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: remove redundant assignment to variables HiSidd and LoSidd
Colin Ian King [Fri, 22 Nov 2019 23:04:07 +0000 (23:04 +0000)]
drm/amd/powerplay: remove redundant assignment to variables HiSidd and LoSidd

The variables HiSidd and LoSidd are being initialized with values that
are never read and are being updated a little later with a new value.
The initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoMAINTAINERS: Drop Rex Zhu for amdgpu powerplay
Alex Deucher [Fri, 22 Nov 2019 19:16:55 +0000 (14:16 -0500)]
MAINTAINERS: Drop Rex Zhu for amdgpu powerplay

No longer works on the driver.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd: Fix Kconfig indentation
Krzysztof Kozlowski [Thu, 21 Nov 2019 13:29:30 +0000 (21:29 +0800)]
drm/amd: Fix Kconfig indentation

Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
$ sed -e 's/^        /\t/' -i */Kconfig

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Resolved offchip EEPROM I/O issue
John Clements [Mon, 25 Nov 2019 10:24:17 +0000 (18:24 +0800)]
drm/amdgpu: Resolved offchip EEPROM I/O issue

Updated target I2C address

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Apply noretry setting for mmhub9.4
Oak Zeng [Fri, 22 Nov 2019 20:15:43 +0000 (14:15 -0600)]
drm/amdgpu: Apply noretry setting for mmhub9.4

Config the translation retry behavior according to noretry
kernel parameter

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: add default clocks if not able to fetch them
Alex Deucher [Tue, 19 Nov 2019 20:54:17 +0000 (15:54 -0500)]
drm/amd/display: add default clocks if not able to fetch them

dm_pp_get_clock_levels_by_type needs to add the default clocks
to the powerplay case as well.  This was accidently dropped.

Fixes: b3ea88fef321de ("drm/amd/powerplay: add get_clock_by_type interface for display")
Bug: https://gitlab.freedesktop.org/drm/amd/issues/906
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Use ARRAY_SIZE for sos_old_versions
zhengbin [Fri, 22 Nov 2019 03:42:52 +0000 (11:42 +0800)]
drm/amdgpu: Use ARRAY_SIZE for sos_old_versions

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/psp_v3_1.c:182:40-41: WARNING: Use ARRAY_SIZE

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: Use ARRAY_SIZE for smu7_profiling
zhengbin [Fri, 22 Nov 2019 03:42:51 +0000 (11:42 +0800)]
drm/amd/powerplay: Use ARRAY_SIZE for smu7_profiling

Fixes coccicheck warning:

drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:4946:28-29: WARNING: Use ARRAY_SIZE

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Remove duplicate functions update_mqd_hiq()
Yong Zhao [Sat, 9 Nov 2019 06:16:05 +0000 (01:16 -0500)]
drm/amdkfd: Remove duplicate functions update_mqd_hiq()

The functions are the same as update_mqd().

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10
changzhu [Tue, 19 Nov 2019 03:13:29 +0000 (11:13 +0800)]
drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10

It may lose gpuvm invalidate acknowldege state across power-gating off
cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire
before invalidation and semaphore release after invalidation.

After adding semaphore acquire before invalidation, the semaphore
register become read-only if another process try to acquire semaphore.
Then it will not be able to release this semaphore. Then it may cause
deadlock problem. If this deadlock problem happens, it needs a semaphore
firmware fix.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub
changzhu [Tue, 19 Nov 2019 02:18:39 +0000 (10:18 +0800)]
drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub

SW must acquire/release one of the vm_invalidate_eng*_sem around the
invalidation req/ack. Through this way,it can avoid losing invalidate
acknowledge state across power-gating off cycle.
To use vm_invalidate_eng*_sem, it needs to initialize
vm_invalidate_eng*_sem firstly.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.
Jack Zhang [Thu, 21 Nov 2019 06:09:08 +0000 (14:09 +0800)]
drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.

After rlcg fw 2.1, kmd driver starts to load extra fw for
LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw
because all rlcg related fw have been loaded by host driver.
Guest driver would load the three fw fail without this change.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF
Jack Zhang [Thu, 21 Nov 2019 05:59:28 +0000 (13:59 +0800)]
drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF

Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF
Currently the three features haven't been enabled at SRIOV, it would
trigger guest driver load fail with the bare-metal path of the three
features.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx10: re-init clear state buffer after gpu reset
Xiaojie Yuan [Wed, 20 Nov 2019 06:02:22 +0000 (14:02 +0800)]
drm/amdgpu/gfx10: re-init clear state buffer after gpu reset

This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x.

clear state buffer (resides in vram) is corrupted after 1st baco reset,
upon gfxoff exit, CPF gets garbage header in CSIB and hangs.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agomerge fix for "ftrace: Rework event_create_dir()"
Stephen Rothwell [Thu, 21 Nov 2019 03:54:03 +0000 (14:54 +1100)]
merge fix for "ftrace: Rework event_create_dir()"

Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: remove redundant assignment to pointer write_frame
Colin Ian King [Thu, 21 Nov 2019 16:54:01 +0000 (16:54 +0000)]
drm/amdgpu: remove redundant assignment to pointer write_frame

The pointer write_frame is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: simplify runtime suspend
Alex Deucher [Wed, 20 Nov 2019 19:20:32 +0000 (14:20 -0500)]
drm/amdgpu: simplify runtime suspend

In the standard _PR3 case, the pci core handles the pci state.
The driver only needs to handle it in the legacy ATPX case.

This may fix issues with runtime suspend/resume on certain
hybrid graphics laptops.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: DIQ should not use HIQ way to allocate memory
Yong Zhao [Sat, 9 Nov 2019 05:47:31 +0000 (00:47 -0500)]
drm/amdkfd: DIQ should not use HIQ way to allocate memory

In the mqd_diq_sdma buffer, there should be only one HIQ mqd. All DIQs
should be allocated somewhere else using the regular way.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Delete KFD_MQD_TYPE_COMPUTE
Yong Zhao [Sat, 9 Nov 2019 04:57:37 +0000 (23:57 -0500)]
drm/amdkfd: Delete KFD_MQD_TYPE_COMPUTE

It is the same as KFD_MQD_TYPE_CP, so delete it. As a result, we will
have one less mqd mananger per device.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Update Arcturus golden registers
Jay Cornwall [Wed, 20 Nov 2019 16:32:46 +0000 (16:32 +0000)]
drm/amdgpu: Update Arcturus golden registers

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: implement querying ras error count for mmhub9.4
Dennis Li [Tue, 19 Nov 2019 06:02:57 +0000 (14:02 +0800)]
drm/amdgpu: implement querying ras error count for mmhub9.4

Get mmhub error counter by accessing EDC_CNT registers.

v2: Add mmhub_v9_4_ prefix for local static variable and function

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: refine query function of mmhub EDC counter in vg20
Dennis Li [Tue, 19 Nov 2019 08:02:28 +0000 (16:02 +0800)]
drm/amdgpu: refine query function of mmhub EDC counter in vg20

Add codes to print the detail EDC info for the subblock of mmhub

v2: Move the EDC_CNT registers' defintion from mmhub_9_4 header
files to mmhub_1_0 ones. Add mmhub_v1_0_ prefix for the local
static variable and function.

v3: squash in DC fix

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: define soc15_ras_field_entry for reuse
Dennis Li [Tue, 19 Nov 2019 08:25:25 +0000 (16:25 +0800)]
drm/amdgpu: define soc15_ras_field_entry for reuse

The struct soc15_ras_field_entry will be reused by
other IPs, such as mmhub and gc

v2: rename ras_subblock_regs to gc_ras_fields_vg20,
because the future asic maybe have a different table.

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx10: fix out-of-bound mqd_backup array access
Xiaojie Yuan [Fri, 22 Nov 2019 19:21:15 +0000 (14:21 -0500)]
drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access

Fixes: 0bb419c76b3150 ("drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)")
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt
Xiaojie Yuan [Thu, 14 Nov 2019 08:56:08 +0000 (16:56 +0800)]
drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt

50us is not enough to wait for cp ready after gpu reset on some navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoamd/amdgpu: force to trigger a no-retry-fault after a retry-fault
Alex Sierra [Mon, 18 Nov 2019 21:33:07 +0000 (15:33 -0600)]
amd/amdgpu: force to trigger a no-retry-fault after a retry-fault

Only for the debugger use case.

[why]
Avoid endless translation retries, after an invalid address access has
been issued to the GPU. Instead, the trap handler is forced to enter by
generating a no-retry-fault.
A s_trap instruction is inserted in the debugger case to let the wave to
enter trap handler to save context.

[how]
Intentionally using an invalid flag combination (F and P set at the same
time) to trigger a no-retry-fault, after a retry-fault happens. This is
only valid under compute context.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add flag to indicate amdgpu vm context
Alex Sierra [Mon, 18 Nov 2019 19:28:46 +0000 (13:28 -0600)]
drm/amdgpu: add flag to indicate amdgpu vm context

Flag added to indicate if the amdgpu vm context is used for compute or
graphics.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: enable runtime pm on BACO capable boards if runpm=1
Alex Deucher [Fri, 4 Oct 2019 18:47:39 +0000 (13:47 -0500)]
drm/amdgpu: enable runtime pm on BACO capable boards if runpm=1

BACO - Bus Active, Chip Off

Everything is in place now.  Not enabled by default yet.  You
still have to specify runpm=1.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: disentangle runtime pm and vga_switcheroo
Alex Deucher [Fri, 4 Oct 2019 18:25:37 +0000 (13:25 -0500)]
drm/amdgpu: disentangle runtime pm and vga_switcheroo

Originally we only supported runtime pm on PX/HG laptops
so vga_switcheroo and runtime pm are sort of entangled.

Attempt to logically separate them.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: start to disentangle boco from runtime pm
Alex Deucher [Thu, 7 Nov 2019 23:13:14 +0000 (18:13 -0500)]
drm/amdgpu: start to disentangle boco from runtime pm

BACO - Bus Active, Chip Off
BOCO - Bus Off, Chip Off

We originally only supported runtime pm on PX/HG
laptops so most of the runtime pm code looks for this.
Add a new flag to check for runtime pm enablement and
use this rather than checking for PX/HG.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add baco support to runtime suspend/resume
Alex Deucher [Fri, 4 Oct 2019 17:54:12 +0000 (12:54 -0500)]
drm/amdgpu: add baco support to runtime suspend/resume

BACO - Bus Active, Chip Off

This adds the necessary support to the runtime suspend
and resume functions to handle boards that support
baco.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add helpers for baco entry and exit
Alex Deucher [Fri, 4 Oct 2019 17:33:09 +0000 (12:33 -0500)]
drm/amdgpu: add helpers for baco entry and exit

BACO - Bus Active, Chip Off

Will be used for runtime pm.  Entry will enter the BACO
state (chip off).  Exit will exit the BACO state (chip on).

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: split swSMU baco_reset into enter and exit
Alex Deucher [Mon, 28 Oct 2019 19:20:03 +0000 (15:20 -0400)]
drm/amdgpu: split swSMU baco_reset into enter and exit

BACO - Bus Active, Chip Off

So we can use it for power savings rather than just reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>