Dingchen (David) Zhang [Mon, 25 Jan 2021 23:05:50 +0000 (18:05 -0500)]
drm/amd/display: add handling for hdcp2 rx id list validation
[why]
the current implementation of hdcp2 rx id list validation does not
have handler/checker for invalid message status, e.g. HMAC, the V
parameter calculated from PSP not matching the V prime from Rx.
[how]
return a generic FAILURE for any message status not SUCCESS or
REVOKED.
Signed-off-by: Dingchen (David) Zhang <dingchen.zhang@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dingchen (David) Zhang [Fri, 8 Jan 2021 22:32:47 +0000 (17:32 -0500)]
drm/amd/display: update hdcp display using correct CP type.
[why]
currently we enforce to update hdcp display using TYPE0, but there
is case that connector CP type prop be TYPE1 instead of type0.
[how]
using the drm prop of CP type of the connector as input argument.
Signed-off-by: Dingchen (David) Zhang <dingchen.zhang@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Wang [Mon, 5 Apr 2021 21:13:58 +0000 (17:13 -0400)]
drm/amd/display: Add DSC check to seamless boot validation
[Why & How]
We want to immediately fail seamless boot validation if DSC is active,
as VBIOS currently does not support DSC timings. Add a check for
the relevant flag in dc_validate_seamless_boot_timing.
Signed-off-by: Anthony Wang <anthony1.wang@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Robin Singh [Tue, 15 Dec 2020 00:14:48 +0000 (19:14 -0500)]
drm/amd/display: fixed divide by zero kernel crash during dsc enablement
[why]
During dsc enable, a divide by zero condition triggered the
kernel crash.
[how]
An IGT test, which enable the DSC, was crashing at the time of
restore the default dsc status, becaue of h_totals value
becoming 0. So add a check before divide condition. If h_total
is zero, gracefully ignore and set the default value.
kernel panic log:
[ 128.758827] divide error: 0000 [#1] PREEMPT SMP NOPTI
[ 128.762714] CPU: 5 PID: 4562 Comm: amd_dp_dsc Tainted: G W 5.4.19-android-x86_64 #1
[ 128.769728] Hardware name: ADVANCED MICRO DEVICES, INC. Mauna/Mauna, BIOS WMN0B13N Nov 11 2020
[ 128.777695] RIP: 0010:hubp2_vready_at_or_After_vsync+0x37/0x7a [amdgpu]
[ 128.785707] Code: 80 02 00 00 48 89 f3 48 8b 7f 08 b ......
[ 128.805696] RSP: 0018:
ffffad8f82d43628 EFLAGS:
00010246
......
[ 128.857707] CR2:
00007106d8465000 CR3:
0000000426530000 CR4:
0000000000140ee0
[ 128.865695] Call Trace:
[ 128.869712] hubp3_setup+0x1f/0x7f [amdgpu]
[ 128.873705] dcn20_update_dchubp_dpp+0xc8/0x54a [amdgpu]
[ 128.877706] dcn20_program_front_end_for_ctx+0x31d/0x463 [amdgpu]
[ 128.885706] dc_commit_state+0x3d2/0x658 [amdgpu]
[ 128.889707] amdgpu_dm_atomic_commit_tail+0x4b3/0x1e7c [amdgpu]
[ 128.897699] ? dm_read_reg_func+0x41/0xb5 [amdgpu]
[ 128.901707] ? dm_read_reg_func+0x41/0xb5 [amdgpu]
[ 128.905706] ? __is_insn_slot_addr+0x43/0x48
[ 128.909706] ? fill_plane_buffer_attributes+0x29e/0x3dc [amdgpu]
[ 128.917705] ? dm_plane_helper_prepare_fb+0x255/0x284 [amdgpu]
[ 128.921700] ? usleep_range+0x7c/0x7c
[ 128.925705] ? preempt_count_sub+0xf/0x18
[ 128.929706] ? _raw_spin_unlock_irq+0x13/0x24
[ 128.933732] ? __wait_for_common+0x11e/0x18f
[ 128.937705] ? _raw_spin_unlock_irq+0x13/0x24
[ 128.941706] ? __wait_for_common+0x11e/0x18f
[ 128.945705] commit_tail+0x8b/0xd2 [drm_kms_helper]
[ 128.949707] drm_atomic_helper_commit+0xd8/0xf5 [drm_kms_helper]
[ 128.957706] amdgpu_dm_atomic_commit+0x337/0x360 [amdgpu]
[ 128.961705] ? drm_atomic_check_only+0x543/0x68d [drm]
[ 128.969705] ? drm_atomic_set_property+0x760/0x7af [drm]
[ 128.973704] ? drm_mode_atomic_ioctl+0x6f3/0x85a [drm]
[ 128.977705] drm_mode_atomic_ioctl+0x6f3/0x85a [drm]
[ 128.985705] ? drm_atomic_set_property+0x7af/0x7af [drm]
[ 128.989706] drm_ioctl_kernel+0x82/0xda [drm]
[ 128.993706] drm_ioctl+0x225/0x319 [drm]
[ 128.997707] ? drm_atomic_set_property+0x7af/0x7af [drm]
[ 129.001706] ? preempt_count_sub+0xf/0x18
[ 129.005713] amdgpu_drm_ioctl+0x4b/0x76 [amdgpu]
[ 129.009705] vfs_ioctl+0x1d/0x2a
[ 129.013705] do_vfs_ioctl+0x419/0x43d
[ 129.017707] ksys_ioctl+0x52/0x71
[ 129.021707] __x64_sys_ioctl+0x16/0x19
[ 129.025706] do_syscall_64+0x78/0x85
[ 129.029705] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Robin Singh <robin.singh@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Reviewed-by: Robin Singh <Robin.Singh@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jiansong Chen [Mon, 19 Apr 2021 08:33:22 +0000 (16:33 +0800)]
drm/amdgpu: fix GCR_GENERAL_CNTL offset for dimgrey_cavefish
dimgrey_cavefish has similar gc_10_3 ip with sienna_cichlid,
so follow its registers offset setting.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
John Clements [Mon, 19 Apr 2021 03:23:07 +0000 (11:23 +0800)]
drm/amdgpu: resolve erroneous gfx_v9_4_2 prints
resolve bug on aldebaran where gfx error counts will
print on driver load when there are no errors present
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Fri, 16 Apr 2021 14:41:11 +0000 (22:41 +0800)]
drm/amdgpu: fix a error injection failed issue
because "sscanf(str, "retire_page")" always return 0, if application use
the raw data for error injection, it always wrongly falls into "op ==
3". Change to use strstr instead.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hawking Zhang [Fri, 16 Apr 2021 09:30:12 +0000 (17:30 +0800)]
drm/amdgpu: only harvest gcea/mmea error status in aldebaran
In aldebaran, driver only needs to harvest SDP
RdRspStatus, WrRspStatus and first parity error
on RdRsp data. Check error type before harvest
error information.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hawking Zhang [Fri, 16 Apr 2021 09:34:13 +0000 (17:34 +0800)]
drm/amdgpu: only harvest gcea/mmea error status in arcturus
SDP RdRspStatus/WrRspStatus or first parity error on
RdRsp data can cause system fatal error in arcturus.
GPU will be freezed in such case.
Driver needs to harvest these error information before
reset the GPU. Check error type to avoid harvest normal
gcea/mmea information.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Wed, 14 Apr 2021 10:45:54 +0000 (18:45 +0800)]
drm/amdgpu: enable tmz on renoir asics
The tmz functions are verified on renoir chips as well. So enable it by
default.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hawking Zhang [Fri, 16 Apr 2021 06:44:27 +0000 (14:44 +0800)]
drm/amdgpu: correct default gfx wdt timeout setting
When gfx wdt was configured to fatal_disable, the
timeout period should be configured to 0x0 (timeout
disabled)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Wed, 14 Apr 2021 11:00:34 +0000 (19:00 +0800)]
drm/amdkfd: add edc error interrupt handle for poison propogate mode
In poison progogate mode, when driver receive the edc error interrupt
from SQ, driver should kill the process by pasid which is using the
poison data, and then trigger GPU reset.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Li [Thu, 15 Apr 2021 09:30:20 +0000 (17:30 +0800)]
drm/radeon/si: Fix inconsistent indenting
Kernel test robot throws below warning ->
smatch warnings:
drivers/gpu/drm/radeon/si.c:4514 si_vm_packet3_cp_dma_check() warn:
inconsistent indenting
Fixed the inconsistent indenting.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dan Carpenter [Wed, 14 Apr 2021 05:59:22 +0000 (08:59 +0300)]
drm/amd/pm: fix error code in smu_set_power_limit()
We should return -EINVAL instead of success if the "limit" is too high.
Fixes:
e098bc9612c2 ("drm/amd/pm: optimize the power related source code layout")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dan Carpenter [Wed, 14 Apr 2021 05:58:55 +0000 (08:58 +0300)]
drm/amdgpu: fix an error code in init_pmu_entry_by_type_and_add()
If the kmemdup() fails then this should return a negative error code
but it currently returns success
Fixes:
b4a7db71ea06 ("drm/amdgpu: add per device user friendly xgmi events for vega20")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tian Tao [Tue, 13 Apr 2021 03:26:19 +0000 (11:26 +0800)]
drm/radeon/cik: remove set but not used variables
The value of pipe_id and queue_id are not used under certain
circumstances, so just delete.
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Simon Ser [Fri, 26 Mar 2021 16:59:44 +0000 (17:59 +0100)]
amd/display: allow non-linear multi-planar formats
Accept non-linear buffers which use a multi-planar format, as long
as they don't use DCC.
Tested on GFX9 with NV12.
Signed-off-by: Simon Ser <contact@emersion.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Thu, 15 Apr 2021 17:48:12 +0000 (23:18 +0530)]
drm/amdgpu/dm: Fix NULL pointer crash during DP MST hotplug
This patch checks the return value of the function
dc_link_add_remote_sink before using it. This was causing
a crash during consecutive hotplugs of DP MST displays.
Cc: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Qingqing Zhuo [Wed, 14 Apr 2021 23:14:14 +0000 (19:14 -0400)]
Revert "Revert "drm/amdgpu: Ensure that the modifier requested is supported by plane.""
This reverts commit
55fa622fe635bfc3f2587d784f6facc30f8fdf12.
The regression caused by the original patch has been
cleared, thus introduce back the change.
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Qingqing Zhuo [Wed, 14 Apr 2021 23:00:01 +0000 (19:00 -0400)]
drm/amd/display: Update modifier list for gfx10_3
[Why]
Current list supports modifiers that have DCC_MAX_COMPRESSED_BLOCK
set to AMD_FMT_MOD_DCC_BLOCK_128B, while AMD_FMT_MOD_DCC_BLOCK_64B
is used instead by userspace.
[How]
Replace AMD_FMT_MOD_DCC_BLOCK_128B with AMD_FMT_MOD_DCC_BLOCK_64B
for modifiers with DCC supported.
Fixes:
faa37f54ce0462 ("drm/amd/display: Expose modifiers")
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiaojian Du [Wed, 14 Apr 2021 06:55:27 +0000 (14:55 +0800)]
drm/amd/pm: revise two names of sensor values for vangogh
This patch is to revise two names of sensor values for vangogh.
New smu metrics table is supported by new pmfw
(from version 4.63.36.00 ), it includes two parts, one part is
the current smu metrics table data and the other part is the
average smu metrics table data. The hwmon will read the current gfxclk
and mclk from the current smu metrics table data.
Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiaojian Du [Tue, 13 Apr 2021 07:52:17 +0000 (15:52 +0800)]
drm/amd/pm: remove the "set" function of pp_dpm_mclk for vangogh
This patch is to remove the "set" function of pp_dpm_mclk for vangogh.
For vangogh, mclk bonds with fclk, they will lock each other
on the same perfomance level. But according to the smu message from pmfw,
only fclk is allowed to set value manually, so remove the unnecessary
code of "set" function for mclk.
Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Joseph Greathouse [Thu, 15 Apr 2021 06:02:46 +0000 (14:02 +0800)]
drm/amdgpu: Copy MEC FW version to MEC2 if we skipped loading MEC2
If we skipped loading MEC2 firmware separately
from MEC, then MEC2 will be running the same
firmware image. Copy the MEC version and feature
numbers into MEC2 version and feature numbers.
This is needed for things like GWS support, where
we rely on knowing what version of firmware is
running on MEC2. Leaving these MEC2 entries blank
breaks our ability to version-check enables and
workarounds.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiaojian Du [Tue, 13 Apr 2021 07:08:41 +0000 (15:08 +0800)]
drm/amd/pm: add the callback to get the bootup values for renoir
This patch is to add the callback to get the bootup values for renoir.
Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiaojian Du [Tue, 13 Apr 2021 07:03:42 +0000 (15:03 +0800)]
drm/amd: update the atomfirmware header for smu12
This patch is to update the atomfirmware header for smu12.
v2: remove some unnecessary members
Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Wed, 7 Apr 2021 21:30:05 +0000 (17:30 -0400)]
drm/amdkfd: Remove legacy code not acquiring VMs
ROCm user mode has acquired VMs from DRM file descriptors for as long
as it supported the upstream KFD. Legacy code to support older versions
of ROCm is not needed any more.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ramesh Errabolu [Mon, 12 Apr 2021 23:23:05 +0000 (18:23 -0500)]
drm/amdgpu: Use iterator methods exposed by amdgpu_res_cursor.h in building SG_TABLE's for a VRAM BO
Extend current implementation of SG_TABLE construction method to
allow exportation of sub-buffers of a VRAM BO. This capability will
enable logical partitioning of a VRAM BO into multiple non-overlapping
sub-buffers. One example of this use case is to partition a VRAM BO
into two sub-buffers, one for SRC and another for DST.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Wed, 14 Apr 2021 15:17:01 +0000 (11:17 -0400)]
drm/amdgpu: Add double-sscanf but invert
Add back the double-sscanf so that both decimal
and hexadecimal values could be read in, but this
time invert the scan so that hexadecimal format
with a leading 0x is tried first, and if that
fails, then try decimal format.
Also use a logical-AND instead of nesting double
if-conditional.
See commit "drm/amdgpu: Fix a bug for input with double sscanf"
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kenneth Feng [Wed, 14 Apr 2021 03:01:42 +0000 (11:01 +0800)]
drm/amd/amdgpu: add ASPM support on polaris
add ASPM support on polaris
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kenneth Feng [Wed, 14 Apr 2021 10:34:55 +0000 (18:34 +0800)]
drm/amd/amdgpu: enable ASPM on vega
enable ASPM on vega to save the power
without the performance hurt.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kenneth Feng [Wed, 14 Apr 2021 10:31:05 +0000 (18:31 +0800)]
drm/amd/amdgpu: enable ASPM on navi1x
enable ASPM on navi1x for the benifit of system power consumption
without performance hurt.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jack Zhang [Wed, 14 Apr 2021 08:51:59 +0000 (16:51 +0800)]
drm/amd/sriov no need to config GECC for sriov
No need to config GECC feature here for sriov
Leave the host drvier to do the configuration job.
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Fri, 9 Apr 2021 12:16:59 +0000 (20:16 +0800)]
drm/amd/pm: Show updated clocks on aldebaran
When GFXCLK range is updated in manual/determinism mode, show the
updated min/max clock range.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 13 Apr 2021 12:48:59 +0000 (08:48 -0400)]
drm/amdgpu: Fix kernel-doc for the RAS sysfs interface
Imporve the kernel-doc for the RAS sysfs
interface. Fix the grammar, fix the context.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 13 Apr 2021 12:40:18 +0000 (08:40 -0400)]
drm/amdgpu: Add bad_page_cnt_threshold to debugfs
Add bad_page_cnt_threshold to debugfs, an optional
file system used for debugging, for reporting
purposes only--it usually matches the size of
EEPROM but may be different depending on the
"bad_page_threshold" kernel module option.
The "bad_page_cnt_threshold" is a dynamically
computed value. It depends on three things: the
VRAM size; the size of the EEPROM (or the size
allocated to the RAS table therein); and the
"bad_page_threshold" module parameter. It is a
dynamically computed value, when the amdgpu module
is run, on which further parameters and logic
depend, and as such it is helpful to see the
dynamically computed value in debugfs.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 13 Apr 2021 12:31:59 +0000 (08:31 -0400)]
drm/amdgpu: Fix a bug in checking the result of reserve page
Fix if (ret) --> if (!ret), a bug, for
"retire_page", which caused the kernel to recall
the method with *pos == end of file, and that
bounced back with error. On the first run, we
advanced *pos, but returned 0 back to fs layer,
also a bug.
Fix the logic of the check of the result of
amdgpu_reserve_page_direct()--it is 0 on success,
and non-zero on error, not the other way
around. This patch fixes this bug.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 13 Apr 2021 11:49:11 +0000 (07:49 -0400)]
drm/amdgpu: Fix a bug for input with double sscanf
Remove double-sscanf to scan for %llu and 0x%llx,
as that is not going to work!
The %llu will consume the "0" in "0x" of your
input, and the hex value you think you're entering
will always be 0. That is, a valid hex value can
never be consumed.
On the other hand, just entering a hex number
without leading 0x will either be scanned as a
string and not match, for instance FAB123, or
the leading decimal portion is scanned as the
%llu, for instance 123FAB will be scanned as 123,
which is not correct.
Thus remove the first %llu scan and leave only the
%llx scan, removing the leading 0x since %llx can
scan either.
Addresses are usually always hex values, so this
suffices.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Xinhui Pan <xinhui.pan@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jinzhou Su [Mon, 12 Apr 2021 07:45:31 +0000 (15:45 +0800)]
drm/amdgpu: Add graphics cache rinse packet for sdma
Add emit mem sync callback for sdma_v5_2
In amdgpu sync object test, three threads created jobs
to send GFX IB and SDMA IB in sequence. After the first
GFX thread joined, sometimes the third thread will reuse
the same physical page to store the SDMA IB. There will
be a risk that SDMA will read GFX IB in the previous physical
page. So it's better to flush the cache before commit sdma IB.
Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kent Russell [Mon, 12 Apr 2021 12:45:58 +0000 (08:45 -0400)]
drm/amdgpu: Ensure dcefclk isn't created on Aldebaran
Like Arcturus, this isn't available on Aldebaran, so remove it
accordingly
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Koo [Sun, 4 Apr 2021 14:38:19 +0000 (10:38 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.61
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Sun, 4 Apr 2021 16:32:38 +0000 (12:32 -0400)]
drm/amd/display: 3.2.131
DC version 3.2.131 brings improvements in multiple areas.
In summary, we highlight:
-Enhancement for multiple eDP BL control.
-Add debug flag to enable eDP ILR by default and debugfs to repress HPD/HPR_RX IRQ.
-Fixes for DSC enable sequence,Force vsync flip,hang when psr is enabled etc.
-Firmware releases:
0.0.60
0.0.61
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Wed, 31 Mar 2021 20:50:44 +0000 (16:50 -0400)]
drm/amd/display: Fix hangs with psr enabled on dcn3.xx
[Why]
SKIP_CRTC_DISABLE bit should be applicable to all dcn asics
not only Raven.
[How]
Replace check for Raven only with check for all DCNs.
Signed-off-by: Roman Li <Roman.Li@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jake Wang [Thu, 1 Apr 2021 19:04:50 +0000 (15:04 -0400)]
drm/amd/display: Added support for multiple eDP BL control
[WHY & HOW]
Driver currently assumes only 1 eDP is connected. Added support for
multiple eDP BL control.
Signed-off-by: Jake Wang <haonan.wang2@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiaojian Du [Fri, 9 Apr 2021 08:19:43 +0000 (16:19 +0800)]
drm/amd/pm: add support for new smu metrics table for vangogh
This patch is to add support for new smu metrics table for vangogh.
It will support new and legacy smu metrics table in the meanwhile.
New pmfw version is 4.63.36.00, and new smu interface version is #3.
v1: check smu pmfw version to determine to use new or legacy smu metrics
table
v2: check smu interface version to determine to use new or legacy smu
metrics table
v3: revise wrong symbol
Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiaojian Du [Thu, 25 Mar 2021 08:33:27 +0000 (16:33 +0800)]
drm/amd/pm: update the driver interface header for vangogh
This patch is to update the driver interface header for vangogh.
New version driver interface header will support new version pmfw
(from version 4.63.36.00) which uses new smu metrics table.
Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Koo [Thu, 1 Apr 2021 21:57:40 +0000 (17:57 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.60
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lewis Huang [Fri, 26 Mar 2021 08:12:45 +0000 (16:12 +0800)]
drm/amd/display: wait vblank when stream enabled and update dpp clock
[Why]
When boot into OS, seamless boot device won't blank stream.
Driver update dpp clock when scanline position in vactive will show
garbage on screen.
[How]
Wait for vblank for seamless boot edp display when driver update dpp clock.
The apply seamless boot flag will be clear when OS call SetVisibility on.
Therefore we only wait for vblank once after boot into OS.
Signed-off-by: Lewis Huang <Lewis.Huang@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Harry Wentland [Fri, 2 Oct 2020 17:32:00 +0000 (13:32 -0400)]
drm/amd/display: Add debugfs to repress HPD and HPR_RX IRQs
[Why]
For debugging reasons it can be beneficial to disable any hotplug and DP shortpulse interrupt handling.
[How]
Expose a debugfs to set a flag to bypass HPD IRQ handling and skip IRQ handling if flag is set.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mikita Lipski [Tue, 1 Dec 2020 15:52:58 +0000 (10:52 -0500)]
drm/amd/display: Connect clock optimization function to dcn301
[why/how]
Connecting clock optimization functions to dcn301 HWSS
to enable power state enter/exit optimization
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mikita Lipski [Tue, 27 Oct 2020 19:52:43 +0000 (15:52 -0400)]
drm/amd/display: Remove unused flag from stream state
[why & how]
Removing unused DSC flag which is incorrect and is not used.
We are only using stream->timing.flags.DSC for DSC's current
state. Stream state as an input parameter and should not contain
any past status flags.
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Wang [Wed, 31 Mar 2021 15:03:35 +0000 (11:03 -0400)]
drm/amd/display: Force vsync flip when reconfiguring MPCC
[Why]
Underflow observed when disabling PIP overlay in-game when
vsync is disabled, due to OTC master lock not working with
game pipe which is immediate flip.
[How]
When performing a full update, override flip_immediate value
to false for all planes, so that flip occurs on vsync.
Signed-off-by: Anthony Wang <anthony1.wang@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wesley Chalmers [Tue, 30 Mar 2021 22:15:21 +0000 (18:15 -0400)]
drm/amd/display: Set LTTPR mode to non-LTTPR if no repeaters found
[WHY]
If no repeaters are found, we do not need or want to attempt to
link-train repeaters, as this could cause bugs.
Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 30 Mar 2021 20:25:56 +0000 (16:25 -0400)]
drm/amd/display: Fix DML validation of simple vs native 422 modes
[Why]
We're always validating DML with simple 422 DSC even if native 422 DSC
is in use.
[How]
Use the mode configuration from the timing.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michael Strauss [Mon, 22 Mar 2021 18:53:29 +0000 (14:53 -0400)]
drm/amd/display: Remove static property from decide_edp_link_settings
[Why & How]
Static cleanup for eDP ILR Support.
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mike Hsieh [Mon, 25 Jan 2021 04:46:21 +0000 (12:46 +0800)]
drm/amd/display: Fix DSC enable sequence
[Why]
DSC is enabled before reset link and potentially cause DSC enable fail problem.
[How]
Enable DSC after link is reseted
Signed-off-by: Mike Hsieh <chun-wei.hsieh@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michael Strauss [Tue, 9 Mar 2021 19:35:05 +0000 (14:35 -0500)]
drm/amd/display: Disable boot optimizations if ILR optimzation is required
[Why]
VBIOS currently sets the max link rate found in eDP 1.4 SUPPORTED_LINK_RATES table
If eDP fastboot optimizations are enabled, the link rate remains at max after init
[How]
Determine optimal link rate during boot, disable seamless boot
and eDP fastboot optimizations if link rate optimization is required
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michael Strauss [Mon, 22 Mar 2021 19:13:13 +0000 (15:13 -0400)]
drm/amd/display: Add debug flag to enable eDP ILR by default
[Why & How]
Allow per-asic enablement of ILR feature with debug flag
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Acked-by: Bindu Ramamurthy <bindur12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Tue, 6 Apr 2021 17:19:40 +0000 (13:19 -0400)]
drm/amdkfd: change MTYPEs for Aldebaran's HW requirement
Due to changes of HW memory model, we need to change Aldebaran
MTYPEs to meet HW changes.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Oak Zeng [Thu, 1 Apr 2021 19:36:42 +0000 (14:36 -0500)]
drm/amdgpu: Introduce new SETUP_TMR interface
This new interface passes both virtual and physical address
to PSP. It is backward compatible with old interface.
v2: use a function to simplify tmr physical address calc (Lijo)
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Oak Zeng [Thu, 1 Apr 2021 19:36:41 +0000 (14:36 -0500)]
drm/amdgpu: Calling address translation functions to simplify codes
Use amdgpu_gmc_vram_pa and amdgpu_gmc_vram_cpu_pa
to simplify codes. No logic change.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Oak Zeng [Thu, 1 Apr 2021 19:36:40 +0000 (14:36 -0500)]
drm/amdgpu: Introduce functions for vram physical addr calculation
Add one function to calculate BO's GPU physical address.
And another function to calculate BO's CPU physical address.
v2: Use functions vs macros (Christian)
Use more proper function names (Christian)
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
John Clements [Mon, 12 Apr 2021 08:12:56 +0000 (16:12 +0800)]
drm/amdgpu: update gfx 9.4.2 ras error reporting
only output ras error status if an error bit is set or error counter is set
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
John Clements [Mon, 12 Apr 2021 08:12:41 +0000 (16:12 +0800)]
drm/amdgpu: update mmhub 1.7 ras error reporting
only output ras error status if an error bit is set
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Mon, 12 Apr 2021 05:57:35 +0000 (13:57 +0800)]
drm/amd/pm: Use VBIOS PPTable for aldebaran
Keep the logic to force-use VBIOS PPTable in aldebaran rather
than in generic SMU13.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Daniel Vetter [Tue, 13 Apr 2021 21:33:41 +0000 (23:33 +0200)]
Merge tag 'drm-msm-next-2021-04-11' of https://gitlab.freedesktop.org/drm/msm into drm-next
msm-next from Rob:
* Big DSI phy/pll cleanup. Includes some clk patches, acked by
maintainer
* Initial support for sc7280
* compatibles fixes for sm8150/sm8250
* cleanups for all dpu gens to use same bandwidth scaling paths (\o/)
* various shrinker path lock contention optimizations
* unpin/swap support for GEM objects (disabled by default, enable with
msm.enable_eviction=1 .. due to various combinations of iommu drivers
with older gens I want to get more testing on hw I don't have in front
of me before enabling by default)
* The usual assortment of misc fixes and cleanups
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvL=4aw15qoY8fbKG9FCgnx8Y-dCtf7xiFwTQSHopwSQg@mail.gmail.com
Daniel Vetter [Tue, 13 Apr 2021 21:06:34 +0000 (23:06 +0200)]
Merge drm/drm-fixes into drm-next
msm-next pull request has a baseline with stuff from -fixes, roll
forward first.
Some simple conflicts in amdgpu, ttm and one in i915 where git gets
confused and tries to add the same function twice.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Tue, 13 Apr 2021 10:25:16 +0000 (12:25 +0200)]
Merge tag 'amd-drm-next-5.13-2021-04-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.13-2021-04-12:
amdgpu:
- Re-enable GPU reset on VanGogh
- Enable DPM flags for SMART_SUSPEND and MAY_SKIP_RESUME
- Disentangle HG from vga_switcheroo
- S0ix fixes
- W=1 fixes
- Resource iterator fixes
- DMCUB updates
- UBSAN fixes
- More PM API cleanup
- Aldebaran updates
- Modifier fixes
- Enable VCN load balancing with asymmetric engines
- Rework BO structs
- Aldebaran reset support
- Initial LTTPR display work
- Display MALL fixes
- Fall back to YCbCr420 when YCbCr444 fails
- SR-IOV fixes
- RAS updates
- Misc cleanups and fixes
radeon:
- Typo fixes
- Fix error handling for firmware on r6xx
- Fix a missing check in DP MST handling
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210412220732.3845-1-alexander.deucher@amd.com
Linus Torvalds [Sun, 11 Apr 2021 22:16:13 +0000 (15:16 -0700)]
Linux 5.12-rc7
Linus Torvalds [Sun, 11 Apr 2021 18:53:36 +0000 (11:53 -0700)]
Merge tag 'for-5.12-rc6-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"One more patch that we'd like to get to 5.12 before release.
It's changing where and how the superblock is stored in the zoned
mode. It is an on-disk format change but so far there are no
implications for users as the proper mkfs support hasn't been merged
and is waiting for the kernel side to settle.
Until now, the superblocks were derived from the zone index, but zone
size can differ per device. This is changed to be based on fixed
offset values, to make it independent of the device zone size.
The work on that got a bit delayed, we discussed the exact locations
to support potential device sizes and usecases. (Partially delayed
also due to my vacation.) Having that in the same release where the
zoned mode is declared usable is highly desired, there are userspace
projects that need to be updated to recognize the feature. Pushing
that to the next release would make things harder to test"
* tag 'for-5.12-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zoned: move superblock logging zone location
Linus Torvalds [Sun, 11 Apr 2021 18:47:03 +0000 (11:47 -0700)]
Merge tag 'locking-urgent-2021-04-11' of git://git./linux/kernel/git/tip/tip
Pull locking fixlets from Ingo Molnar:
"Two minor fixes: one for a Clang warning, the other improves an
ambiguous/confusing kernel log message"
* tag 'locking-urgent-2021-04-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lockdep: Address clang -Wformat warning printing for %hd
lockdep: Add a missing initialization hint to the "INFO: Trying to register non-static key" message
Linus Torvalds [Sun, 11 Apr 2021 18:42:18 +0000 (11:42 -0700)]
Merge tag 'x86_urgent_for_v5.12-rc7' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Fix the vDSO exception handling return path to disable interrupts
again.
- A fix for the CE collector to return the proper return values to its
callers which are used to convey what the collector has done with the
error address.
* tag 'x86_urgent_for_v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/traps: Correct exc_general_protection() and math_error() return paths
RAS/CEC: Correct ce_add_elem()'s returned values
Linus Torvalds [Sat, 10 Apr 2021 19:51:12 +0000 (12:51 -0700)]
Merge branch 'for-5.12-fixes' of git://git./linux/kernel/git/dennis/percpu
Pull percpu fix from Dennis Zhou:
"This contains a fix for sporadically failing atomic percpu
allocations.
I only caught it recently while I was reviewing a new series [1] and
simultaneously saw reports by btrfs in xfstests [2] and [3].
In v5.9, memcg accounting was extended to percpu done by adding a
second type of chunk. I missed an interaction with the free page float
count used to ensure we can support atomic allocations. If one type of
chunk has no free pages, but the other has enough to satisfy the free
page float requirement, we will not repopulate the free pages for the
former type of chunk. This led to the sporadically failing atomic
allocations"
Link: https://lore.kernel.org/linux-mm/20210324190626.564297-1-guro@fb.com/
Link: https://lore.kernel.org/linux-mm/20210401185158.3275.409509F4@e16-tech.com/
Link: https://lore.kernel.org/linux-mm/CAL3q7H5RNBjCi708GH7jnczAOe0BLnacT9C+OBgA-Dx9jhB6SQ@mail.gmail.com/
* 'for-5.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
percpu: make pcpu_nr_empty_pop_pages per chunk type
Linus Torvalds [Sat, 10 Apr 2021 19:29:19 +0000 (12:29 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Seven fixes, all in drivers.
The hpsa three are the most extensive and the most problematic: it's a
packed structure misalignment that oopses on ia64 but looks like it
would also oops on quite a few non-x86 architectures.
The pm80xx is a regression and the rest are bug fixes for patches in
the misc tree"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: scsi_transport_srp: Don't block target in SRP_PORT_LOST state
scsi: target: iscsi: Fix zero tag inside a trace event
scsi: pm80xx: Fix chip initialization failure
scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs
scsi: ufs: core: Fix task management request completion timeout
scsi: hpsa: Add an assert to prevent __packed reintroduction
scsi: hpsa: Fix boot on ia64 (atomic_t alignment)
scsi: hpsa: Use __packed on individual structs, not header-wide
Linus Torvalds [Sat, 10 Apr 2021 16:31:52 +0000 (09:31 -0700)]
Merge tag 'powerpc-5.12-6' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some some more powerpc fixes for 5.12:
- Fix an oops triggered by ptrace when CONFIG_PPC_FPU_REGS=n
- Fix an oops on sigreturn when the VDSO is unmapped on 32-bit
- Fix vdso_wrapper.o not being rebuilt everytime vdso.so is rebuilt
Thanks to Christophe Leroy"
* tag 'powerpc-5.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/vdso: Make sure vdso_wrapper.o is rebuilt everytime vdso.so is rebuilt
powerpc/signal32: Fix Oops on sigreturn with unmapped VDSO
powerpc/ptrace: Don't return error when getting/setting FP regs without CONFIG_PPC_FPU_REGS
Linus Torvalds [Sat, 10 Apr 2021 16:24:35 +0000 (09:24 -0700)]
Merge tag 'driver-core-5.12-rc7' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core fix for 5.12-rc7 to resolve a reported
problem that caused some devices to lockup when booting. It has been
in linux-next with no reported issues"
* tag 'driver-core-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Fix locking bug in deferred_probe_timeout_work_func()
Linus Torvalds [Sat, 10 Apr 2021 16:19:33 +0000 (09:19 -0700)]
Merge tag 'usb-5.12-rc7' of git://git./linux/kernel/git/gregkh/usb
Pull USB/Thunderbolt fixes from Greg KH:
"Here are a few small USB and Thunderbolt driver fixes for 5.12-rc7 for
reported issues:
- thunderbolt leaks and off-by-one fix
- cdnsp deque fix
- usbip fixes for syzbot-reported issues
All have been in linux-next with no reported problems"
* tag 'usb-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usbip: synchronize event handler with sysfs code paths
usbip: vudc synchronize sysfs code paths
usbip: stub-dev synchronize sysfs code paths
usbip: add sysfs_lock to synchronize sysfs code paths
thunderbolt: Fix off by one in tb_port_find_retimer()
thunderbolt: Fix a leak in tb_retimer_add()
usb: cdnsp: Fixes issue with dequeuing requests after disabling endpoint
Linus Torvalds [Sat, 10 Apr 2021 16:10:55 +0000 (09:10 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"A mixture of driver and documentation bugfixes for I2C"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx: mention Oleksij as maintainer of the binding docs
i2c: exynos5: correct top kerneldoc
i2c: designware: Adjust bus_freq_hz when refuse high speed mode set
i2c: hix5hd2: use the correct HiSilicon copyright
i2c: gpio: update email address in binding docs
i2c: imx: drop me as maintainer of binding docs
i2c: stm32f4: Mundane typo fix
I2C: JZ4780: Fix bug for Ingenic X1000.
i2c: turn recovery error on init to debug
Naohiro Aota [Thu, 8 Apr 2021 08:25:28 +0000 (17:25 +0900)]
btrfs: zoned: move superblock logging zone location
Moves the location of the superblock logging zones. The new locations of
the logging zones are now determined based on fixed block addresses
instead of on fixed zone numbers.
The old placement method based on fixed zone numbers causes problems when
one needs to inspect a file system image without access to the drive zone
information. In such case, the super block locations cannot be reliably
determined as the zone size is unknown. By locating the superblock logging
zones using fixed addresses, we can scan a dumped file system image without
the zone information since a super block copy will always be present at or
after the fixed known locations.
Introduce the following three pairs of zones containing fixed offset
locations, regardless of the device zone size.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If a logging zone is outside of the disk capacity, we do not record the
superblock copy.
The first copy position is much larger than for a non-zoned filesystem,
which is at 64M. This is to avoid overlapping with the log zones for
the primary superblock. This higher location is arbitrary but allows
supporting devices with very large zone sizes, plus some space around in
between.
Such large zone size is unrealistic and very unlikely to ever be seen in
real devices. Currently, SMR disks have a zone size of 256MB, and we are
expecting ZNS drives to be in the 1-4GB range, so this limit gives us
room to breathe. For now, we only allow zone sizes up to 8GB. The
maximum zone size that would still fit in the space is 256G.
The fixed location addresses are somewhat arbitrary, with the intent of
maintaining superblock reliability for smaller and larger devices, with
the preference for the latter. For this reason, there are two superblocks
under the first 1T. This should cover use cases for physical devices and
for emulated/device-mapper devices.
The superblock logging zones are reserved for superblock logging and
never used for data or metadata blocks. Note that we only reserve the
two zones per primary/copy actually used for superblock logging. We do
not reserve the ranges of zones possibly containing superblocks with the
largest supported zone size (0-16GB, 512G-528GB, 4096G-4112G).
The zones containing the fixed location offsets used to store
superblocks on a non-zoned volume are also reserved to avoid confusion.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Linus Torvalds [Sat, 10 Apr 2021 03:00:10 +0000 (20:00 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Here's the latest pile of clk driver and clk framework fixes for this
release:
- Two clk framework fixes for a long standing issue in
clk_notifier_{register,unregister}() where we used a pointer that
was for a struct containing a list head when there was no container
struct
- A compile warning fix for socfpga that's good to have
- A double free problem with devm registered fixed factor clks
- One last fix to the Qualcomm camera clk driver to use the right clk
ops so clks don't get stuck and stop working because the firmware
takes them for a ride"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: fixed: fix double free in resource managed fixed-factor clock
clk: fix invalid usage of list cursor in unregister
clk: fix invalid usage of list cursor in register
clk: qcom: camcc: Update the clock ops for the SC7180
clk: socfpga: fix iomem pointer cast on 64-bit
Linus Torvalds [Sat, 10 Apr 2021 00:12:31 +0000 (17:12 -0700)]
Merge tag 'perf-tools-fixes-for-v5.12-2020-04-09' of git://git./linux/kernel/git/acme/linux
Pull perf tool fixes from Arnaldo Carvalho de Melo:
- Fix wrong LBR block sorting in 'perf report'
- Fix 'perf inject' repipe usage when consuming perf.data files
- Avoid potential buffer overrun when decoding ARM SPE hardware tracing
packets, bug found using a fuzzer
* tag 'perf-tools-fixes-for-v5.12-2020-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf arm-spe: Avoid potential buffer overrun
perf report: Fix wrong LBR block sorting
perf inject: Fix repipe usage
Linus Torvalds [Sat, 10 Apr 2021 00:06:32 +0000 (17:06 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"14 patches.
Subsystems affected by this patch series: mm (kasan, gup, pagecache,
and kfence), MAINTAINERS, mailmap, nds32, gcov, ocfs2, ia64, and lib"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
lib: fix kconfig dependency on ARCH_WANT_FRAME_POINTERS
kfence, x86: fix preemptible warning on KPTI-enabled systems
lib/test_kasan_module.c: suppress unused var warning
kasan: fix conflict with page poisoning
fs: direct-io: fix missing sdio->boundary
ia64: fix user_stack_pointer() for ptrace()
ocfs2: fix deadlock between setattr and dio_end_io_write
gcov: re-fix clang-11+ support
nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff
mm/gup: check page posion status for coredump.
.mailmap: fix old email addresses
mailmap: update email address for Jordan Crouse
treewide: change my e-mail address, fix my name
MAINTAINERS: update CZ.NIC's Turris information
Linus Torvalds [Fri, 9 Apr 2021 22:26:51 +0000 (15:26 -0700)]
Merge tag 'net-5.12-rc7' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.12-rc7, including fixes from can, ipsec,
mac80211, wireless, and bpf trees.
No scary regressions here or in the works, but small fixes for 5.12
changes keep coming.
Current release - regressions:
- virtio: do not pull payload in skb->head
- virtio: ensure mac header is set in virtio_net_hdr_to_skb()
- Revert "net: correct sk_acceptq_is_full()"
- mptcp: revert "mptcp: provide subflow aware release function"
- ethernet: lan743x: fix ethernet frame cutoff issue
- dsa: fix type was not set for devlink port
- ethtool: remove link_mode param and derive link params from driver
- sched: htb: fix null pointer dereference on a null new_q
- wireless: iwlwifi: Fix softirq/hardirq disabling in
iwl_pcie_enqueue_hcmd()
- wireless: iwlwifi: fw: fix notification wait locking
- wireless: brcmfmac: p2p: Fix deadlock introduced by avoiding the
rtnl dependency
Current release - new code bugs:
- napi: fix hangup on napi_disable for threaded napi
- bpf: take module reference for trampoline in module
- wireless: mt76: mt7921: fix airtime reporting and related tx hangs
- wireless: iwlwifi: mvm: rfi: don't lock mvm->mutex when sending
config command
Previous releases - regressions:
- rfkill: revert back to old userspace API by default
- nfc: fix infinite loop, refcount & memory leaks in LLCP sockets
- let skb_orphan_partial wake-up waiters
- xfrm/compat: Cleanup WARN()s that can be user-triggered
- vxlan, geneve: do not modify the shared tunnel info when PMTU
triggers an ICMP reply
- can: fix msg_namelen values depending on CAN_REQUIRED_SIZE
- can: uapi: mark union inside struct can_frame packed
- sched: cls: fix action overwrite reference counting
- sched: cls: fix err handler in tcf_action_init()
- ethernet: mlxsw: fix ECN marking in tunnel decapsulation
- ethernet: nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx
- ethernet: i40e: fix receiving of single packets in xsk zero-copy
mode
- ethernet: cxgb4: avoid collecting SGE_QBASE regs during traffic
Previous releases - always broken:
- bpf: Refuse non-O_RDWR flags in BPF_OBJ_GET
- bpf: Refcount task stack in bpf_get_task_stack
- bpf, x86: Validate computation of branch displacements
- ieee802154: fix many similar syzbot-found bugs
- fix NULL dereferences in netlink attribute handling
- reject unsupported operations on monitor interfaces
- fix error handling in llsec_key_alloc()
- xfrm: make ipv4 pmtu check honor ip header df
- xfrm: make hash generation lock per network namespace
- xfrm: esp: delete NETIF_F_SCTP_CRC bit from features for esp
offload
- ethtool: fix incorrect datatype in set_eee ops
- xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory
model
- openvswitch: fix send of uninitialized stack memory in ct limit
reply
Misc:
- udp: add get handling for UDP_GRO sockopt"
* tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (182 commits)
net: fix hangup on napi_disable for threaded napi
net: hns3: Trivial spell fix in hns3 driver
lan743x: fix ethernet frame cutoff issue
net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh
net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits
net: dsa: lantiq_gswip: Don't use PHY auto polling
net: sched: sch_teql: fix null-pointer dereference
ipv6: report errors for iftoken via netlink extack
net: sched: fix err handler in tcf_action_init()
net: sched: fix action overwrite reference counting
Revert "net: sched: bump refcount for new action in ACT replace mode"
ice: fix memory leak of aRFS after resuming from suspend
i40e: Fix sparse warning: missing error code 'err'
i40e: Fix sparse error: 'vsi->netdev' could be null
i40e: Fix sparse error: uninitialized symbol 'ring'
i40e: Fix sparse errors in i40e_txrx.c
i40e: Fix parameters in aq_get_phy_register()
nl80211: fix beacon head validation
bpf, x86: Validate computation of branch displacements for x86-32
bpf, x86: Validate computation of branch displacements for x86-64
...
Linus Torvalds [Fri, 9 Apr 2021 22:06:52 +0000 (15:06 -0700)]
Merge tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Two minor fixups for the reissue logic, and one for making sure that
unbounded work is canceled on io-wq exit"
* tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block:
io-wq: cancel unbounded works on io-wq destroy
io_uring: fix rw req completion
io_uring: clear F_REISSUE right after getting it
Julian Braha [Fri, 9 Apr 2021 20:27:47 +0000 (13:27 -0700)]
lib: fix kconfig dependency on ARCH_WANT_FRAME_POINTERS
When LATENCYTOP, LOCKDEP, or FAULT_INJECTION_STACKTRACE_FILTER is
enabled and ARCH_WANT_FRAME_POINTERS is disabled, Kbuild gives a warning
such as:
WARNING: unmet direct dependencies detected for FRAME_POINTER
Depends on [n]: DEBUG_KERNEL [=y] && (M68K || UML || SUPERH) || ARCH_WANT_FRAME_POINTERS [=n] || MCOUNT [=n]
Selected by [y]:
- LATENCYTOP [=y] && DEBUG_KERNEL [=y] && STACKTRACE_SUPPORT [=y] && PROC_FS [=y] && !MIPS && !PPC && !S390 && !MICROBLAZE && !ARM && !ARC && !X86
Depending on ARCH_WANT_FRAME_POINTERS causes a recursive dependency
error. ARCH_WANT_FRAME_POINTERS is to be selected by the architecture,
and is not supposed to be overridden by other config options.
Link: https://lkml.kernel.org/r/20210329165329.27994-1-julianbraha@gmail.com
Signed-off-by: Julian Braha <julianbraha@gmail.com>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Necip Fazil Yildiran <fazilyildiran@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marco Elver [Fri, 9 Apr 2021 20:27:44 +0000 (13:27 -0700)]
kfence, x86: fix preemptible warning on KPTI-enabled systems
On systems with KPTI enabled, we can currently observe the following
warning:
BUG: using smp_processor_id() in preemptible
caller is invalidate_user_asid+0x13/0x50
CPU: 6 PID: 1075 Comm: dmesg Not tainted 5.12.0-rc4-gda4a2b1a5479-kfence_1+ #1
Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
Call Trace:
dump_stack+0x7f/0xad
check_preemption_disabled+0xc8/0xd0
invalidate_user_asid+0x13/0x50
flush_tlb_one_kernel+0x5/0x20
kfence_protect+0x56/0x80
...
While it normally makes sense to require preemption to be off, so that
the expected CPU's TLB is flushed and not another, in our case it really
is best-effort (see comments in kfence_protect_page()).
Avoid the warning by disabling preemption around flush_tlb_one_kernel().
Link: https://lore.kernel.org/lkml/YGIDBAboELGgMgXy@elver.google.com/
Link: https://lkml.kernel.org/r/20210330065737.652669-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Fri, 9 Apr 2021 20:27:41 +0000 (13:27 -0700)]
lib/test_kasan_module.c: suppress unused var warning
Local `unused' is intentionally unused - it is there to suppress
__must_check warnings.
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lkml.kernel.org/r/202104050216.HflRxfJm-lkp@intel.com
Cc: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Konovalov [Fri, 9 Apr 2021 20:27:38 +0000 (13:27 -0700)]
kasan: fix conflict with page poisoning
When page poisoning is enabled, it accesses memory that is marked as
poisoned by KASAN, which leas to false-positive KASAN reports.
Suppress the reports by adding KASAN annotations to unpoison_page()
(poison_page() already has them).
Link: https://lkml.kernel.org/r/2dc799014d31ac13fd97bd906bad33e16376fc67.1617118501.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Qiu [Fri, 9 Apr 2021 20:27:35 +0000 (13:27 -0700)]
fs: direct-io: fix missing sdio->boundary
I encountered a hung task issue, but not a performance one. I run DIO
on a device (need lba continuous, for example open channel ssd), maybe
hungtask in below case:
DIO: Checkpoint:
get addr A(at boundary), merge into BIO,
no submit because boundary missing
flush dirty data(get addr A+1), wait IO(A+1)
writeback timeout, because DIO(A) didn't submit
get addr A+2 fail, because checkpoint is doing
dio_send_cur_page() may clear sdio->boundary, so prevent it from missing
a boundary.
Link: https://lkml.kernel.org/r/20210322042253.38312-1-jack.qiu@huawei.com
Fixes:
b1058b981272 ("direct-io: submit bio after boundary buffer is added to it")
Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sergei Trofimovich [Fri, 9 Apr 2021 20:27:32 +0000 (13:27 -0700)]
ia64: fix user_stack_pointer() for ptrace()
ia64 has two stacks:
- memory stack (or stack), pointed at by by r12
- register backing store (register stack), pointed at by
ar.bsp/ar.bspstore with complications around dirty
register frame on CPU.
In [1] Dmitry noticed that PTRACE_GET_SYSCALL_INFO returns the register
stack instead memory stack.
The bug comes from the fact that user_stack_pointer() and
current_user_stack_pointer() don't return the same register:
ulong user_stack_pointer(struct pt_regs *regs) { return regs->ar_bspstore; }
#define current_user_stack_pointer() (current_pt_regs()->r12)
The change gets both back in sync.
I think ptrace(PTRACE_GET_SYSCALL_INFO) is the only affected user by
this bug on ia64.
The change fixes 'rt_sigreturn.gen.test' strace test where it was
observed initially.
Link: https://bugs.gentoo.org/769614
Link: https://lkml.kernel.org/r/20210331084447.2561532-1-slyfox@gentoo.org
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wengang Wang [Fri, 9 Apr 2021 20:27:29 +0000 (13:27 -0700)]
ocfs2: fix deadlock between setattr and dio_end_io_write
The following deadlock is detected:
truncate -> setattr path is waiting for pending direct IO to be done (inode->i_dio_count become zero) with inode->i_rwsem held (down_write).
PID: 14827 TASK:
ffff881686a9af80 CPU: 20 COMMAND: "ora_p005_hrltd9"
#0 __schedule at
ffffffff818667cc
#1 schedule at
ffffffff81866de6
#2 inode_dio_wait at
ffffffff812a2d04
#3 ocfs2_setattr at
ffffffffc05f322e [ocfs2]
#4 notify_change at
ffffffff812a5a09
#5 do_truncate at
ffffffff812808f5
#6 do_sys_ftruncate.constprop.18 at
ffffffff81280cf2
#7 sys_ftruncate at
ffffffff81280d8e
#8 do_syscall_64 at
ffffffff81003949
#9 entry_SYSCALL_64_after_hwframe at
ffffffff81a001ad
dio completion path is going to complete one direct IO (decrement
inode->i_dio_count), but before that it hung at locking inode->i_rwsem:
#0 __schedule+700 at
ffffffff818667cc
#1 schedule+54 at
ffffffff81866de6
#2 rwsem_down_write_failed+536 at
ffffffff8186aa28
#3 call_rwsem_down_write_failed+23 at
ffffffff8185a1b7
#4 down_write+45 at
ffffffff81869c9d
#5 ocfs2_dio_end_io_write+180 at
ffffffffc05d5444 [ocfs2]
#6 ocfs2_dio_end_io+85 at
ffffffffc05d5a85 [ocfs2]
#7 dio_complete+140 at
ffffffff812c873c
#8 dio_aio_complete_work+25 at
ffffffff812c89f9
#9 process_one_work+361 at
ffffffff810b1889
#10 worker_thread+77 at
ffffffff810b233d
#11 kthread+261 at
ffffffff810b7fd5
#12 ret_from_fork+62 at
ffffffff81a0035e
Thus above forms ABBA deadlock. The same deadlock was mentioned in
upstream commit
28f5a8a7c033 ("ocfs2: should wait dio before inode lock
in ocfs2_setattr()"). It seems that that commit only removed the
cluster lock (the victim of above dead lock) from the ABBA deadlock
party.
End-user visible effects: Process hang in truncate -> ocfs2_setattr path
and other processes hang at ocfs2_dio_end_io_write path.
This is to fix the deadlock itself. It removes inode_lock() call from
dio completion path to remove the deadlock and add ip_alloc_sem lock in
setattr path to synchronize the inode modifications.
[wen.gang.wang@oracle.com: remove the "had_alloc_lock" as suggested]
Link: https://lkml.kernel.org/r/20210402171344.1605-1-wen.gang.wang@oracle.com
Link: https://lkml.kernel.org/r/20210331203654.3911-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Desaulniers [Fri, 9 Apr 2021 20:27:26 +0000 (13:27 -0700)]
gcov: re-fix clang-11+ support
LLVM changed the expected function signature for llvm_gcda_emit_function()
in the clang-11 release. Users of clang-11 or newer may have noticed
their kernels producing invalid coverage information:
$ llvm-cov gcov -a -c -u -f -b <input>.gcda -- gcno=<input>.gcno
1 <func>: checksum mismatch, \
(<lineno chksum A>, <cfg chksum B>) != (<lineno chksum A>, <cfg chksum C>)
2 Invalid .gcda File!
...
Fix up the function signatures so calling this function interprets its
parameters correctly and computes the correct cfg checksum. In
particular, in clang-11, the additional checksum is no longer optional.
Link: https://reviews.llvm.org/rG25544ce2df0daa4304c07e64b9c8b0f7df60c11d
Link: https://lkml.kernel.org/r/20210408184631.1156669-1-ndesaulniers@google.com
Reported-by: Prasad Sodagudi <psodagud@quicinc.com>
Tested-by: Prasad Sodagudi <psodagud@quicinc.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Cc: <stable@vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 9 Apr 2021 20:27:23 +0000 (13:27 -0700)]
nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff
Commit
cb9f753a3731 ("mm: fix races between swapoff and flush dcache")
updated flush_dcache_page implementations on several architectures to
use page_mapping_file() in order to avoid races between page_mapping()
and swapoff().
This update missed arch/nds32 and there is a possibility of a race
there.
Replace page_mapping() with page_mapping_file() in nds32 implementation
of flush_dcache_page().
Link: https://lkml.kernel.org/r/20210330175126.26500-1-rppt@kernel.org
Fixes:
cb9f753a3731 ("mm: fix races between swapoff and flush dcache")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Greentime Hu <green.hu@gmail.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aili Yao [Fri, 9 Apr 2021 20:27:19 +0000 (13:27 -0700)]
mm/gup: check page posion status for coredump.
When we do coredump for user process signal, this may be an SIGBUS signal
with BUS_MCEERR_AR or BUS_MCEERR_AO code, which means this signal is
resulted from ECC memory fail like SRAR or SRAO, we expect the memory
recovery work is finished correctly, then the get_dump_page() will not
return the error page as its process pte is set invalid by
memory_failure().
But memory_failure() may fail, and the process's related pte may not be
correctly set invalid, for current code, we will return the poison page,
get it dumped, and then lead to system panic as its in kernel code.
So check the poison status in get_dump_page(), and if TRUE, return NULL.
There maybe other scenario that is also better to check the posion status
and not to panic, so make a wrapper for this check, Thanks to David's
suggestion(<david@redhat.com>).
[akpm@linux-foundation.org: s/0/false/]
[yaoaili@kingsoft.com: is_page_poisoned() arg cannot be null, per Matthew]
Link: https://lkml.kernel.org/r/20210322115233.05e4e82a@alex-virtual-machine
Link: https://lkml.kernel.org/r/20210319104437.6f30e80d@alex-virtual-machine
Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Aili Yao <yaoaili@kingsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox [Fri, 9 Apr 2021 20:27:10 +0000 (13:27 -0700)]
.mailmap: fix old email addresses
Update Nick & Nadia's old addresses.
Link: https://lkml.kernel.org/r/20210406134036.GQ2531743@casper.infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Nadia Yvette Chambers <nyc@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jordan Crouse [Fri, 9 Apr 2021 20:27:07 +0000 (13:27 -0700)]
mailmap: update email address for Jordan Crouse
jcrouse at codeaurora.org has started bouncing. Redirect to a more
permanent address.
Link: https://lkml.kernel.org/r/20210325143700.1490518-1-jordan@cosmicpenguin.net
Signed-off-by: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: Alexander Lobakin <alobakin@pm.me>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marek Behún [Fri, 9 Apr 2021 20:27:04 +0000 (13:27 -0700)]
treewide: change my e-mail address, fix my name
Change my e-mail address to kabel@kernel.org, and fix my name in
non-code parts (add diacritical mark).
Link: https://lkml.kernel.org/r/20210325171123.28093-2-kabel@kernel.org
Signed-off-by: Marek Behún <kabel@kernel.org>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jassi Brar <jassisinghbrar@gmail.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marek Behún [Fri, 9 Apr 2021 20:27:01 +0000 (13:27 -0700)]
MAINTAINERS: update CZ.NIC's Turris information
Add all the files maintained by Turris team, not only for MOX, but also
for Omnia. Change website.
Link: https://lkml.kernel.org/r/20210325171123.28093-1-kabel@kernel.org
Signed-off-by: Marek Behún <kabel@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Jassi Brar <jassisinghbrar@gmail.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
John Clements [Fri, 9 Apr 2021 09:25:29 +0000 (17:25 +0800)]
drm/amdgpu: page retire over debugfs mechanism
added support in RAS debugfs to add bad page for isolated page retirement testing
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yingjie Wang [Wed, 7 Apr 2021 03:10:04 +0000 (20:10 -0700)]
drm/radeon: Fix a missing check bug in radeon_dp_mst_detect()
In radeon_dp_mst_detect(), We should check whether or not @connector
has been unregistered from userspace. If the connector is unregistered,
we should return disconnected status.
Fixes:
9843ead08f18 ("drm/radeon: add DisplayPort MST support (v2)")
Signed-off-by: Yingjie Wang <wangyingjie55@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shaokun Zhang [Thu, 8 Apr 2021 12:41:58 +0000 (20:41 +0800)]
drm/amd/display: Fix the Wunused-function warning
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:941:13: warning: ‘dm_dmub_trace_high_irq’ defined but not used [-Wunused-function]
941 | static void dm_dmub_trace_high_irq(void *interrupt_params)
| ^~~~~~~~~~~~~~~~~~~~~~
Fixes:
a08f16cfe8dc ("drm/amd/display: Log DMCUB trace buffer events")
Cc: Leo (Hanghong) Ma <hanghong.ma@amd.com>
Cc: Daniel Wheeler <daniel.wheeler@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Solomon Chiu <solomon.chiu@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>