John Crispin [Tue, 5 May 2020 07:42:03 +0000 (10:42 +0300)]
ath11k: add tx hw 802.11 encapsulation offloading support
This patch adds support for ethernet rxtx mode to the driver. The feature
is enabled via a new module parameter. If enabled, the driver will enable
the feature on a per-vif basis if all other requirements are met.
Signed-off-by: Shashidhar Lakkavalli <slakkavalli@datto.com>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200430152814.18481-1-john@phrozen.org
Sowmiya Sree Elavalagan [Mon, 4 May 2020 11:45:55 +0000 (17:15 +0530)]
ath11k: fix resource unavailability for htt stats after peer stats display
htt stats stop working after htt peer stats are displayed
and also after htt peer stats are reset. Trying to dump htt
stats shows "Resource temporarily unavailable".
This is because the "ar->debug.htt_stats.stats_req" member is
reused for all htt stats without being reset after the
previous use. Hence assigning NULL to this member
after freeing the allocated memory fixes the issue.
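The stale-pointer behaviour described above can be modelled in a few lines. This is only a toy sketch, not driver code: the class `HttStatsDebug`, the `reset` flag and the helper `try_open` are hypothetical names, and the EAGAIN return is inferred from the "Resource temporarily unavailable" message in the log.

```python
import errno


class HttStatsDebug:
    """Toy model of the per-radio htt stats state (hypothetical, not driver code)."""

    def __init__(self):
        # Stands in for ar->debug.htt_stats.stats_req.
        self.stats_req = None

    def open(self):
        # A non-None request is treated as "a dump is still in progress".
        if self.stats_req is not None:
            raise OSError(errno.EAGAIN, "Resource temporarily unavailable")
        self.stats_req = bytearray(16)  # stands in for the allocated buffer

    def release(self, reset):
        # The buffer is "freed" here; the fix is to also clear the reference.
        if reset:
            self.stats_req = None


def try_open(dbg):
    """Return True if open() succeeds, False if it fails with EAGAIN."""
    try:
        dbg.open()
        return True
    except OSError as exc:
        if exc.errno == errno.EAGAIN:
            return False
        raise
```

With `release(reset=False)` the second `try_open()` fails, which mirrors the console log; with `release(reset=True)` it succeeds.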
console logs below:
# echo 9 >/sys/kernel/debug/ath11k/ipq8074/mac1/htt_stats_type
# cat /sys/kernel/debug/ath11k/ipq8074/mac1/htt_stats_type
9
# cat /sys/kernel/debug/ath11k/ipq8074/mac1/htt_stats
cat: can't open '/sys/kernel/debug/ath11k/ipq8074/mac1/htt_stats': Resource temporarily unavailable
Signed-off-by: Sowmiya Sree Elavalagan <ssreeela@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1588592755-10427-1-git-send-email-ssreeela@codeaurora.org
Jason Yan [Mon, 4 May 2020 11:33:36 +0000 (19:33 +0800)]
ath11k: use true,false for bool variables
Fix the following coccicheck warning:
drivers/net/wireless/ath/ath11k/dp_rx.c:2964:1-39: WARNING: Assignment
of 0/1 to bool variable
drivers/net/wireless/ath/ath11k/dp_rx.c:2965:1-38: WARNING: Assignment
of 0/1 to bool variable
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200504113336.41249-1-yanaijie@huawei.com
Rakesh Pillai [Mon, 4 May 2020 09:03:52 +0000 (12:03 +0300)]
ath10k: Add support for targets without trustzone
Add support to attach and map an iommu
domain for targets which do not
support TrustZone.
Tested HW: WCN3990
Tested FW: WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1
Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586971906-20985-4-git-send-email-pillair@codeaurora.org
Rakesh Pillai [Mon, 4 May 2020 09:03:45 +0000 (12:03 +0300)]
ath10k: Setup the msa resources before qmi init
Move the msa resources setup out of qmi init and
set up the msa resources as a part of probe before
the qmi init is done.
Tested HW: WCN3990
Tested FW: WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1
Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586971906-20985-3-git-send-email-pillair@codeaurora.org
Rakesh Pillai [Mon, 4 May 2020 09:03:33 +0000 (12:03 +0300)]
dt-bindings: ath10k: Add wifi-firmware subnode for wifi node
Add a wifi-firmware subnode for the wifi node.
This wifi-firmware subnode is needed for the
targets which do not support TrustZone.
Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586971906-20985-2-git-send-email-pillair@codeaurora.org
Wen Gong [Mon, 4 May 2020 09:03:14 +0000 (12:03 +0300)]
ath10k: remove the max_sched_scan_reqs value
The struct cfg80211_wowlan of the NET_DETECT WoWLAN feature shares the
struct cfg80211_sched_scan_request with the scheduled scan request
feature, and max_sched_scan_reqs of wiphy is only used for sched scan.
ath10k does not support the scheduled scan request feature, so it does
not set the flag NL80211_FEATURE_SCHED_SCAN_RANDOM_MAC_ADDR, but it does
set max_sched_scan_reqs of wiphy to a non-zero value of 1. Because
max_sched_scan_reqs is non-zero, the function nl80211_add_commands_unsplit
of cfg80211 advertises support for the command
NL80211_CMD_START_SCHED_SCAN, although ath10k does not actually support
it. This leads to a mismatched sched scan result in cfg80211. The
application shill detects the mismatch and stops running the MAC random
address scan case, so the case fails.
After removing the max_sched_scan_reqs value, the sched scan state is
consistent and the MAC random address scan case passes.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00029.
Tested with QCA6174 PCIe with firmware WLAN.RM.4.4.1-00110-QCARMSWP-1.
Fixes: ce834e280f2f875 ("ath10k: support NET_DETECT WoWLAN feature")
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20191114050001.4658-1-wgong@codeaurora.org
Maharaja Kennadyrajan [Mon, 4 May 2020 09:03:13 +0000 (12:03 +0300)]
ath10k: Avoid override CE5 configuration for QCA99X0 chipsets
As the existing CE configurations are defined globally, there
is a chance that the CE configurations of QCA99X0 family chipsets
get changed by the ath10k_pci_override_ce_config()
function.
The override will be hit and the CE5 configuration will be changed
when the user brings up QCA99X0 chipsets along with a QCA6174
or QCA9377 chipset (i.e. brings up QCA99X0 family chipsets after
QCA6174 or QCA9377).
Hence, fix this issue by moving the global CE configuration
to a radio-specific CE configuration.
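The difference between a shared (global) table and a per-radio copy can be illustrated with a small model. Everything here is hypothetical: `default_ce_config`, the `Radio` class, the `src_nentries` field and its values are placeholders chosen for the sketch and do not reflect the real CE attributes.

```python
def default_ce_config():
    """Build a fresh CE configuration table (placeholder fields and values)."""
    return [{"pipe": n, "src_nentries": 0} for n in range(6)]


class Radio:
    """Toy radio that holds a reference to a CE configuration table."""

    def __init__(self, name, ce_config):
        self.name = name
        self.ce_config = ce_config

    def override_ce5(self):
        # Stands in for ath10k_pci_override_ce_config(): it edits pipe 5 in place.
        self.ce_config[5]["src_nentries"] = 2


# Shared table: the override done for one radio leaks into the other.
shared = default_ce_config()
qca6174_shared = Radio("qca6174", shared)
qca99x0_shared = Radio("qca99x0", shared)
qca6174_shared.override_ce5()

# Per-radio tables: each radio owns its own copy, so nothing leaks.
qca6174_own = Radio("qca6174", default_ce_config())
qca99x0_own = Radio("qca99x0", default_ce_config())
qca6174_own.override_ce5()
```

In the shared case `qca99x0_shared.ce_config[5]` is modified even though only the QCA6174 radio asked for the override; in the per-radio case `qca99x0_own.ce_config[5]` keeps its default.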
Tested hardware: QCA9888 & QCA6174
Tested firmware: 10.4-3.10-00047 & WLAN.RM.4.4.1.c3-00058
Signed-off-by: Maharaja Kennadyrajan <mkenna@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1587649759-14381-1-git-send-email-mkenna@codeaurora.org
Sathishkumar Muruganandam [Tue, 28 Apr 2020 04:45:26 +0000 (10:15 +0530)]
ath11k: add DBG_MAC prints to track vdev events
Added DBG_MAC prints to track vdev create, delete, start and
stop events.
Signed-off-by: Sathishkumar Muruganandam <murugana@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1588049126-1490-3-git-send-email-murugana@codeaurora.org
Sathishkumar Muruganandam [Tue, 28 Apr 2020 04:45:25 +0000 (10:15 +0530)]
ath11k: fix mgmt_tx_wmi cmd sent to FW for deleted vdev
In a Multi-AP VAP scenario with frequent interface up-down, there is a
chance that ath11k_mgmt_over_wmi_tx_work() will dequeue an skb
corresponding to a currently deleted/stopped vdev.
FW will assert on receiving a mgmt_tx_wmi cmd for an already deleted vdev.
Hence add validation checks that the arvif is present on the corresponding
ar before sending the mgmt_tx_wmi cmd.
Signed-off-by: Sathishkumar Muruganandam <murugana@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1588049126-1490-2-git-send-email-murugana@codeaurora.org
Wei Yongjun [Mon, 27 Apr 2020 10:46:21 +0000 (10:46 +0000)]
ath11k: fix error return code in ath11k_dp_alloc()
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: d0998eb84ed3 ("ath11k: optimise ath11k_dp_tx_completion_handler")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200427104621.23752-1-weiyongjun1@huawei.com
Wei Yongjun [Mon, 27 Apr 2020 10:43:48 +0000 (10:43 +0000)]
ath10k: fix possible memory leak in ath10k_bmi_lz_data_large()
'cmd' is malloced in ath10k_bmi_lz_data_large() and should be freed
before leaving from the error handling cases, otherwise it will cause
memory leak.
Fixes: d58f466a5dee ("ath10k: add large size for BMI download data for SDIO")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200427104348.13570-1-weiyongjun1@huawei.com
Wei Yongjun [Mon, 27 Apr 2020 09:24:17 +0000 (09:24 +0000)]
ath11k: use GFP_ATOMIC under spin lock
A spin lock is taken here so we should use GFP_ATOMIC.
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200427092417.56236-1-weiyongjun1@huawei.com
Wen Gong [Mon, 27 Apr 2020 08:04:16 +0000 (16:04 +0800)]
ath10k: correct tx bitrate of iw for SDIO
In legacy mode, the tx bitrate is sometimes shown incorrectly, for example:
iw wlan0 link
Connected to 8c:21:0a:b3:5a:64 (on wlan0)
SSID: tplinkgw
freq: 2462
RX: 19672 bytes (184 packets)
TX: 9851 bytes (87 packets)
signal: -51 dBm
rx bitrate: 54.0 MBit/s
tx bitrate: 2.8 MBit/s
This patch uses the tx bitrate info from the WMI_TLV_PEER_STATS_INFO_EVENTID
report from firmware, so the tx bitrate is shown correctly:
iw wlan0 link
Connected to 8c:21:0a:b3:5a:64 (on wlan0)
SSID: tplinkgw
freq: 2462
RX: 13973 bytes (120 packets)
TX: 6737 bytes (57 packets)
signal: -52 dBm
rx bitrate: 54.0 MBit/s
tx bitrate: 54.0 MBit/s
This patch only affects SDIO chips: ath10k_mac_sta_get_peer_stats_info
checks bitrate_statistics of hw_params, which is enabled only for
"qca6174 hw3.2 sdio".
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00042.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200427080416.8265-5-wgong@codeaurora.org
Wen Gong [Mon, 27 Apr 2020 08:04:15 +0000 (16:04 +0800)]
ath10k: add bitrate parse for peer stats info
The rate code and rate kbps reported by WMI_TLV_PEER_STATS_INFO_EVENTID
from firmware contain all the bitrate info, which includes OFDM, CCK,
HT/VHT, and mac80211 needs the struct rate_info, which includes the
parameters below:
flags: bitflag of flags from &enum rate_info_flags
mcs: mcs index if struct describes an HT/VHT/HE rate
legacy: bitrate in 100kbit/s for 802.11abg
nss: number of streams (VHT & HE only)
bw: bandwidth (from &enum rate_info_bw)
For OFDM/CCK, the rate kbps indicates the bitrate; for HT/VHT, mac80211
needs the above 5 parameters to calculate the bitrate, which iw then shows.
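For the OFDM/CCK case, the unit conversion implied by the "legacy" field (100 kbit/s units) can be sketched as follows. The function names are hypothetical and the sketch covers only the arithmetic, not the driver's rate-code parsing.

```python
def kbps_to_legacy(rate_kbps):
    """Convert a rate in kbit/s to the 100 kbit/s units of the legacy field."""
    return rate_kbps // 100


def legacy_to_mbit_string(legacy):
    """Format a legacy rate (100 kbit/s units) the way it reads in the iw output."""
    return f"{legacy // 10}.{legacy % 10} MBit/s"
```

For example, a reported 54000 kbit/s maps to a legacy value of 540, which formats as "54.0 MBit/s".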
After parsing the bitrate info, iw shows the correct rx bitrate:
localhost ~ # iw wlan0 link
rx bitrate: 234.0 MBit/s VHT-MCS 3 80MHz VHT-NSS 2
rx bitrate: 40.5 MBit/s MCS 2 40MHz
rx bitrate: 72.2 MBit/s MCS 7 short GI
rx bitrate: 54.0 MBit/s
rx bitrate: 48.0 MBit/s
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00042.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200427080416.8265-4-wgong@codeaurora.org
Wen Gong [Mon, 27 Apr 2020 08:04:14 +0000 (16:04 +0800)]
ath10k: add rx bitrate report for SDIO
For SDIO chips, the rx indication is struct htt_rx_indication_hl, which
does not include the bitrate info that PCIe has. PCIe uses the
function ath10k_htt_rx_h_rates to parse the bitrate info in struct
rx_ppdu_start and then reports it to mac80211 via ieee80211_rx_status.
SDIO does not have the same info as PCIe, so the iw command cannot get
the rx bitrate via "iw wlan0 station dump".
For example, it always shows 6.0 MBit/s:
localhost ~ # iw wlan0 link
Connected to 3c:28:6d:96:fd:69 (on wlan0)
SSID: kukui_test
freq: 5180
RX: 111800 bytes (595 packets)
TX: 35419 bytes (202 packets)
signal: -41 dBm
rx bitrate: 6.0 MBit/s
This patch sends WMI_TLV_REQUEST_PEER_STATS_INFO_CMDID to firmware
for ath10k_sta_statistics and saves the rx bitrate from the WMI event
WMI_TLV_PEER_STATS_INFO_EVENTID.
This patch only affects SDIO chips: ath10k_mac_sta_get_peer_stats_info
checks bitrate_statistics of hw_params, and this patch only enables
it for "qca6174 hw3.2 sdio".
Tested with QCA6174 SDIO firmware WLAN.RMH.4.4.1-00042.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200427080416.8265-3-wgong@codeaurora.org
Wen Gong [Mon, 27 Apr 2020 08:04:13 +0000 (16:04 +0800)]
ath10k: enable firmware peer stats info for wmi tlv
For the wmi tlv type, firmware disables peer stats info by default. After
enabling it, firmware will report WMI_TLV_PEER_STATS_INFO_EVENTID if
ath10k sends WMI_TLV_REQUEST_PEER_STATS_INFO_CMDID to firmware.
Enabling it only sets a flag in firmware; firmware will not report
it without receiving the request WMI command.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00042.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200427080416.8265-2-wgong@codeaurora.org
Jason Yan [Sun, 26 Apr 2020 09:40:37 +0000 (17:40 +0800)]
ath5k: remove conversion to bool in ath5k_ani_calibration()
The '>' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:
drivers/net/wireless/ath/ath5k/ani.c:504:56-61: WARNING: conversion to
bool not needed here
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200426094037.23048-1-yanaijie@huawei.com
Sergey Ryazanov [Fri, 24 Apr 2020 00:49:23 +0000 (03:49 +0300)]
ath9k: add calibration timeout for AR9002
ADC & I/Q calibrations could take infinite time to complete, since they
depend on received frames. In particular, the I/Q mismatch calibration
requires receiving OFDM frames for completion. But in the 2.4GHz
band, a station could receive only CCK frames for a very long time.
And while we wait for the completion of one of the mentioned
calibrations, the NF calibration is blocked. Moreover, in some
environments, I/Q calibration is unable to complete until a correct
noise calibration is performed, due to AGC behaviour.
In order to avoid delaying NF calibration forever, limit the maximum
duration of ADCs & I/Q calibrations. If the calibration is not completed
within the maximum time, it will be interrupted and the next calibration
will be performed. The code that selects the next calibration has been
reworked into a loop, so an incomplete calibration will be respun later.
A maximum calibration time of 30 seconds was selected to give the
calibration enough time to complete and to not interfere with the long
(NF) calibration.
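The timeout-and-rotate idea can be sketched as a small selection function. This is an illustrative model only: `next_calibration`, the list-of-dicts layout and the explicit time arguments are assumptions made for the sketch; only the 30 second limit comes from the text above.

```python
MAX_CAL_TIME_S = 30  # maximum calibration time named in the commit message


def next_calibration(cals, current, started_at, now):
    """Pick the calibration to run next.

    cals is a list of dicts with a boolean "done" key, current is the index
    of the running calibration, started_at and now are times in seconds.
    Returns (index, start_time), or (None, now) when every calibration is done.
    """
    still_running = not cals[current]["done"]
    if still_running and now - started_at < MAX_CAL_TIME_S:
        return current, started_at  # keep the current calibration going

    # Completed or timed out: walk the list in a loop, starting after the
    # current entry, so an interrupted calibration is retried on a later pass.
    count = len(cals)
    for step in range(1, count + 1):
        idx = (current + step) % count
        if not cals[idx]["done"]:
            return idx, now
    return None, now
```

Because the search wraps around, a calibration that timed out is picked again once the others have had their turn.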
Run tested with AR9220.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200424004923.17129-7-ryazanov.s.a@gmail.com
Sergey Ryazanov [Fri, 24 Apr 2020 00:49:22 +0000 (03:49 +0300)]
ath9k: invalidate all calibrations at once
Previously, after the calibration validity period was over,
calibrations were invalidated one at a time. So, for the AR9002
family, which has three calibrations, the full recalibration interval
becomes 3 x ATH_RESTART_CALINTERVAL, and each calibration is
separated by the ATH_RESTART_CALINTERVAL time from the previous one.
It seems better to do the whole recalibration at once. Also, this
change makes the driver behaviour a little simpler. So, invalidate all
calibrations at once at the end of the calibration validity interval.
This change affects only the AR9002 chip family, since the AR9003 utilizes
only a single calibration.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200424004923.17129-6-ryazanov.s.a@gmail.com
Sergey Ryazanov [Fri, 24 Apr 2020 00:49:21 +0000 (03:49 +0300)]
ath9k: interleaved NF calibration on AR9002
NF calibration and other elements of long calibration are usually faster
than ADCs & I/Q calibrations, since they do not depend on reception of the
OFDM signal. Moreover, sometimes I/Q calibration cannot be completed at
all without a preceding NF calibration. This is due to AGC, which has a
habit of blocking a weak signal without regular NF calibration. Thus, we do
not need to defer the long calibration forever.
So, if the long calibration is requested, then defer the ADCs & I/Q
calibration(s) and run the longcal (the NF calibration in particular) to
obtain fresh noise data.
Run tested with AR9220.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200424004923.17129-5-ryazanov.s.a@gmail.com
Sergey Ryazanov [Fri, 24 Apr 2020 00:49:20 +0000 (03:49 +0300)]
ath9k: do not miss longcal on AR9002
Each of AGC & I/Q calibrations can take a long time. Long calibration
and NF calibration in particular are forbidden from running in parallel
with ADC & I/Q calibrations. So, the chip might not be ready to perform the
long calibration at the time of request, and a request to perform the
long calibration may be lost.
In order to fix this, preserve the long calibration request as a
calibration state flag and restore the long calibration request each
time the calibration function is called again (i.e. on each subsequent
invocation of the short calibration).
This feature will be doubly useful after the next change, which will
make it possible to start the long calibration before all ADCs & I/Q
calibrations are completed.
Run tested with AR9220.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200424004923.17129-4-ryazanov.s.a@gmail.com
Sergey Ryazanov [Fri, 24 Apr 2020 00:49:19 +0000 (03:49 +0300)]
ath9k: remove needless NFCAL_PENDING flag setting
The NFCAL_PENDING flag is set by the ath9k_hw_start_nfcal() routine,
so there is no reason to set it manually after calling it during the
AR9002 calibrations initialization.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200424004923.17129-3-ryazanov.s.a@gmail.com
Sergey Ryazanov [Fri, 24 Apr 2020 00:49:18 +0000 (03:49 +0300)]
ath9k: fix AR9002 ADC and NF calibrations
ADC calibration is only required for an 80 MHz sampling rate (i.e. for
40 MHz channels), when the chip utilizes the pair of ADCs in interleaved
mode. Calibration on a 20 MHz channel will never be completed.
The previous channel check tries to exclude all channels where the
calibration will get stuck. It effectively blocks the calibration run
for HT20 channels, but fails to exclude 20 MHz channels without HT (e.g.
legacy mode channels).
Fix this issue by reworking the channel check to explicitly allow ADCs
gain & DC offset calibrations for HT40 channels only. Also update the
complicated comment to make it clear that these calibrations are for
multi-ADC mode only.
A stuck ADCs calibration blocks the NF calibration, which could make it
impossible to work in a noisy environment: too big Rx attenuation,
invalid RSSI value, etc. So this change is actually more of an NF
calibration fix rather than an ADC calibration fix.
Run tested with AR9220.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200424004923.17129-2-ryazanov.s.a@gmail.com
Wen Gong [Thu, 23 Apr 2020 02:41:34 +0000 (10:41 +0800)]
ath10k: add statistics of tx retries and tx failed when tx complete disable
When tx complete is disabled, every tx status is set to
HTT_TX_COMPL_STATE_ACK and indicated to mac80211 by ieee80211_tx_status,
so there are no statistics for retries and failed packets. The
counts of tx retries and tx failed in the output of "iw wlan0 station dump"
are both 0. If tx complete is not disabled, firmware reports the
tx status and ath10k indicates the status to mac80211, so mac80211
saves the statistics and "iw wlan0 station dump" shows them.
For example:
localhost ~ # iw dev wlan0 station dump
Station 3c:28:6d:96:fd:69 (on wlan0)
inactive time: 5 ms
rx bytes: 1325012
rx packets: 6477
tx bytes: 85264
tx packets: 518
tx retries: 0
tx failed: 0
This patch only affects chips with tx complete disabled, e.g. SDIO.
With this patch, the output of the command "iw dev wlan0 station dump" is:
Station c4:04:15:5d:97:22 (on wlan0)
inactive time: 608 ms
rx bytes: 180366
rx packets: 991
tx bytes: 98765577
tx packets: 64624
tx retries: 14682
tx failed: 47086
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00042.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200423024134.10601-1-wgong@codeaurora.org
Wen Gong [Thu, 23 Apr 2020 02:27:58 +0000 (10:27 +0800)]
ath10k: enable rx duration report default for wmi tlv
When running the command "iw dev wlan0 station dump", the rx duration is 0.
When firmware indicates WMI_UPDATE_STATS_EVENTID, the extended flag of
stats is not set by default, so firmware does not report rx duration.
One sample:
localhost # iw wlan0 station dump
Station c4:04:15:5d:97:22 (on wlan0)
inactive time: 48 ms
rx bytes: 21670
rx packets: 147
tx bytes: 11529
tx packets: 100
tx retries: 88
tx failed: 36
beacon loss: 1
beacon rx: 31
rx drop misc: 47
signal: -72 [-74, -75] dBm
signal avg: -71 [-74, -75] dBm
beacon signal avg: -71 dBm
tx bitrate: 54.0 MBit/s MCS 3 40MHz
rx bitrate: 1.0 MBit/s
rx duration: 0 us
This patch enables firmware's extended stats flag by setting the flag
WMI_TLV_STAT_PEER_EXTD of ar->fw_stats_req_mask, which is set in
ath10k_core_init_firmware_features, via WMI_REQUEST_STATS_CMDID.
After applying this patch, the command shows a value for rx duration:
Station c4:04:15:5d:97:22 (on wlan0)
inactive time: 883 ms
rx bytes: 44289
rx packets: 265
tx bytes: 10838
tx packets: 93
tx retries: 899
tx failed: 103
beacon loss: 0
beacon rx: 78
rx drop misc: 46
signal: -71 [-74, -76] dBm
signal avg: -70 [-74, -76] dBm
beacon signal avg: -70 dBm
tx bitrate: 54.0 MBit/s MCS 3 40MHz
rx bitrate: 1.0 MBit/s
rx duration: 358004 us
This patch has no side effects on any chip, because the function
ath10k_debug_fw_stats_request is already exposed through debugfs
"fw_stats" and WMI_REQUEST_STATS_CMDID is safely sent after the condition
is checked by ath10k_peer_stats_enabled in ath10k_sta_statistics.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00042.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200423022758.5365-1-wgong@codeaurora.org
Karthikeyan Periyasamy [Wed, 22 Apr 2020 10:46:18 +0000 (16:16 +0530)]
ath11k: fix reo flush send
We send the reo flush command for a deleted peer
tid after the ageout period reaches 1 second. This handling
causes the reo ring to fill up when more than 128 clients are
disconnected continuously. So add a count for the flush list
and trigger the reo flush command once the list count reaches
the threshold value, which is configured as 64 (half of the reo ring).
This avoids the situation where the reo ring gets full.
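The count-based trigger can be sketched as below. The `FlushList` class and its method are hypothetical, and the model covers only the list-count condition with the threshold of 64 from the text; ageout handling is left out.

```python
REO_FLUSH_THRESHOLD = 64  # half of the reo ring, as stated in the commit message


class FlushList:
    """Toy flush list that releases a batch once the count reaches the threshold."""

    def __init__(self, threshold=REO_FLUSH_THRESHOLD):
        self.threshold = threshold
        self.pending = []

    def add(self, tid):
        """Queue a deleted peer tid; return the batch to flush, or an empty list."""
        self.pending.append(tid)
        if len(self.pending) < self.threshold:
            return []
        batch, self.pending = self.pending, []
        return batch
```

Adding 63 entries returns nothing to flush; the 64th returns all 64 at once and empties the list, so the number of queued entries stays below the ring size.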
Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1587552378-4884-1-git-send-email-periyasa@codeaurora.org
Wen Gong [Wed, 22 Apr 2020 08:47:19 +0000 (16:47 +0800)]
ath10k: drop the TX packet which size exceed credit size for sdio
The sdio chip uses DMA buffers to receive TX packets from ath10k, and each
buffer has a size limit. If the packet size exceeds the credit size,
it will trigger an error in firmware.
Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWP-1.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200422084719.3479-1-wgong@codeaurora.org
Maharaja Kennadyrajan [Tue, 21 Apr 2020 18:58:32 +0000 (00:28 +0530)]
ath10k: Fix the invalid tx/rx chainmask configuration
The driver allows an invalid tx/rx chainmask configuration
(other than 1,3,7,15) to be set by the user. This causes a firmware
crash due to the invalid chainmask values.
Hence, reject invalid chainmask values in the driver by not
sending the pdev set command to the firmware.
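A minimal sketch of the validation, assuming the accepted set is exactly the four values listed above. `is_valid_chainmask`, `set_chainmask` and the `send` callback are hypothetical names, not the driver's functions.

```python
VALID_CHAINMASKS = (1, 3, 7, 15)  # values named in the commit message


def is_valid_chainmask(mask):
    """Return True only for the chainmask values accepted above."""
    return mask in VALID_CHAINMASKS


def set_chainmask(tx_mask, rx_mask, send):
    """Call send(tx_mask, rx_mask) only when both masks are valid.

    Returns True if the command was sent, False if it was rejected.
    """
    if not (is_valid_chainmask(tx_mask) and is_valid_chainmask(rx_mask)):
        return False  # rejected: nothing is sent to the firmware
    send(tx_mask, rx_mask)
    return True
```

Each valid value is a contiguous run of low bits, so a mask such as 5 or 6 is rejected before any command is issued.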
Tested hardware: QCA9888
Tested firmware: 10.4-3.10-00047
Signed-off-by: Maharaja Kennadyrajan <mkenna@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1587495512-29813-1-git-send-email-mkenna@codeaurora.org
Wen Gong [Tue, 21 Apr 2020 12:09:35 +0000 (15:09 +0300)]
ath10k: add flush tx packets for SDIO chip
When a station is connected to an AP and runs TX traffic such as TCP/UDP,
and the system enters the suspend state, mac80211 calls ath10k_flush with
the drop flag set. Currently ath10k only sends a wmi peer flush to
firmware, and firmware flushes all pending TX packets. For PCIe, firmware
indicates the TX packet status to ath10k, and ath10k then indicates TX
complete with that status to mac80211, so all the packets have been
flushed at this moment.
The SDIO chip is different: its TX complete indication is disabled by
default, it has a tx queue in ath10k, and its tx credit control is
enabled, with a total tx credit of 96. When the credit is not sufficient,
packets are buffered in the tx queue of ath10k, which holds at most
TARGET_TLV_NUM_MSDU_DESC_HL (1024) packets. So for SDIO, when mac80211
calls ath10k_flush with the drop flag set, there may be pending packets in
the tx queue of ath10k, and without sufficient tx credit the packets stay
in the queue until a tx credit report arrives from firmware. In a noisy
environment, tx speed is low and the tx credit reports from firmware are
delayed further, so num_pending_tx remains > 0 until all packets are sent
to firmware.
After the 1st ath10k_flush, mac80211 immediately calls ath10k_flush a 2nd
time without the drop flag set. That call reaches
ath10k_mac_wait_tx_complete, which waits until num_pending_tx becomes 0.
In a noisy environment, it is easy to wait nearly 5 seconds, which makes
suspend take a long time.
1st and 2nd callstack of ath10k_flush
[ 303.740427] ath10k_sdio mmc1:0001:1: ath10k_flush drop:1, pending:0-0
[ 303.740495] ------------[ cut here ]------------
[ 303.740739] WARNING: CPU: 1 PID: 3921 at /mnt/host/source/src/third_party/kernel/v4.19/drivers/net/wireless/ath/ath10k/mac.c:7025 ath10k_flush+0x54/0x104 [ath10k_core]
[ 303.740757] Modules linked in: bridge stp llc ath10k_sdio ath10k_core rfcomm uinput cros_ec_rpmsg mtk_seninf mtk_cam_isp mtk_vcodec_enc mtk_fd mtk_vcodec_dec mtk_vcodec_common mtk_dip mtk_mdp3 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem videobuf2_v4l2 videobuf2_common hid_google_hammer hci_uart btqca bluetooth dw9768 ov8856 ecdh_generic ov02a10 v4l2_fwnode mtk_scp mtk_rpmsg rpmsg_core mtk_scp_ipi ipt_MASQUERADE fuse iio_trig_sysfs cros_ec_sensors_ring cros_ec_sensors_sync cros_ec_light_prox cros_ec_sensors industrialio_triggered_buffer
[ 303.740914] kfifo_buf cros_ec_activity cros_ec_sensors_core lzo_rle lzo_compress ath mac80211 zram cfg80211 joydev [last unloaded: ath10k_core]
[ 303.741009] CPU: 1 PID: 3921 Comm: kworker/u16:10 Tainted: G W 4.19.95 #2
[ 303.741027] Hardware name: MediaTek krane sku176 board (DT)
[ 303.741061] Workqueue: events_unbound async_run_entry_fn
[ 303.741086] pstate: 60000005 (nZCv daif -PAN -UAO)
[ 303.741166] pc : ath10k_flush+0x54/0x104 [ath10k_core]
[ 303.741244] lr : ath10k_flush+0x54/0x104 [ath10k_core]
[ 303.741260] sp : ffffffdf080e77a0
[ 303.741276] x29: ffffffdf080e77a0 x28: ffffffdef3730040
[ 303.741300] x27: ffffff907c2240a0 x26: ffffffde6ff39afc
[ 303.741321] x25: ffffffdef3730040 x24: ffffff907bf61018
[ 303.741343] x23: ffffff907c2240a0 x22: ffffffde6ff39a50
[ 303.741364] x21: 0000000000000001 x20: ffffffde6ff39a50
[ 303.741385] x19: ffffffde6bac2420 x18: 0000000000017200
[ 303.741407] x17: ffffff907c24a000 x16: 0000000000000037
[ 303.741428] x15: ffffff907b49a568 x14: ffffff907cf332c1
[ 303.741476] x13: 00000000000922e4 x12: 0000000000000000
[ 303.741497] x11: 0000000000000001 x10: 0000000000000007
[ 303.741518] x9 : f2256b8c1de4bc00 x8 : f2256b8c1de4bc00
[ 303.741539] x7 : ffffff907ab5e764 x6 : 0000000000000000
[ 303.741560] x5 : 0000000000000080 x4 : 0000000000000001
[ 303.741582] x3 : ffffffdf080e74a8 x2 : ffffff907aa91244
[ 303.741603] x1 : ffffffdf080e74a8 x0 : 0000000000000024
[ 303.741624] Call trace:
[ 303.741701] ath10k_flush+0x54/0x104 [ath10k_core]
[ 303.741941] __ieee80211_flush_queues+0x1dc/0x358 [mac80211]
[ 303.742098] ieee80211_flush_queues+0x34/0x44 [mac80211]
[ 303.742253] ieee80211_set_disassoc+0xc0/0x5ec [mac80211]
[ 303.742399] ieee80211_mgd_deauth+0x720/0x7d4 [mac80211]
[ 303.742535] ieee80211_deauth+0x24/0x30 [mac80211]
[ 303.742720] cfg80211_mlme_deauth+0x250/0x3bc [cfg80211]
[ 303.742849] cfg80211_mlme_down+0x90/0xd0 [cfg80211]
[ 303.742971] cfg80211_disconnect+0x340/0x3a0 [cfg80211]
[ 303.743087] __cfg80211_leave+0xe4/0x17c [cfg80211]
[ 303.743203] cfg80211_leave+0x38/0x50 [cfg80211]
[ 303.743319] wiphy_suspend+0x84/0x5bc [cfg80211]
[ 303.743335] dpm_run_callback+0x170/0x304
[ 303.743346] __device_suspend+0x2dc/0x3e8
[ 303.743356] async_suspend+0x2c/0xb0
[ 303.743370] async_run_entry_fn+0x48/0xf8
[ 303.743383] process_one_work+0x304/0x604
[ 303.743394] worker_thread+0x248/0x3f4
[ 303.743403] kthread+0x120/0x130
[ 303.743416] ret_from_fork+0x10/0x18
[ 303.743812] ath10k_sdio mmc1:0001:1: ath10k_flush drop:0, pending:0-0
[ 303.743858] ------------[ cut here ]------------
[ 303.744057] WARNING: CPU: 1 PID: 3921 at /mnt/host/source/src/third_party/kernel/v4.19/drivers/net/wireless/ath/ath10k/mac.c:7025 ath10k_flush+0x54/0x104 [ath10k_core]
[ 303.744075] Modules linked in: bridge stp llc ath10k_sdio ath10k_core rfcomm uinput cros_ec_rpmsg mtk_seninf mtk_cam_isp mtk_vcodec_enc mtk_fd mtk_vcodec_dec mtk_vcodec_common mtk_dip mtk_mdp3 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem videobuf2_v4l2 videobuf2_common hid_google_hammer hci_uart btqca bluetooth dw9768 ov8856 ecdh_generic ov02a10 v4l2_fwnode mtk_scp mtk_rpmsg rpmsg_core mtk_scp_ipi ipt_MASQUERADE fuse iio_trig_sysfs cros_ec_sensors_ring cros_ec_sensors_sync cros_ec_light_prox cros_ec_sensors industrialio_triggered_buffer kfifo_buf cros_ec_activity cros_ec_sensors_core lzo_rle lzo_compress ath mac80211 zram cfg80211 joydev [last unloaded: ath10k_core]
[ 303.744256] CPU: 1 PID: 3921 Comm: kworker/u16:10 Tainted: G W 4.19.95 #2
[ 303.744273] Hardware name: MediaTek krane sku176 board (DT)
[ 303.744301] Workqueue: events_unbound async_run_entry_fn
[ 303.744325] pstate: 60000005 (nZCv daif -PAN -UAO)
[ 303.744403] pc : ath10k_flush+0x54/0x104 [ath10k_core]
[ 303.744480] lr : ath10k_flush+0x54/0x104 [ath10k_core]
[ 303.744496] sp : ffffffdf080e77a0
[ 303.744512] x29: ffffffdf080e77a0 x28: ffffffdef3730040
[ 303.744534] x27: ffffff907c2240a0 x26: ffffffde6ff39afc
[ 303.744556] x25: ffffffdef3730040 x24: ffffff907bf61018
[ 303.744577] x23: ffffff907c2240a0 x22: ffffffde6ff39a50
[ 303.744598] x21: 0000000000000000 x20: ffffffde6ff39a50
[ 303.744620] x19: ffffffde6bac2420 x18: 000000000001831c
[ 303.744641] x17: ffffff907c24a000 x16: 0000000000000037
[ 303.744662] x15: ffffff907b49a568 x14: ffffff907cf332c1
[ 303.744683] x13: 00000000000922ea x12: 0000000000000000
[ 303.744704] x11: 0000000000000001 x10: 0000000000000007
[ 303.744747] x9 : f2256b8c1de4bc00 x8 : f2256b8c1de4bc00
[ 303.744768] x7 : ffffff907ab5e764 x6 : 0000000000000000
[ 303.744789] x5 : 0000000000000080 x4 : 0000000000000001
[ 303.744810] x3 : ffffffdf080e74a8 x2 : ffffff907aa91244
[ 303.744831] x1 : ffffffdf080e74a8 x0 : 0000000000000024
[ 303.744853] Call trace:
[ 303.744929] ath10k_flush+0x54/0x104 [ath10k_core]
[ 303.745098] __ieee80211_flush_queues+0x1dc/0x358 [mac80211]
[ 303.745277] ieee80211_flush_queues+0x34/0x44 [mac80211]
[ 303.745424] ieee80211_set_disassoc+0x108/0x5ec [mac80211]
[ 303.745569] ieee80211_mgd_deauth+0x720/0x7d4 [mac80211]
[ 303.745706] ieee80211_deauth+0x24/0x30 [mac80211]
[ 303.745853] cfg80211_mlme_deauth+0x250/0x3bc [cfg80211]
[ 303.745979] cfg80211_mlme_down+0x90/0xd0 [cfg80211]
[ 303.746103] cfg80211_disconnect+0x340/0x3a0 [cfg80211]
[ 303.746219] __cfg80211_leave+0xe4/0x17c [cfg80211]
[ 303.746335] cfg80211_leave+0x38/0x50 [cfg80211]
[ 303.746452] wiphy_suspend+0x84/0x5bc [cfg80211]
[ 303.746467] dpm_run_callback+0x170/0x304
[ 303.746477] __device_suspend+0x2dc/0x3e8
[ 303.746487] async_suspend+0x2c/0xb0
[ 303.746498] async_run_entry_fn+0x48/0xf8
[ 303.746510] process_one_work+0x304/0x604
[ 303.746521] worker_thread+0x248/0x3f4
[ 303.746530] kthread+0x120/0x130
[ 303.746542] ret_from_fork+0x10/0x18
Debugging log from one sample, where the flush waited 3190 ms (5000 - 1810).
1st ath10k_flush, with 120 packets in the ath10k TX queue:
<...>-1513 [000] .... 25374.786005: ath10k_log_err: ath10k_sdio mmc1:0001:1 ath10k_flush drop:1, pending:120-0
<...>-1513 [000] ...1 25374.788375: ath10k_log_warn: ath10k_sdio mmc1:0001:1 ath10k_htt_tx_mgmt_inc_pending htt->num_pending_mgmt_tx:0
<...>-1500 [001] .... 25374.790143: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:121
2nd ath10k_flush, with 121 packets in the ath10k TX queue:
<...>-1513 [000] .... 25374.790571: ath10k_log_err: ath10k_sdio mmc1:0001:1 ath10k_flush drop:0, pending:121-0
<...>-1513 [000] .... 25374.791990: ath10k_log_err: ath10k_sdio mmc1:0001:1 ath10k_mac_wait_tx_complete state:1 pending:121-0
<...>-1508 [001] .... 25374.792696: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:46
<...>-1508 [001] .... 25374.792700: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:46
<...>-1508 [001] .... 25374.792729: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:121
<...>-1508 [001] .... 25374.792937: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:88, count:32, len:49792
<...>-1508 [001] .... 25374.793031: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:75, count:14, len:21784
kworker/u16:0-25773 [003] .... 25374.793701: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx complete, eid:1, pending complete count:46
<...>-1881 [000] .... 25375.073178: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:24
<...>-1881 [000] .... 25375.073182: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:24
<...>-1881 [000] .... 25375.073429: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:75
<...>-1879 [001] .... 25375.074090: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx complete, eid:1, pending complete count:24
<...>-1881 [000] .... 25375.074123: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:51, count:24, len:37344
<...>-1879 [001] .... 25375.270126: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:26
<...>-1879 [001] .... 25375.270130: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:26
<...>-1488 [000] .... 25375.270174: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:51
<...>-1488 [000] .... 25375.270529: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:25, count:26, len:40456
<...>-1879 [001] .... 25375.270693: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx complete, eid:1, pending complete count:26
<...>-1488 [001] .... 25377.775885: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:12
<...>-1488 [001] .... 25377.775890: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:12
<...>-1488 [001] .... 25377.775933: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:25
<...>-1488 [001] .... 25377.776059: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:13, count:12, len:18672
<...>-1879 [001] .... 25377.776100: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx complete, eid:1, pending complete count:12
<...>-1488 [001] .... 25377.878079: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:15
<...>-1488 [001] .... 25377.878087: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:15
<...>-1879 [000] .... 25377.878323: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:13
<...>-1879 [000] .... 25377.878487: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:0, count:13, len:20228
<...>-1879 [000] .... 25377.878497: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx complete, eid:1, pending complete count:13
<...>-1488 [001] .... 25377.919927: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:11
<...>-1488 [001] .... 25377.919932: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:13
<...>-1488 [001] .... 25377.919976: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:0
<...>-1881 [000] .... 25377.982645: ath10k_log_warn: ath10k_sdio mmc1:0001:1 HTT_T2H_MSG_TYPE_MGMT_TX_COMPLETION status:0
<...>-1513 [001] .... 25377.982973: ath10k_log_err: ath10k_sdio mmc1:0001:1 ath10k_mac_wait_tx_complete time_left:1810, pending:0-0
Flushing all pending TX packets in the 1st ath10k_flush reduces the wait
time of the 2nd ath10k_flush, so suspend takes less time.
This patch only affects SDIO chips.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00042.
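As a quick cross-check of the 3190 ms figure, the wait can be recomputed from the two ath10k_mac_wait_tx_complete timestamps in the trace above. This is only a sketch: the 5000 ms timeout is inferred from the "5000 - 1810" expression, treating time_left as milliseconds is an assumption, and all variable names are illustrative.

```python
# Timestamps in seconds, copied from the two ath10k_mac_wait_tx_complete
# lines of the trace (state:1 pending:121-0 and time_left:1810).
wait_start_s = 25374.791990
wait_end_s = 25377.982973

# Assumption: the flush timeout is 5000 ms and time_left is in milliseconds.
TIMEOUT_MS = 5000
time_left_ms = 1810

# Wait time derived from the timeout and the reported time_left.
waited_from_timeout_ms = TIMEOUT_MS - time_left_ms

# Wait time derived independently from the trace timestamps.
waited_from_trace_ms = (wait_end_s - wait_start_s) * 1000

print(waited_from_timeout_ms, round(waited_from_trace_ms, 3))
```

Both values should agree to within about a millisecond, which is consistent with the wait being dominated by the slow credit updates visible between 25375.27 and 25377.77.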
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200415233730.10581-1-wgong@codeaurora.org
Wen Gong [Tue, 21 Apr 2020 12:09:35 +0000 (15:09 +0300)]
ath10k: enable alt data of TX path for sdio
The default credit size is 1792 bytes, but the IP MTU is 1500 bytes, so
about 290 bytes are wasted for each data packet on the SDIO transfer path
for TX bundles, which reduces the transmission utilization ratio for data
packets.
This patch enables the small credit size in firmware. The firmware then
uses the new credit size of 1556 bytes, which increases the transmission
utilization ratio for data packets on the TX path and results in a
significant performance improvement on the TX path.
This patch only affects SDIO chips; it does not affect PCI, SNOC, etc.
Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWP-1.
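The arithmetic behind the credit sizes can be sketched as follows. The three sizes come from the commit message; the helper names are hypothetical, and the sketch assumes one MTU-sized packet per credit and ignores any protocol headers, so the percentages are only indicative.

```python
IP_MTU = 1500          # bytes, from the commit message
DEFAULT_CREDIT = 1792  # bytes, default credit size
SMALL_CREDIT = 1556    # bytes, credit size enabled by this patch


def wasted_bytes(credit_size, payload=IP_MTU):
    """Bytes of one credit left unused by one payload-sized packet."""
    return credit_size - payload


def utilization(credit_size, payload=IP_MTU):
    """Fraction of one credit that carries payload."""
    return payload / credit_size


for name, credit in (("default", DEFAULT_CREDIT), ("small", SMALL_CREDIT)):
    print(f"{name}: {wasted_bytes(credit)} bytes unused, "
          f"{utilization(credit):.1%} utilized")
```

Under these assumptions the unused space per packet drops from 292 to 56 bytes, and utilization rises from roughly 83.7% to roughly 96.4%.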
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200410061400.14231-3-wgong@codeaurora.org
Wen Gong [Tue, 21 Apr 2020 12:09:35 +0000 (15:09 +0300)]
ath10k: add htt TX bundle for sdio
The transmission utilization ratio of the SDIO bus is low for small
packets, because the space and time cost on the SDIO bus is the same for
large and small packets. As a result, the data rate for large packets is
higher than for small packets.
Test results for different data lengths:
data packet (bytes)   cost time (us)   calculated rate (Mbps)
256                   28               73
512                   33               124
1024                  35               234
1792                  45               318
14336                 168              682
28672                 333              688
57344                 660              695
This patch changes TX from sending single packets to sending one large
bundle packet, with a maximum bundle size of 32, which results in a
significant performance improvement on the TX path.
Also there's a fourth thread "ath10k_tx_complete_wq" added to ath10k as it
improves TCP RX throughput (values in Mbps):
                                TCP-RX  TCP-TX  UDP-RX  UDP-TX
use workqueue_tx_complete       423     357     448     412
change it to ar->workqueue      410     360     449     414
change it to ar->workqueue_aux  405     339     446     401
This patch only affects SDIO chips; it does not affect PCI, SNOC, etc.
It only enables bundling for SDIO chips.
Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWP-1.
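The "calculated rate" column of the packet-size table can be reproduced from the other two columns, since bits per microsecond equal Mbit/s. The sketch below assumes the published values were truncated rather than rounded (the 1792-byte row works out to about 318.6 and is listed as 318); the function name is illustrative.

```python
# (packet size in bytes, transfer time in microseconds), from the table
MEASUREMENTS = [
    (256, 28),
    (512, 33),
    (1024, 35),
    (1792, 45),
    (14336, 168),
    (28672, 333),
    (57344, 660),
]


def rate_mbps(size_bytes, time_us):
    """Bits per microsecond equal Mbit/s; truncate to a whole number."""
    return size_bytes * 8 // time_us


rates = [rate_mbps(size, time_us) for size, time_us in MEASUREMENTS]
for (size, time_us), rate in zip(MEASUREMENTS, rates):
    print(f"{size:>6} bytes in {time_us:>3} us -> {rate} Mbps")
```

The time per transfer grows far more slowly than the packet size up to a few kilobytes, which is why bundling many small packets into one transfer pays off.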
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200410061400.14231-2-wgong@codeaurora.org
Jason Yan [Mon, 20 Apr 2020 12:37:45 +0000 (20:37 +0800)]
ath11k: remove conversion to bool in ath11k_debug_fw_stats_process()
The '==' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:
drivers/net/wireless/ath/ath11k/debug.c:198:57-62: WARNING: conversion
to bool not needed here
drivers/net/wireless/ath/ath11k/debug.c:218:58-63: WARNING: conversion
to bool not needed here
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200420123745.4159-1-yanaijie@huawei.com
Jason Yan [Mon, 20 Apr 2020 12:37:18 +0000 (20:37 +0800)]
ath11k: remove conversion to bool in ath11k_dp_rxdesc_mpdu_valid()
The '==' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:
drivers/net/wireless/ath/ath11k/dp_rx.c:255:46-51: WARNING: conversion
to bool not needed here
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200420123718.3384-1-yanaijie@huawei.com
Kalle Valo [Thu, 16 Apr 2020 11:50:59 +0000 (14:50 +0300)]
ath10k: hif: make send_complete_check op optional
That way we don't need to have an empty function in sdio.c.
No functional changes, compile tested only.
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1587037859-28873-5-git-send-email-kvalo@codeaurora.org
Kalle Valo [Thu, 16 Apr 2020 11:50:58 +0000 (14:50 +0300)]
ath10k: sdio: remove _hif_ prefix from functions not part of hif interface
The _hif_ prefix should be used only on functions part of ath10k_hif_ops, so
remove it from functions which should not have it.
No functional changes, compile tested only.
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1587037859-28873-4-git-send-email-kvalo@codeaurora.org
Wen Gong [Thu, 16 Apr 2020 11:50:57 +0000 (14:50 +0300)]
ath10k: improve power save performance for sdio
This patch sets a register to allow the mbox to enter sleep state when
there is no TX traffic and to wake it up when TX traffic arrives.
After the mbox enters sleep state, the firmware puts the SoC into sleep
state as well, which saves power. Power consumption drops from about
90mW to about 10mW with this patch.
This patch only affects SDIO chips.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00029.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1587037859-28873-3-git-send-email-kvalo@codeaurora.org
Kalle Valo [Thu, 16 Apr 2020 11:50:56 +0000 (14:50 +0300)]
ath10k: rename ath10k_hif_swap_mailbox() to ath10k_hif_start_post()
Convert ath10k_hif_swap_mailbox() to a more generic op so that bus drivers can
do more than just swap the mailbox, for example set power save settings like in
the following sdio patch.
No functional changes, compile tested only.
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1587037859-28873-2-git-send-email-kvalo@codeaurora.org
Sriram R [Mon, 13 Apr 2020 12:57:02 +0000 (18:27 +0530)]
ath11k: Add dynamic tcl ring selection logic with retry mechanism
IPQ8074 HW supports three TCL rings for tx. Currently these rings
are mapped based on the access categories, viz. VO, VI, BE, BK.
If one traffic type dominates, it can stress the same TCL ring.
It would be better to use all the rings in a round-robin fashion,
irrespective of the traffic type, so that the load is evenly
distributed among all the rings.
Also, if the selected ring is busy or full, a retry mechanism
ensures that another available ring is selected without dropping
the packet.
On SMP systems, this change prevents a single CPU from getting hogged
when heavy traffic of the same category is transmitted.
The heavily used TCL ring generates more tx completion interrupts,
which causes the assigned CPU to get hogged.
Distributing tx packets across different TCL rings helps balance
this load.
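The selection scheme described above can be sketched as round-robin with retry. This is a minimal illustration of the idea, not the driver's code: the class, the method and the ring_is_full callback are hypothetical names, and only the ring count of three comes from the commit message.

```python
NUM_TCL_RINGS = 3  # IPQ8074 has three TCL rings, per the commit message


class TclRingSelector:
    """Pick TCL rings round-robin, skipping rings that are busy or full."""

    def __init__(self, num_rings=NUM_TCL_RINGS):
        self.num_rings = num_rings
        self._next = 0

    def select(self, ring_is_full):
        """Return a usable ring index, or None if every ring is full.

        ring_is_full is a caller-supplied predicate taking a ring index.
        """
        start = self._next
        # Advance the round-robin pointer regardless of traffic type.
        self._next = (self._next + 1) % self.num_rings
        # Retry on the following rings before giving up on the packet.
        for attempt in range(self.num_rings):
            ring = (start + attempt) % self.num_rings
            if not ring_is_full(ring):
                return ring
        return None


selector = TclRingSelector()
print([selector.select(lambda ring: False) for _ in range(4)])
```

With no ring full, consecutive packets cycle through rings 0, 1, 2 and back to 0, independent of access category; a packet is dropped only when all rings are full.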
Signed-off-by: Sriram R <srirrama@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586782622-22570-1-git-send-email-srirrama@codeaurora.org
Govindaraj Saminathan [Mon, 13 Apr 2020 11:21:12 +0000 (16:51 +0530)]
ath11k: cleanup reo command error code overwritten
Do not overwrite the error code. If no buffer is available, return
invalid. For other failures, return the error code of the actual failure.
Signed-off-by: Govindaraj Saminathan <gsamin@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586776872-25766-1-git-send-email-gsamin@codeaurora.org
Mamatha Telu [Sun, 12 Apr 2020 18:24:35 +0000 (23:54 +0530)]
ath10k: Fix typo in warning messages
Fix some typos:
s/fnrom/from/
s/pkgs/pkts/
s/AMSUs/AMSDUs/
Signed-off-by: Mamatha Telu <telumamatha36@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586715875-5182-1-git-send-email-telumamatha36@gmail.com
Maharaja Kennadyrajan [Fri, 10 Apr 2020 17:06:45 +0000 (22:36 +0530)]
ath11k: Fix rx_filter flags setting for per peer rx_stats
Rx_filter flags are set to the default filter flags during the
wifi up/down sequence even when the 'ext_rx_stats' debugfs entry
is set to 1. As a result, proper per-peer rx_stats are not
collected.
Fix this by setting the missing rx_filter when ext_rx_stats is
already enabled.
Signed-off-by: Maharaja Kennadyrajan <mkenna@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586538405-16226-3-git-send-email-mkenna@codeaurora.org
Ritesh Singh [Fri, 10 Apr 2020 17:06:44 +0000 (22:36 +0530)]
ath11k: Fix fw assert by setting proper vht cap
A FW assert happens in two cases: when a new station tries to
associate with the mu_bfee cap after a fixed vht-rate has been set,
or when a sta is already connected with the mu_bfee cap, the fixed
vht-rate is then set and the sta reconnects.
To avoid this, reset the MU_BEAMFORMEE bit in vht->caps
if mcs_index is invalid for nss 1.
Signed-off-by: Ritesh Singh <ritesi@codeaurora.org>
Signed-off-by: Maharaja Kennadyrajan <mkenna@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586538405-16226-2-git-send-email-mkenna@codeaurora.org
Maharaja Kennadyrajan [Fri, 10 Apr 2020 17:06:43 +0000 (22:36 +0530)]
ath11k: Cleanup in pdev destroy and mac register during crash on recovery
Debugfs pdev entries should be cleaned up during the crash
on recovery. If not, mac register will fail for the reason
that it is already registered during core reconfigure.
Signed-off-by: Maharaja Kennadyrajan <mkenna@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586538405-16226-1-git-send-email-mkenna@codeaurora.org
Kalle Valo [Tue, 14 Apr 2020 09:39:43 +0000 (12:39 +0300)]
Merge ath-next from git://git./linux/kernel/git/kvalo/ath.git
ath.git patches for v5.8. Major changes:
ath11k
* add debugfs file for testing ADDBA and DELBA
ath10k
* enable VHT160 and VHT80+80 modes
* enable radar detection in secondary segment
* sdio: disable TX complete indication to improve throughput
Manikanta Pubbisetty [Thu, 9 Apr 2020 08:43:17 +0000 (14:13 +0530)]
ath11k: rx path optimizations
During RX, accessing the reo dest ring descriptor directly consumes
a lot of CPU cycles. Copying the descriptor locally before accessing it
improves CPU usage by around 10-15% when measuring throughput
in RX DBTC test cases (all radios are involved in the throughput
measurement).
HW tested: IPQ8074
Signed-off-by: Manikanta Pubbisetty <mpubbise@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586421797-885-1-git-send-email-mpubbise@codeaurora.org
Manikanta Pubbisetty [Thu, 9 Apr 2020 08:30:13 +0000 (14:00 +0530)]
ath11k: set IRQ_DISABLE_UNLAZY flag for DP interrupts
Unlike CE interrupts, DP interrupts are not enabled/disabled at
source; they are enabled/disabled only at GIC level, therefore
it is required to set IRQ_DISABLE_UNLAZY flag to avoid spurious
interrupts.
Signed-off-by: Manikanta Pubbisetty <mpubbise@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586421013-23025-1-git-send-email-mpubbise@codeaurora.org
Aloka Dixit [Wed, 8 Apr 2020 17:41:17 +0000 (10:41 -0700)]
ath11k: Fix TWT radio count
TWT feature fails on radio2 because physical device count is
hardcoded to 2. Set value dynamically.
Signed-off-by: Aloka Dixit <alokad@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200408174117.22957-1-alokad@codeaurora.org
Karthikeyan Periyasamy [Wed, 8 Apr 2020 11:05:57 +0000 (16:35 +0530)]
ath11k: Modify the interrupt timer threshold
Set the interrupt timer threshold parameter to 256 to avoid a HW watchdog
in heavy multicast traffic scenarios.
Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586343957-21474-1-git-send-email-periyasa@codeaurora.org
Karthikeyan Periyasamy [Wed, 8 Apr 2020 11:03:15 +0000 (16:33 +0530)]
ath11k: fix duplication peer create on same radio
Add the pdev index information to the peer object to validate
peer creation. Ignore the peer creation request if the given
MAC address is already present in the peer list for the same radio.
Allowing peer creation in that scenario triggers a FW assert.
The scenario occurs in two cases when multiple AP VAPs are created on
the same radio:
1. when a testing tool sends association requests with the same
MAC address to two APs
2. when a station roams from one AP VAP to another AP VAP.
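The validation described above amounts to keying peers on the radio as well as the MAC address. The sketch below only illustrates that check; PeerTable and its method are hypothetical names and do not reflect the driver's actual data structures.

```python
class PeerTable:
    """Track peers per radio and reject duplicates on the same radio."""

    def __init__(self):
        # Each entry is a (pdev_idx, normalized MAC address) pair.
        self._peers = set()

    def create(self, pdev_idx, mac):
        """Return True if the peer was created, False if it was ignored."""
        key = (pdev_idx, mac.lower())
        if key in self._peers:
            # Same MAC already present on this radio: ignore the request.
            return False
        self._peers.add(key)
        return True


table = PeerTable()
print(table.create(0, "aa:bb:cc:dd:ee:ff"))  # first request on radio 0
print(table.create(0, "aa:bb:cc:dd:ee:ff"))  # duplicate on radio 0
print(table.create(1, "aa:bb:cc:dd:ee:ff"))  # same MAC, different radio
```

Only the second request is ignored; the same MAC address on a different radio is still accepted.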
Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1586343795-21422-1-git-send-email-periyasa@codeaurora.org
Linus Torvalds [Sun, 12 Apr 2020 19:35:55 +0000 (12:35 -0700)]
Linux 5.7-rc1
Linus Torvalds [Sun, 12 Apr 2020 18:04:58 +0000 (11:04 -0700)]
MAINTAINERS: sort field names for all entries
This sorts the actual field names too, potentially causing even more
chaos and confusion at merge time if you have edited the MAINTAINERS
file. But the end result is a more consistent layout, and hopefully
it's a one-time pain minimized by doing this just before the -rc1
release.
This was entirely scripted:
./scripts/parse-maintainers.pl --input=MAINTAINERS --output=MAINTAINERS --order
Requested-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 12 Apr 2020 18:03:52 +0000 (11:03 -0700)]
MAINTAINERS: sort entries by entry name
They are all supposed to be sorted, but people who add new entries don't
always know the alphabet. Plus sometimes the entry names get edited,
and people don't then re-order the entry.
Let's see how painful this will be for merging purposes (the MAINTAINERS
file is often edited in various different trees), but Joe claims there's
relatively few patches in -next that touch this, and doing it just
before -rc1 is likely the best time. Fingers crossed.
This was scripted with
./scripts/parse-maintainers.pl --input=MAINTAINERS --output=MAINTAINERS
but then I also ended up manually upper-casing a few entry names that
stood out when looking at the end result.
Requested-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 12 Apr 2020 17:17:16 +0000 (10:17 -0700)]
Merge tag 'x86-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of three patches to fix the fallout of the newly added split
lock detection feature.
It addresses the case where a KVM guest triggers a split lock #AC and
KVM reinjects it into the guest which is not prepared to handle it.
Add proper sanity checks which prevent the unconditional injection
into the guest and handle the #AC on the host side in the same way as
user space detections are handled. Depending on the detection mode it
either warns and disables detection for the task or kills the task if
the mode is set to fatal"
* tag 'x86-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest
KVM: x86: Emulate split-lock access as a write in emulator
x86/split_lock: Provide handle_guest_split_lock()
Linus Torvalds [Sun, 12 Apr 2020 17:13:14 +0000 (10:13 -0700)]
Merge tag 'timers-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip
Pull time(keeping) updates from Thomas Gleixner:
- Fix the time_for_children symlink in /proc/$PID/ so it properly
reflects that it is part of the 'time' namespace
- Add the missing userns limit for the allowed number of time
namespaces, which was half defined but the actual array member was
not added. This went unnoticed as the array has an excessive empty
member at the end but introduced a user visible regression as the
output was corrupted.
- Prevent further silent ucount corruption by adding a BUILD_BUG_ON()
to catch half updated data.
* tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ucount: Make sure ucounts in /proc/sys/user don't regress again
time/namespace: Add max_time_namespaces ucount
time/namespace: Fix time_for_children symlink
Linus Torvalds [Sun, 12 Apr 2020 17:09:19 +0000 (10:09 -0700)]
Merge tag 'sched-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes/updates from Thomas Gleixner:
- Deduplicate the average computations in the scheduler core and the
fair class code.
- Fix a race between runtime distribution and assignment which can
cause exceeding the quota by up to 70%.
- Prevent negative results in the imbalance calculation
- Remove a stale warning in the workqueue code which can be triggered
since the call site was moved out of preempt disabled code. It's a
false positive.
- Deduplicate the print macros for procfs
- Add the uclamp values to the SCHED_DEBUG procfs output for completeness
* tag 'sched-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/debug: Add task uclamp values to SCHED_DEBUG procfs
sched/debug: Factor out printing formats into common macros
sched/debug: Remove redundant macro define
sched/core: Remove unused rq::last_load_update_tick
workqueue: Remove the warning in wq_worker_sleeping()
sched/fair: Fix negative imbalance in imbalance calculation
sched/fair: Fix race between runtime distribution and assignment
sched/fair: Align rq->avg_idle and rq->avg_scan_cost
Linus Torvalds [Sun, 12 Apr 2020 17:05:24 +0000 (10:05 -0700)]
Merge tag 'perf-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"Three fixes/updates for perf:
- Fix the perf event cgroup tracking which tries to track the cgroup
even for disabled events.
- Add Ice Lake server support for uncore events
- Disable pagefaults when retrieving the physical address in the
sampling code"
* tag 'perf-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Disable page faults when getting phys address
perf/x86/intel/uncore: Add Ice Lake server uncore support
perf/cgroup: Correct indirection in perf_less_group_idx()
perf/core: Fix event cgroup tracking
Linus Torvalds [Sun, 12 Apr 2020 16:47:10 +0000 (09:47 -0700)]
Merge tag 'locking-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"Three small fixes/updates for the locking core code:
- Plug a task struct reference leak in the percpu rwsem
implementation.
- Document the refcount interaction with PID_MAX_LIMIT
- Improve the 'invalid wait context' data dump in lockdep so it
contains all information which is required to decode the problem"
* tag 'locking-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Improve 'invalid wait context' splat
locking/refcount: Document interaction with PID_MAX_LIMIT
locking/percpu-rwsem: Fix a task_struct refcount
Linus Torvalds [Sun, 12 Apr 2020 16:41:01 +0000 (09:41 -0700)]
Merge tag '5.7-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Ten cifs/smb fixes:
- five RDMA (smbdirect) related fixes
- add experimental support for swap over SMB3 mounts
- also a fix which improves performance of signed connections"
* tag '5.7-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
smb3: enable swap on SMB3 mounts
smb3: change noisy error message to FYI
smb3: smbdirect support can be configured by default
cifs: smbd: Do not schedule work to send immediate packet on every receive
cifs: smbd: Properly process errors on ib_post_send
cifs: Allocate crypto structures on the fly for calculating signatures of incoming packets
cifs: smbd: Update receive credits before sending and deal with credits roll back on failure before sending
cifs: smbd: Check send queue size before posting a send
cifs: smbd: Merge code to track pending packets
cifs: ignore cached share root handle closing errors
Linus Torvalds [Sun, 12 Apr 2020 16:39:47 +0000 (09:39 -0700)]
Merge tag 'nfs-for-5.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfix from Trond Myklebust:
"Fix an RCU read lock leakage in pnfs_alloc_ds_commits_list()"
* tag 'nfs-for-5.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
pNFS: Fix RCU lock leakage
Linus Torvalds [Sat, 11 Apr 2020 18:38:44 +0000 (11:38 -0700)]
Merge tag 'nios2-v5.7-rc1' of git://git./linux/kernel/git/lftan/nios2
Pull nios2 updates from Ley Foon Tan:
- Remove nios2-dev@lists.rocketboards.org from MAINTAINERS
- remove 'resetvalue' property
- rename 'altr,gpio-bank-width' -> 'altr,ngpio'
- enable the common clk subsystem on Nios2
* tag 'nios2-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
MAINTAINERS: Remove nios2-dev@lists.rocketboards.org
arch: nios2: remove 'resetvalue' property
arch: nios2: rename 'altr,gpio-bank-width' -> 'altr,ngpio'
arch: nios2: Enable the common clk subsystem on Nios2
Linus Torvalds [Sat, 11 Apr 2020 18:34:36 +0000 (11:34 -0700)]
Merge tag 'dma-mapping-5.7-1' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
- fix an integer truncation in dma_direct_get_required_mask
(Kishon Vijay Abraham)
- fix the display of dma mapping types (Grygorii Strashko)
* tag 'dma-mapping-5.7-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-debug: fix displaying of dma allocation type
dma-direct: fix data truncation in dma_direct_get_required_mask()
Linus Torvalds [Sat, 11 Apr 2020 16:46:12 +0000 (09:46 -0700)]
Merge tag 'kbuild-v5.7-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:
- raise minimum supported binutils version to 2.23
- remove old CONFIG_AS_* macros that we know binutils >= 2.23 supports
- move remaining CONFIG_AS_* tests to Kconfig from Makefile
- enable -Wtautological-compare warnings to catch more issues
- do not support GCC plugins for GCC <= 4.7
- fix various breakages of 'make xconfig'
- include the linker version used for linking the kernel into
LINUX_COMPILER, which is used for the banner, and also exposed to
/proc/version
- link lib-y objects to vmlinux forcibly when CONFIG_MODULES=y, which
allows us to remove the lib-ksyms.o workaround, and to solve the last
known issue of the LLVM linker
- add dummy tools in scripts/dummy-tools/ to enable all compiler tests
in Kconfig, which will be useful for distro maintainers
- support the single switch, LLVM=1 to use Clang and all LLVM utilities
instead of GCC and Binutils.
- support LLVM_IAS=1 to enable the integrated assembler, which is still
experimental
* tag 'kbuild-v5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (36 commits)
kbuild: fix comment about missing include guard detection
kbuild: support LLVM=1 to switch the default tools to Clang/LLVM
kbuild: replace AS=clang with LLVM_IAS=1
kbuild: add dummy toolchains to enable all cc-option etc. in Kconfig
kbuild: link lib-y objects to vmlinux forcibly when CONFIG_MODULES=y
MIPS: fw: arc: add __weak to prom_meminit and prom_free_prom_memory
kbuild: remove -I$(srctree)/tools/include from scripts/Makefile
kbuild: do not pass $(KBUILD_CFLAGS) to scripts/mkcompile_h
Documentation/llvm: fix the name of llvm-size
kbuild: mkcompile_h: Include $LD version in /proc/version
kconfig: qconf: Fix a few alignment issues
kconfig: qconf: remove some old bogus TODOs
kconfig: qconf: fix support for the split view mode
kconfig: qconf: fix the content of the main widget
kconfig: qconf: Change title for the item window
kconfig: qconf: clean deprecated warnings
gcc-plugins: drop support for GCC <= 4.7
kbuild: Enable -Wtautological-compare
x86: update AS_* macros to binutils >=2.23, supporting ADX and AVX2
crypto: x86 - clean up poly1305-x86_64-cryptogams.S by 'make clean'
...
Sedat Dilek [Sat, 11 Apr 2020 13:29:43 +0000 (15:29 +0200)]
mailmap: Add Sedat Dilek (replacement for expired email address)
I no longer work for credativ Germany.
Please, use my private email address instead.
This is for the case when people want to CC me on
patches sent from my old business email address.
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Trond Myklebust [Sat, 11 Apr 2020 15:37:18 +0000 (11:37 -0400)]
pNFS: Fix RCU lock leakage
Another brown paper bag moment. pnfs_alloc_ds_commits_list() is leaking
the RCU lock.
Fixes: a9901899b649 ("pNFS: Add infrastructure for cleaning up per-layout commit structures")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Xiaoyao Li [Fri, 10 Apr 2020 11:54:02 +0000 (13:54 +0200)]
KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest
Two types of #AC can be generated in Intel CPUs:
1. legacy alignment check #AC
2. split lock #AC
Reflect #AC back into the guest if the guest has legacy alignment checks
enabled or if split lock detection is disabled.
If the #AC is not a legacy one and split lock detection is enabled, then
invoke handle_guest_split_lock() which will either warn and disable split
lock detection for this task or force SIGBUS on it.
[ tglx: Switch it to handle_guest_split_lock() and rename the misnamed
helper function. ]
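The dispatch described above can be summarized as a small decision function. This is a sketch of the stated rule only: the two booleans stand in for state the real interceptor reads from the guest and the host, and the function name and return strings are illustrative.

```python
def route_guest_ac(legacy_alignment_check_enabled, split_lock_detection_enabled):
    """Decide where an #AC raised while running a guest is handled."""
    if legacy_alignment_check_enabled or not split_lock_detection_enabled:
        # The guest either asked for alignment checks or split lock
        # detection is off, so the #AC is reflected back into the guest.
        return "reflect_into_guest"
    # Otherwise it is a split lock #AC and the host handles it.
    return "handle_guest_split_lock"


for legacy in (False, True):
    for sld in (False, True):
        print(legacy, sld, route_guest_ac(legacy, sld))
```

Only one of the four combinations, no legacy alignment checks with split lock detection enabled, reaches the host-side handler.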
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115517.176308876@linutronix.de
Xiaoyao Li [Fri, 10 Apr 2020 11:54:01 +0000 (13:54 +0200)]
KVM: x86: Emulate split-lock access as a write in emulator
Emulate split-lock accesses as writes if split lock detection is on
to avoid #AC during emulation, which will result in a panic(). This
should never occur for a well-behaved guest, but a malicious guest can
manipulate the TLB to trigger emulation of a locked instruction[1].
More discussion can be found at [2][3].
[1] https://lkml.kernel.org/r/8c5b11c9-58df-38e7-a514-dc12d687b198@redhat.com
[2] https://lkml.kernel.org/r/20200131200134.GD18946@linux.intel.com
[3] https://lkml.kernel.org/r/20200227001117.GX9940@linux.intel.com
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115517.084300242@linutronix.de
Thomas Gleixner [Fri, 10 Apr 2020 11:54:00 +0000 (13:54 +0200)]
x86/split_lock: Provide handle_guest_split_lock()
Without at least minimal handling for split lock detection induced #AC,
VMX will just run into the same problem as the VMWare hypervisor, which
was reported by Kenneth.
It will inject the #AC blindly into the guest whether the guest is
prepared or not.
Provide a function for guest mode which acts depending on the host
SLD mode. If mode == sld_warn, treat it like user space, i.e. emit a
warning, disable SLD and mark the task accordingly. Otherwise force
SIGBUS.
[ bp: Add a !CPU_SUP_INTEL stub for handle_guest_split_lock(). ]
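The mode-dependent behaviour of the guest handler can be sketched as below. Only sld_warn is named in the message; the FATAL member, the enum itself and the action strings are illustrative assumptions rather than kernel identifiers.

```python
from enum import Enum


class SldMode(Enum):
    WARN = "sld_warn"
    FATAL = "sld_fatal"  # assumed name for the non-warn mode


def guest_split_lock_actions(mode):
    """List the actions taken for a guest split lock #AC in a given mode."""
    if mode is SldMode.WARN:
        # Treated like a user space detection.
        return ["emit_warning", "disable_sld", "mark_task"]
    # Any other mode: the task is forced to take SIGBUS.
    return ["force_sigbus"]


for mode in SldMode:
    print(mode.name, guest_split_lock_actions(mode))
```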
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115516.978037132@linutronix.de
Link: https://lkml.kernel.org/r/20200402123258.895628824@linutronix.de
Masahiro Yamada [Wed, 8 Apr 2020 18:29:19 +0000 (03:29 +0900)]
kbuild: fix comment about missing include guard detection
The keyword here is 'twice' to explain the trick.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Linus Torvalds [Sat, 11 Apr 2020 00:57:48 +0000 (17:57 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
- Almost all of the rest of MM (memcg, slab-generic, slab, pagealloc,
gup, hugetlb, pagemap, memremap)
- Various other things (hfs, ocfs2, kmod, misc, seqfile)
* akpm: (34 commits)
ipc/util.c: sysvipc_find_ipc() should increase position index
kernel/gcov/fs.c: gcov_seq_next() should increase position index
fs/seq_file.c: seq_read(): add info message about buggy .next functions
drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings
change email address for Pali Rohár
selftests: kmod: test disabling module autoloading
selftests: kmod: fix handling test numbers above 9
docs: admin-guide: document the kernel.modprobe sysctl
fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
kmod: make request_module() return an error when autoloading is disabled
mm/memremap: set caching mode for PCI P2PDMA memory to UC
mm/memory_hotplug: add pgprot_t to mhp_params
powerpc/mm: thread pgprot_t through create_section_mapping()
x86/mm: introduce __set_memory_prot()
x86/mm: thread pgprot_t through init_memory_mapping()
mm/memory_hotplug: rename mhp_restrictions to mhp_params
mm/memory_hotplug: drop the flags field from struct mhp_restrictions
mm/special: create generic fallbacks for pte_special() and pte_mkspecial()
mm/vma: introduce VM_ACCESS_FLAGS
mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
...
Linus Torvalds [Sat, 11 Apr 2020 00:53:43 +0000 (17:53 -0700)]
Merge tag 'docs-5.7-2' of git://git.lwn.net/linux
Pull Documentation fixes from Jonathan Corbet:
"A handful of late-arriving fixes for the documentation tree"
* tag 'docs-5.7-2' of git://git.lwn.net/linux:
Documentation: android: binderfs: add 'stats' mount option
Documentation: driver-api/usb/writing_usb_driver.rst Updates documentation links
docs: driver-api: address duplicate label warning
Documentation: sysrq: fix RST formatting
docs: kernel-parameters.txt: Fix broken references
docs: kernel-parameters.txt: Remove nompx
docs: filesystems: fix typo in qnx6.rst
Linus Torvalds [Sat, 11 Apr 2020 00:50:01 +0000 (17:50 -0700)]
Merge tag 'for-linus-5.7-ofs1' of git://git./linux/kernel/git/hubcap/linux
Pull orangefs updates from Mike Marshall:
"A fix and two cleanups.
Fix:
- Christoph Hellwig noticed that some logic I added to
orangefs_file_read_iter introduced a race condition, so he sent a
reversion patch. I had to modify his patch since reverting at this
point broke Orangefs.
Cleanups:
- Christoph Hellwig noticed that we were doing some unnecessary work
in orangefs_flush, so he sent in a patch that removed the un-needed
code.
- Al Viro told me he had trouble building Orangefs. Orangefs should
be easy to build, even for Al :-).
I looked back at the test server build notes in orangefs.txt, just
in case that's where the trouble really is, and found a couple of
typos and made a couple of clarifications"
* tag 'for-linus-5.7-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: clarify build steps for test server in orangefs.txt
orangefs: don't mess with I_DIRTY_TIMES in orangefs_flush
orangefs: get rid of knob code...
Linus Torvalds [Sat, 11 Apr 2020 00:39:20 +0000 (17:39 -0700)]
Merge tag 'xtensa-20200410' of git://github.com/jcmvbkbc/linux-xtensa
Pull xtensa updates from Max Filippov:
- replace setup_irq() by request_irq()
- cosmetic fixes in xtensa Kconfig and boot/Makefile
* tag 'xtensa-20200410' of git://github.com/jcmvbkbc/linux-xtensa:
arch/xtensa: fix grammar in Kconfig help text
xtensa: remove meaningless export ccflags-y
xtensa: replace setup_irq() by request_irq()
Linus Torvalds [Sat, 11 Apr 2020 00:20:06 +0000 (17:20 -0700)]
Merge tag 'for-linus-5.7-rc1b-tag' of git://git./linux/kernel/git/xen/tip
Pull more xen updates from Juergen Gross:
- two cleanups
- fix a boot regression introduced in this merge window
- fix wrong use of memory allocation flags
* tag 'for-linus-5.7-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: fix booting 32-bit pv guest
x86/xen: make xen_pvmmu_arch_setup() static
xen/blkfront: fix memory allocation flags in blkfront_setup_indirect()
xen: Use evtchn_type_t as a type for event channels
Vasily Averin [Fri, 10 Apr 2020 21:34:13 +0000 (14:34 -0700)]
ipc/util.c: sysvipc_find_ipc() should increase position index
If a seq_file .next function does not change the position index, a read
after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Waiman Long <longman@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: NeilBrown <neilb@suse.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/b7a20945-e315-8bb0-21e6-3875c14a8494@virtuozzo.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vasily Averin [Fri, 10 Apr 2020 21:34:10 +0000 (14:34 -0700)]
kernel/gcov/fs.c: gcov_seq_next() should increase position index
If a seq_file .next function does not change the position index, a read
after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: NeilBrown <neilb@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Waiman Long <longman@redhat.com>
Link: http://lkml.kernel.org/r/f65c6ee7-bd00-f910-2f8a-37cc67e4ff88@virtuozzo.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vasily Averin [Fri, 10 Apr 2020 21:34:06 +0000 (14:34 -0700)]
fs/seq_file.c: seq_read(): add info message about buggy .next functions
Patch series "seq_file .next functions should increase position index".
In Aug 2018 NeilBrown noticed commit 1f4aace60b0e ("fs/seq_file.c:
simplify seq_file iteration code and interface")
"Some ->next functions do not increment *pos when they return NULL...
Note that such ->next functions are buggy and should be fixed. A simple
demonstration is dd if=/proc/swaps bs=1000 skip=1 Choose any block size
larger than the size of /proc/swaps. This will always show the whole
last line of /proc/swaps"
The described problem still exists. If you lseek into the middle of the
last output line, the following read will output the end of the last line
and then the whole last line once again.
$ dd if=/proc/swaps bs=1 # usual output
Filename Type Size Used Priority
/dev/dm-0 partition 4194812 97536 -2
104+0 records in
104+0 records out
104 bytes copied
$ dd if=/proc/swaps bs=40 skip=1 # last line was generated twice
dd: /proc/swaps: cannot skip to specified offset
v/dm-0 partition 4194812 97536 -2
/dev/dm-0 partition 4194812 97536 -2
3+1 records in
3+1 records out
131 bytes copied
There are a lot of other affected files; I've found 30+, including
/proc/net/ip_tables_matches and /proc/sysvipc/*.
I've already sent patches to the mailing lists of the affected subsystems;
this patch-set fixes the problem in files related to pstore, tracing, gcov,
sysvipc and other subsystems processed via the linux-kernel@ mailing list
directly.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
This patch (of 4):
Add debug code to seq_read() to detect missed or out-of-tree incorrect
.next seq_file functions.
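
As a rough user-space illustration of the rule and of the kind of check
being added, consider the sketch below. It does not use the real seq_file
API: the item array, next_good(), next_buggy() and next_updated_pos() are
all hypothetical names, and the check only models the idea of comparing
the position index before and after the callback.

```c
#include <stdbool.h>
#include <stddef.h>

#define N_ITEMS 3
static const int items[N_ITEMS] = { 10, 20, 30 };

/* Correct: advance *pos on every call, including the one that hits the end. */
static const int *next_good(long *pos)
{
    ++*pos;
    return (*pos < N_ITEMS) ? &items[*pos] : NULL;
}

/* Buggy: leaves *pos untouched when there is nothing left to return. */
static const int *next_buggy(long *pos)
{
    if (*pos + 1 >= N_ITEMS)
        return NULL;
    ++*pos;
    return &items[*pos];
}

/* Model of the check: did the callback move the position index at all? */
static bool next_updated_pos(const int *(*next)(long *), long start)
{
    long pos = start;

    next(&pos);
    return pos != start;
}
```

In this model, calling the buggy variant at the last element leaves the
index unchanged, which is the condition the debug message is meant to
report.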
[akpm@linux-foundation.org: s/pr_info/pr_info_ratelimited/, per Qian Cai]
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: NeilBrown <neilb@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Waiman Long <longman@redhat.com>
Link: http://lkml.kernel.org/r/244674e5-760c-86bd-d08a-047042881748@virtuozzo.com
Link: http://lkml.kernel.org/r/7c24087c-e280-e580-5b0c-0cdaeb14cd18@virtuozzo.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kbuild test robot [Fri, 10 Apr 2020 21:34:03 +0000 (14:34 -0700)]
drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings
Remove dev_err() messages after platform_get_irq*() failures.
platform_get_irq() already prints an error.
Generated by: scripts/coccinelle/api/platform_get_irq.cocci
Fixes: 6c41ac96ad92 ("dmaengine: tegra-apb: Support COMPILE_TEST")
Signed-off-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Cc: Laxman Dewangan <ldewangan@nvidia.com>
Cc: Vinod Koul <vinod.koul@linux.intel.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Jon Hunter <jonathanh@nvidia.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.2002271133450.2973@hadrien
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pali Rohár [Fri, 10 Apr 2020 21:34:00 +0000 (14:34 -0700)]
change email address for Pali Rohár
For security reasons I stopped using my gmail account, and my kernel.org
address is now an up-to-date alias to my personal address.
People periodically send me emails to an address which they found in the
source code of drivers, so this change reflects the state where people can
contact me.
[ Added .mailmap entry as per Joe Perches - Linus ]
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/20200307104237.8199-1-pali@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Fri, 10 Apr 2020 21:33:57 +0000 (14:33 -0700)]
selftests: kmod: test disabling module autoloading
Test that request_module() fails with -ENOENT when
/proc/sys/kernel/modprobe contains (a) a nonexistent path, and (b) an
empty path.
Case (b) is a regression test for the patch "kmod: make request_module()
return an error when autoloading is disabled".
Tested with 'kmod.sh -t 0010 && kmod.sh -t 0011', and also simply with
'kmod.sh' to run all kmod tests.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20200312202552.241885-5-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Fri, 10 Apr 2020 21:33:53 +0000 (14:33 -0700)]
selftests: kmod: fix handling test numbers above 9
get_test_count() and get_test_enabled() were broken for test numbers
above 9 due to awk interpreting a field specification like '$0010' as
octal rather than decimal. Fix it by stripping the leading zeroes.
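
The same class of pitfall exists outside awk. As an analogy only (this is
not the selftest's shell code), C's strtol() with base 0 also treats a
leading zero as an octal prefix, and stripping the zeroes first restores
the decimal reading. The helper names below are hypothetical.

```c
#include <stdlib.h>

/* Skip leading '0' characters but keep the last digit, so "0000" -> "0". */
static const char *strip_leading_zeroes(const char *s)
{
    while (s[0] == '0' && s[1] != '\0')
        s++;
    return s;
}

/* Base 0 lets strtol() pick the base from the prefix: "0..." means octal. */
static long parse_auto_base(const char *s)
{
    return strtol(s, NULL, 0);
}
```

With these helpers, parse_auto_base("0010") yields 8, while
parse_auto_base(strip_leading_zeroes("0010")) yields 10.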
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20200318230515.171692-5-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Fri, 10 Apr 2020 21:33:50 +0000 (14:33 -0700)]
docs: admin-guide: document the kernel.modprobe sysctl
Document the kernel.modprobe sysctl in the same place that all the other
kernel.* sysctls are documented. Make sure to mention how to use this
sysctl to completely disable module autoloading, and how this sysctl
relates to CONFIG_STATIC_USERMODEHELPER.
[ebiggers@google.com: v5]
Link: http://lkml.kernel.org/r/20200318230515.171692-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20200312202552.241885-4-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Fri, 10 Apr 2020 21:33:47 +0000 (14:33 -0700)]
fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
After request_module(), nothing is stopping the module from being
unloaded until someone takes a reference to it via try_module_get().
The WARN_ONCE() in get_fs_type() is thus user-reachable, via userspace
running 'rmmod' concurrently.
Since WARN_ONCE() is for kernel bugs only, not for user-reachable
situations, downgrade this warning to pr_warn_once().
Keep it printed once only, since the intent of this warning is to detect
a bug in modprobe at boot time. Printing the warning more than once
wouldn't really provide any useful extra information.
Fixes: 41124db869b7 ("fs: warn in case userspace lied about modprobe return")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jessica Yu <jeyu@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: NeilBrown <neilb@suse.com>
Cc: <stable@vger.kernel.org> [4.13+]
Link: http://lkml.kernel.org/r/20200312202552.241885-3-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Fri, 10 Apr 2020 21:33:43 +0000 (14:33 -0700)]
kmod: make request_module() return an error when autoloading is disabled
Patch series "module autoloading fixes and cleanups", v5.
This series fixes a bug where request_module() was reporting success to
kernel code when module autoloading had been completely disabled via
'echo > /proc/sys/kernel/modprobe'.
It also addresses the issues raised on the original thread
(https://lkml.kernel.org/lkml/20200310223731.126894-1-ebiggers@kernel.org/T/#u)
by documenting the modprobe sysctl, adding a self-test for the empty path
case, and downgrading a user-reachable WARN_ONCE().
This patch (of 4):
It's long been possible to disable kernel module autoloading completely
(while still allowing manual module insertion) by setting
/proc/sys/kernel/modprobe to the empty string.
This can be preferable to setting it to a nonexistent file since it
avoids the overhead of an attempted execve(), avoids potential
deadlocks, and avoids the call to security_kernel_module_request() and
thus on SELinux-based systems eliminates the need to write SELinux rules
to dontaudit module_request.
However, when module autoloading is disabled in this way,
request_module() returns 0. This is broken because callers expect 0 to
mean that the module was successfully loaded.
Apparently this was never noticed because this method of disabling
module autoloading isn't used much, and also most callers don't use the
return value of request_module() since it's always necessary to check
whether the module registered its functionality or not anyway.
But improperly returning 0 can indeed confuse a few callers, for example
get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit:
if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
fs = __get_fs_type(name, len);
WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
}
This is easily reproduced with:
echo > /proc/sys/kernel/modprobe
mount -t NONEXISTENT none /
It causes:
request_module fs-NONEXISTENT succeeded, but still no fs?
WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0
[...]
This should actually use pr_warn_once() rather than WARN_ONCE(), since
it's also user-reachable if userspace immediately unloads the module.
Regardless, request_module() should correctly return an error when it
fails. So let's make it return -ENOENT, which matches the error when
the modprobe binary doesn't exist.
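
The intended behaviour can be sketched in user space as below. This is a
simplified model under stated assumptions: model_request_module() is a
hypothetical name, the modprobe path is passed in as a parameter rather
than read from the sysctl, and the step of actually running the helper is
reduced to a placeholder that reports success.

```c
#include <errno.h>

/*
 * Hypothetical model: an empty modprobe path means autoloading is
 * disabled, so report -ENOENT instead of 0.
 */
static int model_request_module(const char *modprobe_path)
{
    if (modprobe_path[0] == '\0')
        return -ENOENT; /* before the fix, this case returned 0 */

    /* Placeholder for running the usermode helper. */
    return 0;
}
```

A caller such as the get_fs_type() snippet quoted above then no longer
sees a false "success" when autoloading is disabled.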
I've also sent patches to document and test this case.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Ben Hutchings <benh@debian.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200312202552.241885-1-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:39 +0000 (14:33 -0700)]
mm/memremap: set caching mode for PCI P2PDMA memory to UC
PCI BAR IO memory should never be mapped as WB, however prior to this
the PAT bits were set WB and it was typically overridden by MTRR
registers set by the firmware.
Set PCI P2PDMA memory to be UC as this is what it currently, typically,
ends up being mapped as on x86 after the MTRR registers override the
cache setting.
Future use-cases may need to generalize this by adding flags to select
the caching type, as some P2PDMA cases may not want UC. However, those
use-cases are not upstream yet and this can be changed when they arrive.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-8-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:36 +0000 (14:33 -0700)]
mm/memory_hotplug: add pgprot_t to mhp_params
devm_memremap_pages() is currently used by the PCI P2PDMA code to create
struct page mappings for IO memory. At present, these mappings are
created with PAGE_KERNEL which implies setting the PAT bits to be WB.
However, on x86, an MTRR register will typically override this and force
the cache type to be UC-. In the case firmware doesn't set this
register it is effectively WB and will typically result in a machine
check exception when it's accessed.
Other arches are not currently likely to function correctly seeing they
don't have any MTRR registers to fall back on.
To solve this, provide a way to specify the pgprot value explicitly to
arch_add_memory().
Of the arches that support MEMORY_HOTPLUG: x86_64 and arm64 need a
simple change to pass the pgprot_t down to their respective functions
which set up the page tables. For x86_32, set the page tables
explicitly using __set_memory_prot() (seeing they are already mapped).
For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this
should be fine, for now, seeing these architectures don't support
ZONE_DEVICE.
A check in __add_pages() is also added to ensure the pgprot parameter
was set for all arches.
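
A minimal user-space sketch of the idea follows. The types and function
here are hypothetical models (pgprot_model_t, struct mhp_params_model,
model_add_pages()), not the kernel definitions; the sketch only assumes
that an all-zero pgprot value means "the caller forgot to set it".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's pgprot_t wrapper type. */
typedef struct { unsigned long pgprot; } pgprot_model_t;

/* Hypothetical model of the extended hotplug parameters. */
struct mhp_params_model {
    void *altmap;          /* optional alternative allocator, may be NULL */
    pgprot_model_t pgprot; /* page protection requested for the mapping */
};

/* Model of the sanity check: reject parameters with no pgprot set. */
static int model_add_pages(const struct mhp_params_model *params)
{
    if (params->pgprot.pgprot == 0)
        return -EINVAL;

    /* Placeholder for the actual section/page setup. */
    return 0;
}
```

Passing the protection explicitly in the parameter struct, rather than
hard-coding it in each arch, is what lets a caller such as P2PDMA ask for
a non-default caching mode.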
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:32 +0000 (14:33 -0700)]
powerpc/mm: thread pgprot_t through create_section_mapping()
In preparation to support a pgprot_t argument for arch_add_memory().
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-6-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:28 +0000 (14:33 -0700)]
x86/mm: introduce __set_memory_prot()
For use in the 32bit arch_add_memory() to set the pgprot type of the
memory to add.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-5-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:24 +0000 (14:33 -0700)]
x86/mm: thread pgprot_t through init_memory_mapping()
In preparation to support a pgprot_t argument for arch_add_memory().
It's required to move the prototype of init_memory_mapping() seeing the
original location came before the definition of pgprot_t.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-4-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:21 +0000 (14:33 -0700)]
mm/memory_hotplug: rename mhp_restrictions to mhp_params
The mhp_restrictions struct really doesn't specify anything resembling a
restriction anymore so rename it to be mhp_params as it is a list of
extended parameters.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:17 +0000 (14:33 -0700)]
mm/memory_hotplug: drop the flags field from struct mhp_restrictions
Patch series "Allow setting caching mode in arch_add_memory() for
P2PDMA", v4.
Currently, the page tables created using memremap_pages() are always
created with the PAGE_KERNEL caching mode. However, the P2PDMA code is
creating pages for PCI BAR memory which should never be accessed through
the cache and instead use either WC or UC. This still works in most
cases, on x86, because the MTRR registers typically override the caching
settings in the page tables for all of the IO memory to be UC-.
However, this tends not to work so well on other arches or some rare x86
machines that have firmware which does not set up the MTRR registers in
this way.
Instead of this, this series proposes a change to arch_add_memory() to
take the pgprot required by the mapping which allows us to explicitly
set pagetable entries for P2PDMA memory to UC.
This change is pretty routine for most of the arches: x86_64, arm64 and
powerpc simply need to thread the pgprot through to where the page
tables are set up. x86_32 unfortunately sets up the page tables at boot
so must use __set_memory_prot() to change their caching mode. ia64, s390
and sh don't appear to have an easy way to change the page tables so,
for now at least, we just return -EINVAL on such mappings and thus they
will not support P2PDMA memory until the work for this is done. This
should be fine as they don't yet support ZONE_DEVICE.
This patch (of 7):
This variable is not used anywhere and should therefore be removed from
the structure.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-2-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anshuman Khandual [Fri, 10 Apr 2020 21:33:13 +0000 (14:33 -0700)]
mm/special: create generic fallbacks for pte_special() and pte_mkspecial()
Currently there are many platforms that don't enable ARCH_HAS_PTE_SPECIAL
but are required to define quite similar fallback stubs for special page
table entry helpers such as pte_special() and pte_mkspecial(), as they
get built in generic MM without a config check. This creates two
generic fallback stub definitions for these helpers, eliminating much
code duplication.
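
The shape of such generic fallbacks can be sketched as below. This is a
user-space model, assuming a hypothetical pte_model_t wrapper in place of
the real pte_t; the point is only that, without ARCH_HAS_PTE_SPECIAL, the
helpers collapse to "never special" and "return the entry unchanged".

```c
/* Hypothetical stand-in for the architecture's pte_t. */
typedef struct { unsigned long pte; } pte_model_t;

#ifndef CONFIG_ARCH_HAS_PTE_SPECIAL
/* Fallback: no page table entry is ever considered special. */
static inline int pte_special(pte_model_t pte)
{
    (void)pte;
    return 0;
}

/* Fallback: marking an entry special is a no-op. */
static inline pte_model_t pte_mkspecial(pte_model_t pte)
{
    return pte;
}
#endif
```

Because the stubs sit behind a single config check, an architecture that
does enable ARCH_HAS_PTE_SPECIAL provides its own definitions and never
sees these.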
The mips platform has a special case where pte_special() and pte_mkspecial()
visibility is wider than what ARCH_HAS_PTE_SPECIAL enablement requires.
This restricts the visibility of those symbols in order to avoid the
redefinitions, now exposed through these new generic stubs, and the
subsequent build failure. The arm platform's set_pte_at() definition needs
to be moved into a C file just to prevent a build failure.
[anshuman.khandual@arm.com: use defined(CONFIG_ARCH_HAS_PTE_SPECIAL) in mips per Thomas]
Link: http://lkml.kernel.org/r/1583851924-21603-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Guo Ren <guoren@kernel.org> [csky]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Stafford Horne <shorne@gmail.com> [openrisc]
Acked-by: Helge Deller <deller@gmx.de> [parisc]
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Sam Creasey <sammy@sammy.net>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: http://lkml.kernel.org/r/1583802551-15406-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anshuman Khandual [Fri, 10 Apr 2020 21:33:09 +0000 (14:33 -0700)]
mm/vma: introduce VM_ACCESS_FLAGS
There are many places where all basic VMA access flags (read, write,
exec) are initialized or checked against as a group. One such example
is page fault handling. The existing vma_is_accessible() wrapper already
treats VMA accessibility as a group of access permissions.
Hence let's just create VM_ACCESS_FLAGS (VM_READ|VM_WRITE|VM_EXEC), which
will not only reduce code duplication but also extend the VMA
accessibility concept in general.
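A minimal user-space sketch of the grouping described above. The bit values are assumed to mirror the kernel's definitions, and `struct vma_sketch` and `vma_sketch_is_accessible()` are illustrative names, not kernel code:

```c
#include <stdbool.h>

/* Basic VMA access bits (values assumed to mirror include/linux/mm.h). */
#define VM_READ   0x00000001UL
#define VM_WRITE  0x00000002UL
#define VM_EXEC   0x00000004UL

/* All basic access permissions treated as one group. */
#define VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC)

/* Hypothetical stand-in for struct vm_area_struct. */
struct vma_sketch {
	unsigned long vm_flags;
};

/* True if at least one of read, write, or exec is permitted. */
static bool vma_sketch_is_accessible(const struct vma_sketch *vma)
{
	return (vma->vm_flags & VM_ACCESS_FLAGS) != 0;
}
```

A single mask replaces each open-coded (VM_READ|VM_WRITE|VM_EXEC) test, so the group is named once and checked the same way everywhere.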
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Springer <rspringer@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Link: http://lkml.kernel.org/r/1583391014-8170-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anshuman Khandual [Fri, 10 Apr 2020 21:33:05 +0000 (14:33 -0700)]
mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
There are many platforms with the exact same value for VM_DATA_DEFAULT_FLAGS.
This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the
existing VM_STACK_DEFAULT_FLAGS. While here, also define some more
macros with standard VMA access flag combinations that are used
frequently across many platforms. Apart from simplification, this
reduces code duplication as well.
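The fallback pattern can be sketched as follows. This is only an illustration: the bit values are assumed to mirror the kernel's, `SKETCH_DATA_FLAGS_EXEC` and `sketch_data_default_flags()` are hypothetical names, and which combination serves as the generic default is an assumption here:

```c
/* Access and may-access bits (values assumed to mirror include/linux/mm.h). */
#define VM_READ     0x00000001UL
#define VM_WRITE    0x00000002UL
#define VM_EXEC     0x00000004UL
#define VM_MAYREAD  0x00000010UL
#define VM_MAYWRITE 0x00000020UL
#define VM_MAYEXEC  0x00000040UL

/* Hypothetical name for one commonly used flag combination. */
#define SKETCH_DATA_FLAGS_EXEC \
	(VM_READ | VM_WRITE | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)

/*
 * An architecture header may define VM_DATA_DEFAULT_FLAGS before this point.
 * Platforms that do not simply inherit the generic default.
 */
#ifndef VM_DATA_DEFAULT_FLAGS
#define VM_DATA_DEFAULT_FLAGS SKETCH_DATA_FLAGS_EXEC
#endif

/* Returns whichever definition won: the override or the fallback. */
static unsigned long sketch_data_default_flags(void)
{
	return VM_DATA_DEFAULT_FLAGS;
}
```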
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
Link: http://lkml.kernel.org/r/1583391014-8170-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjun Roy [Fri, 10 Apr 2020 21:33:01 +0000 (14:33 -0700)]
mm/memory.c: add vm_insert_pages()
Add the ability to insert multiple pages at once to a user VM with fewer
PTE spinlock operations.
The intention of this patch-set is to reduce atomic ops for tcp zerocopy
receives, which normally hit the same spinlock multiple times
consecutively.
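The saving can be illustrated with a user-space model that counts lock acquisitions. Everything here is a hypothetical stand-in (a counting "lock" and a flat slot array), not the kernel's page-table code:

```c
#include <stddef.h>

#define SKETCH_SLOTS 512

/* Hypothetical stand-in for a PTE spinlock that counts acquisitions. */
struct sketch_lock {
	unsigned long acquisitions;
	int held;
};

static void sketch_lock_acquire(struct sketch_lock *l)
{
	l->held = 1;
	l->acquisitions++;
}

static void sketch_lock_release(struct sketch_lock *l)
{
	l->held = 0;
}

/* Hypothetical page-table slots guarded by a single lock. */
struct sketch_table {
	struct sketch_lock lock;
	int slots[SKETCH_SLOTS];
};

/* One lock round-trip per page; n must not exceed SKETCH_SLOTS. */
static void sketch_insert_one_by_one(struct sketch_table *t,
				     const int *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		sketch_lock_acquire(&t->lock);
		t->slots[i] = pages[i];
		sketch_lock_release(&t->lock);
	}
}

/* One lock round-trip for the whole batch; n must not exceed SKETCH_SLOTS. */
static void sketch_insert_batched(struct sketch_table *t,
				  const int *pages, size_t n)
{
	sketch_lock_acquire(&t->lock);
	for (size_t i = 0; i < n; i++)
		t->slots[i] = pages[i];
	sketch_lock_release(&t->lock);
}
```

Inserting n pages costs n acquisitions in the first variant and one in the second, which is the reduction in atomic operations the description refers to.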
[akpm@linux-foundation.org: pte_alloc() no longer takes the `addr' argument]
[arjunroy@google.com: add missing page_count() check to vm_insert_pages()]
Link: http://lkml.kernel.org/r/20200214005929.104481-1-arjunroy.kdev@gmail.com
[arjunroy@google.com: vm_insert_pages() checks if pte_index defined]
Link: http://lkml.kernel.org/r/20200228054714.204424-2-arjunroy.kdev@gmail.com
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200128025958.43490-2-arjunroy.kdev@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjun Roy [Fri, 10 Apr 2020 21:32:58 +0000 (14:32 -0700)]
mm: define pte_index as macro for x86
pte_index() is either defined as a macro (e.g. sparc64) or as an
inlined function (e.g. x86). vm_insert_pages() depends on pte_index
but it is not defined on all platforms (e.g. m68k).
To fix compilation of vm_insert_pages() on architectures not providing
pte_index(), we perform the following fix:
0. For platforms where it is meaningful, and defined as a macro, no
change is needed.
1. For platforms where it is meaningful and defined as an inlined
function, and we want to use it with vm_insert_pages(), we define
a degenerate macro of the form: #define pte_index pte_index
2. vm_insert_pages() checks for the existence of a pte_index macro
definition. If found, it implements a batched insert. If not found,
it devolves to calling vm_insert_page() in a loop.
This patch implements step 1 for x86.
v3 of this patch fixes a compilation warning for an unused method.
v2 of this patch moved a macro definition to a more readable location.
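Steps 1 and 2 can be sketched in plain C. The shift and mask in the function body assume 4 KiB pages with 512 entries per table, and `SKETCH_HAVE_BATCHED_INSERT` and `sketch_uses_batched_insert()` are hypothetical names:

```c
/* Step 1: an inline function plus a degenerate macro of the same name. */
static inline unsigned long pte_index(unsigned long address)
{
	/* Assumed geometry: 4 KiB pages, 512 entries per table. */
	return (address >> 12) & 511UL;
}
#define pte_index pte_index

/* Step 2: generic code can now detect the function at preprocessing time. */
#ifdef pte_index
#define SKETCH_HAVE_BATCHED_INSERT 1
#else
#define SKETCH_HAVE_BATCHED_INSERT 0
#endif

/* Reports which path generic code would compile in. */
static int sketch_uses_batched_insert(void)
{
	return SKETCH_HAVE_BATCHED_INSERT;
}
```

The macro expands to its own name, so calls still reach the inline function. Its only purpose is to make `#ifdef pte_index` true.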
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200228054714.204424-1-arjunroy.kdev@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjun Roy [Fri, 10 Apr 2020 21:32:54 +0000 (14:32 -0700)]
mm: bring sparc pte_index() semantics inline with other platforms
pte_index() on platforms other than sparc returns a numerical index. On
sparc, it returns a pte_t*. This presents an issue for
vm_insert_pages(), which relies on pte_index() to find the offset for a
pte within a pmd, for batched inserts.
This patch:
1. Modifies pte_index() for sparc to return a numerical index, like
other platforms,
2. Defines pte_entry() for sparc which returns a pte_t*
(as pte_index() used to),
3. Converts existing sparc callers for pte_index() to use pte_entry().
[sfr@canb.auug.org.au: remove pte_entry and just directly modified pte_offset_kernel instead]
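The difference between a numerical index and an entry pointer can be sketched as below. The page shift and table size are assumed values, and all `sketch_` names are hypothetical rather than the sparc definitions:

```c
#define SKETCH_PAGE_SHIFT   13    /* assumed: 8 KiB pages */
#define SKETCH_PTRS_PER_PTE 1024  /* assumed: entries per PTE table */

/* Hypothetical stand-in for pte_t. */
typedef struct {
	unsigned long pte;
} sketch_pte_t;

/* Numerical index of the PTE slot for an address within its table. */
static unsigned long sketch_pte_index(unsigned long address)
{
	return (address >> SKETCH_PAGE_SHIFT) & (SKETCH_PTRS_PER_PTE - 1);
}

/* Pointer to the entry: the table base plus the numerical index. */
static sketch_pte_t *sketch_pte_offset(sketch_pte_t *table,
				       unsigned long address)
{
	return table + sketch_pte_index(address);
}
```

A batched insert can step through consecutive slots by incrementing the index, which a helper that only returns a pointer does not expose directly.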
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Arjun Roy <arjunroy.kdev@gmail.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Link: http://lkml.kernel.org/r/20200227105045.6b421d9f@canb.auug.org.au
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjun Roy [Fri, 10 Apr 2020 21:32:51 +0000 (14:32 -0700)]
mm/memory.c: refactor insert_page to prepare for batched-lock insert
Add helper methods for vm_insert_page()/insert_page() to prepare for
vm_insert_pages(), which batch-inserts pages to reduce spinlock
operations when inserting multiple consecutive pages into the user page
table.
The intention of this patch-set is to reduce atomic ops for tcp zerocopy
receives, which normally hit the same spinlock multiple times
consecutively.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200128025958.43490-1-arjunroy.kdev@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jaewon Kim [Fri, 10 Apr 2020 21:32:48 +0000 (14:32 -0700)]
mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area
When passing a request to vm_unmapped_area, arch_get_unmapped_area and
arch_get_unmapped_area_topdown did not set align_offset. Internally, in
both unmapped_area and unmapped_area_topdown, info->align_offset is
meaningless if info->align_mask is 0.
But commit df529cabb7a2 ("mm: mmap: add trace point of
vm_unmapped_area") always prints info->align_offset even though it is
uninitialized.
Fix this uninitialized value issue by setting it to 0 explicitly.
Before:
vm_unmapped_area: addr=0x755b155000 err=0 total_vm=0x15aaf0 flags=0x1 len=0x109000 lo=0x8000 hi=0x75eed48000 mask=0x0 ofs=0x4022
After:
vm_unmapped_area: addr=0x74a4ca1000 err=0 total_vm=0x168ab1 flags=0x1 len=0x9000 lo=0x8000 hi=0x753d94b000 mask=0x0 ofs=0x0
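A user-space sketch of the fix pattern follows. The structure is a hypothetical mirror of the request fields named in the trace output, and `sketch_fill_info()` is an illustrative helper, not the arch code:

```c
/* Hypothetical mirror of the request passed to vm_unmapped_area. */
struct sketch_unmapped_area_info {
	unsigned long flags;
	unsigned long length;
	unsigned long low_limit;
	unsigned long high_limit;
	unsigned long align_mask;
	unsigned long align_offset;
};

/* Fill every field, including the one that is unused when align_mask is 0. */
static void sketch_fill_info(struct sketch_unmapped_area_info *info,
			     unsigned long len, unsigned long lo,
			     unsigned long hi)
{
	info->flags = 0;
	info->length = len;
	info->low_limit = lo;
	info->high_limit = hi;
	info->align_mask = 0;
	/* Explicit zero, so a trace point never prints stale stack contents. */
	info->align_offset = 0;
}
```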
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20200409094035.19457-1-jaewon31.kim@samsung.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Gushchin [Fri, 10 Apr 2020 21:32:45 +0000 (14:32 -0700)]
mm: hugetlb: optionally allocate gigantic hugepages using cma
Commit 944d9fec8d7a ("hugetlb: add support for gigantic page allocation
at runtime") has added the run-time allocation of gigantic pages.
However it actually works only at early stages of the system loading,
when the majority of memory is free. After some time the memory gets
fragmented by non-movable pages, so the chances to find a contiguous 1GB
block are getting close to zero. Even dropping caches manually doesn't
help a lot.
At large scale, rebooting servers in order to allocate gigantic hugepages
is quite expensive and complex. At the same time, keeping some constant
percentage of memory in reserved hugepages even if the workload isn't
using it is a big waste: not all workloads can benefit from using 1 GB
pages.
The following solution can solve the problem:
1) At boot time a dedicated cma area* is reserved. The size is passed
as a kernel argument.
2) Run-time allocations of gigantic hugepages are performed using the
cma allocator and the dedicated cma area.
In this case gigantic hugepages can be allocated successfully with a
high probability, however the memory isn't completely wasted if nobody
is using 1GB hugepages: it can be used for pagecache, anon memory, THPs,
etc.
* On a multi-node machine a per-node cma area is allocated on each node.
Subsequent gigantic hugetlb allocations use the first available
numa node if the mask isn't specified by the user.
Usage:
1) configure the kernel to allocate a cma area for hugetlb allocations:
pass hugetlb_cma=10G as a kernel argument
2) allocate hugetlb pages as usual, e.g.
echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
If the option isn't enabled or the allocation of the cma area failed,
the current behavior of the system is preserved.
x86 and arm64 are covered by this patch; other architectures can be
trivially added later.
The patch contains clean-ups and fixes proposed and implemented by Aslan
Bakirov and Randy Dunlap. It also contains ideas and suggestions
proposed by Rik van Riel, Michal Hocko and Mike Kravetz. Thanks!
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>