Dominique Martinet [Tue, 1 Dec 2020 13:17:30 +0000 (14:17 +0100)]
kbuild: don't hardcode depmod path
commit
436e980e2ed526832de822cbf13c317a458b78e1 upstream.
depmod is not guaranteed to be in /sbin, just let make look for
it in the path like all the other invoked programs
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jaegeuk Kim [Tue, 17 Nov 2020 16:58:35 +0000 (08:58 -0800)]
scsi: ufs: Clear UAC for FFU and RPMB LUNs
[ Upstream commit
4f3e900b628226011a5f71c19e53b175c014eb58 ]
In order to conduct FFU or RPMB operations, UFS needs to clear UNIT
ATTENTION condition. Clear it explicitly so that we get no failures during
initialization.
Link: https://lore.kernel.org/r/20201117165839.1643377-4-jaegeuk@kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Linus Torvalds [Mon, 28 Dec 2020 19:40:22 +0000 (11:40 -0800)]
depmod: handle the case of /sbin/depmod without /sbin in PATH
[ Upstream commit
cedd1862be7e666be87ec824dabc6a2b05618f36 ]
Commit
436e980e2ed5 ("kbuild: don't hardcode depmod path") stopped
hard-coding the path of depmod, but in the process caused trouble for
distributions that had that /sbin location, but didn't have it in the
PATH (generally because /sbin is limited to the super-user path).
Work around it for now by just adding /sbin to the end of PATH in the
depmod.sh script.
Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Huang Shijie [Tue, 29 Dec 2020 23:14:58 +0000 (15:14 -0800)]
lib/genalloc: fix the overflow when size is too big
[ Upstream commit
36845663843fc59c5d794e3dc0641472e3e572da ]
Some graphic card has very big memory on chip, such as 32G bytes.
In the following case, it will cause overflow:
pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE);
ret = gen_pool_add(pool, 0x1000000, SZ_32G, NUMA_NO_NODE);
va = gen_pool_alloc(pool, SZ_4G);
The overflow occurs in gen_pool_alloc_algo_owner():
....
size = nbits << order;
....
The @nbits is "int" type, so it will overflow.
Then the gen_pool_avail() will return the wrong value.
This patch converts some "int" to "unsigned long", and
changes the compare code in while.
Link: https://lkml.kernel.org/r/20201229060657.3389-1-sjhuang@iluvatar.ai
Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
Reported-by: Shi Jiasheng <jiasheng.shi@iluvatar.ai>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Randy Dunlap [Tue, 29 Dec 2020 23:14:49 +0000 (15:14 -0800)]
local64.h: make <asm/local64.h> mandatory
[ Upstream commit
87dbc209ea04645fd2351981f09eff5d23f8e2e9 ]
Make <asm-generic/local64.h> mandatory in include/asm-generic/Kbuild and
remove all arch/*/include/asm/local64.h arch-specific files since they
only #include <asm-generic/local64.h>.
This fixes build errors on arch/c6x/ and arch/nios2/ for
block/blk-iocost.c.
Build-tested on 21 of 25 arch-es. (tools problems on the others)
Yes, we could even rename <asm-generic/local64.h> to
<linux/local64.h> and change all #includes to use
<linux/local64.h> instead.
Link: https://lkml.kernel.org/r/20201227024446.17018-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Bart Van Assche [Wed, 9 Dec 2020 05:29:49 +0000 (21:29 -0800)]
scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE
[ Upstream commit
e6044f714b256259df9611ff49af433e5411c5c8 ]
Instead of submitting all SCSI commands submitted with scsi_execute() to a
SCSI device if rpm_status != RPM_ACTIVE, only submit RQF_PM (power
management requests) if rpm_status != RPM_ACTIVE. This patch makes the SCSI
core handle the runtime power management status (rpm_status) as it should
be handled.
Link: https://lore.kernel.org/r/20201209052951.16136-7-bvanassche@acm.org
Cc: Can Guo <cang@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Bart Van Assche [Wed, 9 Dec 2020 05:29:48 +0000 (21:29 -0800)]
scsi: scsi_transport_spi: Set RQF_PM for domain validation commands
[ Upstream commit
cfefd9f8240a7b9fdd96fcd54cb029870b6d8d88 ]
Disable runtime power management during domain validation. Since a later
patch removes RQF_PREEMPT, set RQF_PM for domain validation commands such
that these are executed in the quiesced SCSI device state.
Link: https://lore.kernel.org/r/20201209052951.16136-6-bvanassche@acm.org
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Woody Suwalski <terraluna977@gmail.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Stan Johnson <userm57@yahoo.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Bart Van Assche [Wed, 9 Dec 2020 05:29:47 +0000 (21:29 -0800)]
scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT
[ Upstream commit
5ae65383fc7633e0247c31b0c8bf0e6ea63b95a3 ]
This is another step that prepares for the removal of RQF_PREEMPT.
Link: https://lore.kernel.org/r/20201209052951.16136-5-bvanassche@acm.org
Cc: David S. Miller <davem@davemloft.net>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Can Guo <cang@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Bart Van Assche [Wed, 9 Dec 2020 05:29:46 +0000 (21:29 -0800)]
scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
[ Upstream commit
96d86e6a80a3ab9aff81d12f9f1f2a0da2917d38 ]
RQF_PREEMPT is used for two different purposes in the legacy IDE code:
1. To mark power management requests.
2. To mark requests that should preempt another request. An (old)
explanation of that feature is as follows: "The IDE driver in the Linux
kernel normally uses a series of busywait delays during its
initialization. When the driver executes these busywaits, the kernel
does nothing for the duration of the wait. The time spent in these
waits could be used for other initialization activities, if they could
be run concurrently with these waits.
More specifically, busywait-style delays such as udelay() in module
init functions inhibit kernel preemption because the Big Kernel Lock is
held, while yielding APIs such as schedule_timeout() allow
preemption. This is true because the kernel handles the BKL specially
and releases and reacquires it across reschedules allowed by the
current thread.
This IDE-preempt specification requires that the driver eliminate these
busywaits and replace them with a mechanism that allows other work to
proceed while the IDE driver is initializing."
Since I haven't found an implementation of (2), do not set the PREEMPT flag
for sense requests. This patch causes sense requests to be postponed while
a drive is suspended instead of being submitted to ide_queue_rq().
If it would ever be necessary to restore the IDE PREEMPT functionality,
that can be done by introducing a new flag in struct ide_request.
Link: https://lore.kernel.org/r/20201209052951.16136-4-bvanassche@acm.org
Cc: David S. Miller <davem@davemloft.net>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Can Guo <cang@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Bart Van Assche [Wed, 9 Dec 2020 05:29:45 +0000 (21:29 -0800)]
scsi: block: Introduce BLK_MQ_REQ_PM
[ Upstream commit
0854bcdcdec26aecdc92c303816f349ee1fba2bc ]
Introduce the BLK_MQ_REQ_PM flag. This flag makes the request allocation
functions set RQF_PM. This is the first step towards removing
BLK_MQ_REQ_PREEMPT.
Link: https://lore.kernel.org/r/20201209052951.16136-3-bvanassche@acm.org
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Can Guo <cang@codeaurora.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Adrian Hunter [Mon, 7 Dec 2020 08:31:20 +0000 (10:31 +0200)]
scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers
[ Upstream commit
dd78bdb6f810bdcb173b42379af558c676c8e0aa ]
Enable runtime PM auto-suspend by default for Intel host controllers.
Link: https://lore.kernel.org/r/20201207083120.26732-5-adrian.hunter@intel.com
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Adrian Hunter [Mon, 7 Dec 2020 08:31:19 +0000 (10:31 +0200)]
scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers
[ Upstream commit
044d5bda7117891d6d0d56f2f807b7b11e120abd ]
Intel controllers can end up in an unrecoverable state after a hibernate
exit error unless a full reset and restore is done before anything else.
Force that to happen.
Link: https://lore.kernel.org/r/20201207083120.26732-4-adrian.hunter@intel.com
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Adrian Hunter [Mon, 7 Dec 2020 08:31:18 +0000 (10:31 +0200)]
scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
[ Upstream commit
af423534d2de86cd0db729a5ac41f056ca8717de ]
The expectation for suspend-to-disk is that devices will be powered-off, so
the UFS device should be put in PowerDown mode. If spm_lvl is not 5, then
that will not happen. Change the pm callbacks to force spm_lvl 5 for
suspend-to-disk poweroff.
Link: https://lore.kernel.org/r/20201207083120.26732-3-adrian.hunter@intel.com
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Adrian Hunter [Mon, 7 Dec 2020 08:31:17 +0000 (10:31 +0200)]
scsi: ufs-pci: Fix restore from S4 for Intel controllers
[ Upstream commit
c763729a10e538d997744317cf4a1c4f25266066 ]
Currently, ufshcd-pci is the only UFS driver with support for
suspend-to-disk PM callbacks (i.e. freeze/thaw/restore/poweroff). These
callbacks are set by the macro SET_SYSTEM_SLEEP_PM_OPS to the same
functions as system suspend/resume. That will work with spm_lvl 5 because
spm_lvl 5 will result in a full restore for the ->restore() callback. In
the absence of a full restore, the host controller registers will have
values set up by the restore kernel (the kernel that boots and loads the
restore image) which are not necessarily the same. However it turns out,
the only registers that sometimes need restore are the base address
registers. This has gone un-noticed because, depending on IOMMU settings,
the kernel can end up allocating the same addresses every time.
For Intel controllers, an spm_lvl other than 5 can be used, so to support
S4 (suspend-to-disk) with spm_lvl other than 5, restore the base address
registers.
Link: https://lore.kernel.org/r/20201207083120.26732-2-adrian.hunter@intel.com
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Bean Huo [Mon, 7 Dec 2020 19:01:37 +0000 (20:01 +0100)]
scsi: ufs: Fix wrong print message in dev_err()
[ Upstream commit
1fa0570002e3f66db9b58c32c60de4183b857a19 ]
Change dev_err() print message from "dme-reset" to "dme_enable" in function
ufshcd_dme_enable().
Link: https://lore.kernel.org/r/20201207190137.6858-3-huobean@gmail.com
Acked-by: Alim Akhtar <alim.akhtar@samsung.com>
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Yunfeng Ye [Thu, 19 Nov 2020 06:21:25 +0000 (14:21 +0800)]
workqueue: Kick a worker based on the actual activation of delayed works
[ Upstream commit
01341fbd0d8d4e717fc1231cdffe00343088ce0b ]
In realtime scenario, We do not want to have interference on the
isolated cpu cores. but when invoking alloc_workqueue() for percpu wq
on the housekeeping cpu, it kick a kworker on the isolated cpu.
alloc_workqueue
pwq_adjust_max_active
wake_up_worker
The comment in pwq_adjust_max_active() said:
"Need to kick a worker after thawed or an unbound wq's
max_active is bumped"
So it is unnecessary to kick a kworker for percpu's wq when invoking
alloc_workqueue(). this patch only kick a worker based on the actual
activation of delayed works.
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Andres Freund [Mon, 28 Dec 2020 19:27:18 +0000 (11:27 -0800)]
block: add debugfs stanza for QUEUE_FLAG_NOWAIT
[ Upstream commit
dc30432605bbbd486dfede3852ea4d42c40a84b4 ]
This was missed in
021a24460dc2. Leads to the numeric value of
QUEUE_FLAG_NOWAIT (i.e. 29) showing up in
/sys/kernel/debug/block/*/state.
Fixes:
021a24460dc28e7412aecfae89f60e1847e685c0
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Harish [Tue, 29 Dec 2020 23:14:22 +0000 (15:14 -0800)]
selftests/vm: fix building protection keys test
[ Upstream commit
7cf22a1c88c05ea3807f95b1edfebb729016ae52 ]
Commit
d8cbe8bfa7d ("tools/testing/selftests/vm: fix build error") tried
to include a ARCH check for powerpc, however ARCH is not defined in the
Makefile before including lib.mk. This makes test building to skip on
both x86 and powerpc.
Fix the arch check by replacing it using machine type as it is already
defined and used in the test.
Link: https://lkml.kernel.org/r/20201215100402.257376-1-harish@linux.ibm.com
Fixes:
d8cbe8bfa7d ("tools/testing/selftests/vm: fix build error")
Signed-off-by: Harish <harish@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Noor Azura Ahmad Tarmizi [Tue, 22 Dec 2020 16:03:37 +0000 (00:03 +0800)]
stmmac: intel: Add PCI IDs for TGL-H platform
[ Upstream commit
8450e23f142f629e40bd67afc8375c86c7fbf8f1 ]
Add TGL-H PCI info and PCI IDs for the new TSN Controller to the list
of supported devices.
Signed-off-by: Noor Azura Ahmad Tarmizi <noor.azura.ahmad.tarmizi@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
Link: https://lore.kernel.org/r/20201222160337.30870-1-muhammad.husaini.zulkifli@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ido Schimmel [Wed, 30 Dec 2020 11:42:51 +0000 (13:42 +0200)]
selftests: mlxsw: Set headroom size of correct port
[ Upstream commit
2ff2c7e274392871bfdee00ff2adbb8ebae5d240 ]
The test was setting the headroom size of the wrong port. This was not
visible because of a firmware bug that canceled this bug.
Set the headroom size of the correct port, so that the test will pass
with both old and new firmware versions.
Fixes:
bfa804784e32 ("selftests: mlxsw: Add a PFC test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/20201230114251.394009-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn Mork [Wed, 30 Dec 2020 15:24:51 +0000 (16:24 +0100)]
net: usb: qmi_wwan: add Quectel EM160R-GL
[ Upstream commit
cfd82dfc9799c53ef109343a23af006a0f6860a9 ]
New modem using ff/ff/30 for QCDM, ff/00/00 for AT and NMEA,
and ff/ff/ff for RMNET/QMI.
T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=5000 MxCh= 0
D: Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1
P: Vendor=2c7c ProdID=0620 Rev= 4.09
S: Manufacturer=Quectel
S: Product=EM160R-GL
S: SerialNumber=
e31cedc1
C:* #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=896mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=(none)
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E: Ad=88(I) Atr=03(Int.) MxPS= 8 Ivl=32ms
E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Link: https://lore.kernel.org/r/20201230152451.245271-1-bjorn@mork.no
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
YANG LI [Wed, 30 Dec 2020 07:23:14 +0000 (15:23 +0800)]
ibmvnic: fix: NULL pointer dereference.
[ Upstream commit
862aecbd9569e563b979c0e23a908b43cda4b0b9 ]
The error is due to dereference a null pointer in function
reset_one_sub_crq_queue():
if (!scrq) {
netdev_dbg(adapter->netdev,
"Invalid scrq reset. irq (%d) or msgs(%p).\n",
scrq->irq, scrq->msgs);
return -EINVAL;
}
If the expression is true, scrq must be a null pointer and cannot
dereference.
Fixes:
9281cf2d5840 ("ibmvnic: avoid memset null scrq msgs")
Signed-off-by: YANG LI <abaci-bugfix@linux.alibaba.com>
Reported-by: Abaci <abaci@linux.alibaba.com>
Acked-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/1609312994-121032-1-git-send-email-abaci-bugfix@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Roland Dreier [Thu, 24 Dec 2020 03:21:16 +0000 (19:21 -0800)]
CDC-NCM: remove "connected" log message
[ Upstream commit
59b4a8fa27f5a895582ada1ae5034af7c94a57b5 ]
The cdc_ncm driver passes network connection notifications up to
usbnet_link_change(), which is the right place for any logging.
Remove the netdev_info() duplicating this from the driver itself.
This stops devices such as my "TRENDnet USB 10/100/1G/2.5G LAN"
(ID 20f4:e02b) adapter from spamming the kernel log with
cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected
messages every 60 msec or so.
Signed-off-by: Roland Dreier <roland@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20201224032116.2453938-1-roland@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martin Blumenstingl [Sun, 3 Jan 2021 01:25:44 +0000 (02:25 +0100)]
net: dsa: lantiq_gswip: Fix GSWIP_MII_CFG(p) register access
[ Upstream commit
709a3c9dff2a639966ae7d8ba6239d2b8aba036d ]
There is one GSWIP_MII_CFG register for each switch-port except the CPU
port. The register offset for the first port is 0x0, 0x02 for the
second, 0x04 for the third and so on.
Update the driver to not only restrict the GSWIP_MII_CFG registers to
ports 0, 1 and 5. Handle ports 0..5 instead but skip the CPU port. This
means we are not overwriting the configuration for the third port (port
two since we start counting from zero) with the settings for the sixth
port (with number five) anymore.
The GSWIP_MII_PCDU(p) registers are not updated because there's really
only three (one for each of the following ports: 0, 1, 5).
Fixes:
14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martin Blumenstingl [Sun, 3 Jan 2021 01:25:43 +0000 (02:25 +0100)]
net: dsa: lantiq_gswip: Enable GSWIP_MII_CFG_EN also for internal PHYs
[ Upstream commit
c1a9ec7e5d577a9391660800c806c53287fca991 ]
Enable GSWIP_MII_CFG_EN also for internal PHYs to make traffic flow.
Without this the PHY link is detected properly and ethtool statistics
for TX are increasing but there's no RX traffic coming in.
Fixes:
14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Suggested-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Heiner Kallweit [Wed, 30 Dec 2020 18:33:34 +0000 (19:33 +0100)]
r8169: work around power-saving bug on some chip versions
[ Upstream commit
e80bd76fbf563cc7ed8c9e9f3bbcdf59b0897f69 ]
A user reported failing network with RTL8168dp (a quite rare chip
version). Realtek confirmed that few chip versions suffer from a PLL
power-down hw bug.
Fixes:
07df5bd874f0 ("r8169: power down chip in probe")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/a1c39460-d533-7f9e-fa9d-2b8990b02426@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yunjian Wang [Tue, 29 Dec 2020 02:01:48 +0000 (10:01 +0800)]
vhost_net: fix ubuf refcount incorrectly when sendmsg fails
[ Upstream commit
01e31bea7e622f1890c274f4aaaaf8bccd296aa5 ]
Currently the vhost_zerocopy_callback() maybe be called to decrease
the refcount when sendmsg fails in tun. The error handling in vhost
handle_tx_zerocopy() will try to decrease the same refcount again.
This is wrong. To fix this issue, we only call vhost_net_ubuf_put()
when vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS.
Fixes:
bab632d69ee4 ("vhost: vhost TX zero-copy support")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/1609207308-20544-1-git-send-email-wangyunjian@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Taehee Yoo [Mon, 28 Dec 2020 15:21:46 +0000 (15:21 +0000)]
bareudp: Fix use of incorrect min_headroom size
[ Upstream commit
10ad3e998fa0c25315f27cf3002ff8b02dc31c38 ]
In the bareudp6_xmit_skb(), it calculates min_headroom.
At that point, it uses struct iphdr, but it's not correct.
So panic could occur.
The struct ipv6hdr should be used.
Test commands:
ip netns add A
ip netns add B
ip link add veth0 netns A type veth peer name veth1 netns B
ip netns exec A ip link set veth0 up
ip netns exec A ip a a 2001:db8:0::1/64 dev veth0
ip netns exec B ip link set veth1 up
ip netns exec B ip a a 2001:db8:0::2/64 dev veth1
for i in {10..1}
do
let A=$i-1
ip netns exec A ip link add bareudp$i type bareudp dstport $i \
ethertype 0x86dd
ip netns exec A ip link set bareudp$i up
ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev bareudp$i
ip netns exec A ip -6 r a 2001:db8:$i::2 encap ip6 src \
2001:db8:$A::1 dst 2001:db8:$A::2 via 2001:db8:$i::2 \
dev bareudp$i
ip netns exec B ip link add bareudp$i type bareudp dstport $i \
ethertype 0x86dd
ip netns exec B ip link set bareudp$i up
ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev bareudp$i
ip netns exec B ip -6 r a 2001:db8:$i::1 encap ip6 src \
2001:db8:$A::2 dst 2001:db8:$A::1 via 2001:db8:$i::1 \
dev bareudp$i
done
ip netns exec A ping 2001:db8:7::2
Splat looks like:
[ 66.436679][ C2] skbuff: skb_under_panic: text:
ffffffff928614c8 len:454 put:14 head:
ffff88810abb4000 data:
ffff88810abb3ffa tail:0x1c0 end:0x3ec0 dev:veth0
[ 66.441626][ C2] ------------[ cut here ]------------
[ 66.443458][ C2] kernel BUG at net/core/skbuff.c:109!
[ 66.445313][ C2] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 66.447606][ C2] CPU: 2 PID: 913 Comm: ping Not tainted 5.10.0+ #819
[ 66.450251][ C2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 66.453713][ C2] RIP: 0010:skb_panic+0x15d/0x15f
[ 66.455345][ C2] Code: 98 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 60 8b 78 93 41 57 41 56 41 55 48 8b 54 24 20 48 8b 74 24 28 e8 b5 40 f9 ff <0f> 0b 48 8b 6c 24 20 89 34 24 e8 08 c9 98 fe 8b 34 24 48 c7 c1 80
[ 66.462314][ C2] RSP: 0018:
ffff888119209648 EFLAGS:
00010286
[ 66.464281][ C2] RAX:
0000000000000089 RBX:
ffff888003159000 RCX:
0000000000000000
[ 66.467216][ C2] RDX:
0000000000000089 RSI:
0000000000000008 RDI:
ffffed10232412c0
[ 66.469768][ C2] RBP:
ffff88810a53d440 R08:
ffffed102328018d R09:
ffffed102328018d
[ 66.472297][ C2] R10:
ffff888119400c67 R11:
ffffed102328018c R12:
000000000000000e
[ 66.474833][ C2] R13:
ffff88810abb3ffa R14:
00000000000001c0 R15:
0000000000003ec0
[ 66.477361][ C2] FS:
00007f37c0c72f00(0000) GS:
ffff888119200000(0000) knlGS:
0000000000000000
[ 66.480214][ C2] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 66.482296][ C2] CR2:
000055a058808570 CR3:
000000011039e002 CR4:
00000000003706e0
[ 66.484811][ C2] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 66.487793][ C2] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 66.490424][ C2] Call Trace:
[ 66.491469][ C2] <IRQ>
[ 66.492374][ C2] ? eth_header+0x28/0x190
[ 66.494054][ C2] ? eth_header+0x28/0x190
[ 66.495401][ C2] skb_push.cold.99+0x22/0x22
[ 66.496700][ C2] eth_header+0x28/0x190
[ 66.497867][ C2] neigh_resolve_output+0x3de/0x720
[ 66.499615][ C2] ? __neigh_update+0x7e8/0x20a0
[ 66.501176][ C2] __neigh_update+0x8bd/0x20a0
[ 66.502749][ C2] ndisc_update+0x34/0xc0
[ 66.504010][ C2] ndisc_recv_na+0x8da/0xb80
[ 66.505041][ C2] ? pndisc_redo+0x20/0x20
[ 66.505888][ C2] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 66.506965][ C2] ndisc_rcv+0x3a0/0x470
[ 66.507797][ C2] icmpv6_rcv+0xad9/0x1b00
[ 66.508645][ C2] ip6_protocol_deliver_rcu+0xcd6/0x1560
[ 66.509719][ C2] ip6_input_finish+0x5b/0xf0
[ 66.510615][ C2] ip6_input+0xcd/0x2d0
[ 66.511406][ C2] ? ip6_input_finish+0xf0/0xf0
[ 66.512327][ C2] ? rcu_read_lock_held+0x91/0xa0
[ 66.513279][ C2] ? ip6_protocol_deliver_rcu+0x1560/0x1560
[ 66.514414][ C2] ipv6_rcv+0xe8/0x300
[ ... ]
Acked-by: Guillaume Nault <gnault@redhat.com>
Fixes:
571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20201228152146.24270-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Taehee Yoo [Mon, 28 Dec 2020 15:21:36 +0000 (15:21 +0000)]
bareudp: set NETIF_F_LLTX flag
[ Upstream commit
d9e44981739a96f1a468c13bbbd54ace378caf1c ]
Like other tunneling interfaces, the bareudp doesn't need TXLOCK.
So, It is good to set the NETIF_F_LLTX flag to improve performance and
to avoid lockdep's false-positive warning.
Test commands:
ip netns add A
ip netns add B
ip link add veth0 netns A type veth peer name veth1 netns B
ip netns exec A ip link set veth0 up
ip netns exec A ip a a 10.0.0.1/24 dev veth0
ip netns exec B ip link set veth1 up
ip netns exec B ip a a 10.0.0.2/24 dev veth1
for i in {2..1}
do
let A=$i-1
ip netns exec A ip link add bareudp$i type bareudp \
dstport $i ethertype ip
ip netns exec A ip link set bareudp$i up
ip netns exec A ip a a 10.0.$i.1/24 dev bareudp$i
ip netns exec A ip r a 10.0.$i.2 encap ip src 10.0.$A.1 \
dst 10.0.$A.2 via 10.0.$i.2 dev bareudp$i
ip netns exec B ip link add bareudp$i type bareudp \
dstport $i ethertype ip
ip netns exec B ip link set bareudp$i up
ip netns exec B ip a a 10.0.$i.2/24 dev bareudp$i
ip netns exec B ip r a 10.0.$i.1 encap ip src 10.0.$A.2 \
dst 10.0.$A.1 via 10.0.$i.1 dev bareudp$i
done
ip netns exec A ping 10.0.2.2
Splat looks like:
[ 96.992803][ T822] ============================================
[ 96.993954][ T822] WARNING: possible recursive locking detected
[ 96.995102][ T822] 5.10.0+ #819 Not tainted
[ 96.995927][ T822] --------------------------------------------
[ 96.997091][ T822] ping/822 is trying to acquire lock:
[ 96.998083][ T822]
ffff88810f753898 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[ 96.999813][ T822]
[ 96.999813][ T822] but task is already holding lock:
[ 97.001192][ T822]
ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[ 97.002908][ T822]
[ 97.002908][ T822] other info that might help us debug this:
[ 97.004401][ T822] Possible unsafe locking scenario:
[ 97.004401][ T822]
[ 97.005784][ T822] CPU0
[ 97.006407][ T822] ----
[ 97.007010][ T822] lock(_xmit_NONE#2);
[ 97.007779][ T822] lock(_xmit_NONE#2);
[ 97.008550][ T822]
[ 97.008550][ T822] *** DEADLOCK ***
[ 97.008550][ T822]
[ 97.010057][ T822] May be due to missing lock nesting notation
[ 97.010057][ T822]
[ 97.011594][ T822] 7 locks held by ping/822:
[ 97.012426][ T822] #0:
ffff888109a144f0 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0x12f7/0x2b00
[ 97.014191][ T822] #1:
ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
[ 97.016045][ T822] #2:
ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
[ 97.017897][ T822] #3:
ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[ 97.019684][ T822] #4:
ffffffffbce2f600 (rcu_read_lock){....}-{1:2}, at: bareudp_xmit+0x31b/0x3690 [bareudp]
[ 97.021573][ T822] #5:
ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
[ 97.023424][ T822] #6:
ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
[ 97.025259][ T822]
[ 97.025259][ T822] stack backtrace:
[ 97.026349][ T822] CPU: 3 PID: 822 Comm: ping Not tainted 5.10.0+ #819
[ 97.027609][ T822] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 97.029407][ T822] Call Trace:
[ 97.030015][ T822] dump_stack+0x99/0xcb
[ 97.030783][ T822] __lock_acquire.cold.77+0x149/0x3a9
[ 97.031773][ T822] ? stack_trace_save+0x81/0xa0
[ 97.032661][ T822] ? register_lock_class+0x1910/0x1910
[ 97.033673][ T822] ? register_lock_class+0x1910/0x1910
[ 97.034679][ T822] ? rcu_read_lock_sched_held+0x91/0xc0
[ 97.035697][ T822] ? rcu_read_lock_bh_held+0xa0/0xa0
[ 97.036690][ T822] lock_acquire+0x1b2/0x730
[ 97.037515][ T822] ? __dev_queue_xmit+0x1f52/0x2960
[ 97.038466][ T822] ? check_flags+0x50/0x50
[ 97.039277][ T822] ? netif_skb_features+0x296/0x9c0
[ 97.040226][ T822] ? validate_xmit_skb+0x29/0xb10
[ 97.041151][ T822] _raw_spin_lock+0x30/0x70
[ 97.041977][ T822] ? __dev_queue_xmit+0x1f52/0x2960
[ 97.042927][ T822] __dev_queue_xmit+0x1f52/0x2960
[ 97.043852][ T822] ? netdev_core_pick_tx+0x290/0x290
[ 97.044824][ T822] ? mark_held_locks+0xb7/0x120
[ 97.045712][ T822] ? lockdep_hardirqs_on_prepare+0x12c/0x3e0
[ 97.046824][ T822] ? __local_bh_enable_ip+0xa5/0xf0
[ 97.047771][ T822] ? ___neigh_create+0x12a8/0x1eb0
[ 97.048710][ T822] ? trace_hardirqs_on+0x41/0x120
[ 97.049626][ T822] ? ___neigh_create+0x12a8/0x1eb0
[ 97.050556][ T822] ? __local_bh_enable_ip+0xa5/0xf0
[ 97.051509][ T822] ? ___neigh_create+0x12a8/0x1eb0
[ 97.052443][ T822] ? check_chain_key+0x244/0x5f0
[ 97.053352][ T822] ? rcu_read_lock_bh_held+0x56/0xa0
[ 97.054317][ T822] ? ip_finish_output2+0x6ea/0x2020
[ 97.055263][ T822] ? pneigh_lookup+0x410/0x410
[ 97.056135][ T822] ip_finish_output2+0x6ea/0x2020
[ ... ]
Acked-by: Guillaume Nault <gnault@redhat.com>
Fixes:
571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20201228152136.24215-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Xie He [Mon, 28 Dec 2020 02:53:39 +0000 (18:53 -0800)]
net: hdlc_ppp: Fix issues when mod_timer is called while timer is running
[ Upstream commit
1fef73597fa545c35fddc953979013882fbd4e55 ]
ppp_cp_event is called directly or indirectly by ppp_rx with "ppp->lock"
held. It may call mod_timer to add a new timer. However, at the same time
ppp_timer may be already running and waiting for "ppp->lock". In this
case, there's no need for ppp_timer to continue running and it can just
exit.
If we let ppp_timer continue running, it may call add_timer. This causes
kernel panic because add_timer can't be called with a timer pending.
This patch fixes this problem.
Fixes:
e022c2f07ae5 ("WAN: new synchronous PPP implementation for generic HDLC.")
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cong Wang [Sat, 26 Dec 2020 23:44:53 +0000 (15:44 -0800)]
erspan: fix version 1 check in gre_parse_header()
[ Upstream commit
085c7c4e1c0e50d90b7d90f61a12e12b317a91e2 ]
Both version 0 and version 1 use ETH_P_ERSPAN, but version 0 does not
have an erspan header. So the check in gre_parse_header() is wrong,
we have to distinguish version 1 from version 0.
We can just check the gre header length like is_erspan_type1().
Fixes:
cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup")
Reported-by: syzbot+f583ce3d4ddf9836b27a@syzkaller.appspotmail.com
Cc: William Tu <u9012063@gmail.com>
Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yunjian Wang [Sat, 26 Dec 2020 08:10:05 +0000 (16:10 +0800)]
net: hns: fix return value check in __lb_other_process()
[ Upstream commit
5ede3ada3da7f050519112b81badc058190b9f9f ]
The function skb_copy() could return NULL, the return value
need to be checked.
Fixes:
b5996f11ea54 ("net: add Hisilicon Network Subsystem basic ethernet support")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Randy Dunlap [Fri, 25 Dec 2020 06:23:44 +0000 (22:23 -0800)]
net: sched: prevent invalid Scell_log shift count
[ Upstream commit
bd1248f1ddbc48b0c30565fce897a3b6423313b8 ]
Check Scell_log shift size in red_check_params() and modify all callers
of red_check_params() to pass Scell_log.
This prevents a shift out-of-bounds as detected by UBSAN:
UBSAN: shift-out-of-bounds in ./include/net/red.h:252:22
shift exponent 72 is too large for 32-bit type 'int'
Fixes:
8afa10cbe281 ("net_sched: red: Avoid illegal values")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: syzbot+97c5bd9cc81eca63d36e@syzkaller.appspotmail.com
Cc: Nogah Frankel <nogahf@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Guillaume Nault [Thu, 24 Dec 2020 19:01:09 +0000 (20:01 +0100)]
ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst()
[ Upstream commit
21fdca22eb7df2a1e194b8adb812ce370748b733 ]
RT_TOS() only clears one of the ECN bits. Therefore, when
fib_compute_spec_dst() resorts to a fib lookup, it can return
different results depending on the value of the second ECN bit.
For example, ECT(0) and ECT(1) packets could be treated differently.
$ ip netns add ns0
$ ip netns add ns1
$ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1
$ ip -netns ns0 link set dev lo up
$ ip -netns ns1 link set dev lo up
$ ip -netns ns0 link set dev veth01 up
$ ip -netns ns1 link set dev veth10 up
$ ip -netns ns0 address add 192.0.2.10/24 dev veth01
$ ip -netns ns1 address add 192.0.2.11/24 dev veth10
$ ip -netns ns1 address add 192.0.2.21/32 dev lo
$ ip -netns ns1 route add 192.0.2.10/32 tos 4 dev veth10 src 192.0.2.21
$ ip netns exec ns1 sysctl -wq net.ipv4.icmp_echo_ignore_broadcasts=0
With TOS 4 and ECT(1), ns1 replies using source address 192.0.2.21
(ping uses -Q to set all TOS and ECN bits):
$ ip netns exec ns0 ping -c 1 -b -Q 5 192.0.2.255
[...]
64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.544 ms
But with TOS 4 and ECT(0), ns1 replies using source address 192.0.2.11
because the "tos 4" route isn't matched:
$ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255
[...]
64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.597 ms
After this patch the ECN bits don't affect the result anymore:
$ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255
[...]
64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.591 ms
Fixes:
35ebf65e851c ("ipv4: Create and use fib_compute_spec_dst() helper.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vasundhara Volam [Sun, 27 Dec 2020 19:18:17 +0000 (14:18 -0500)]
bnxt_en: Fix AER recovery.
[ Upstream commit
fb1e6e562b37b39adfe251919c9abfdb3e01f921 ]
A recent change skips sending firmware messages to the firmware when
pci_channel_offline() is true during fatal AER error. To make this
complete, we need to move the re-initialization sequence to
bnxt_io_resume(), otherwise the firmware messages to re-initialize
will all be skipped. In any case, it is more correct to re-initialize
in bnxt_io_resume().
Also, fix the reverse x-mas tree format when defining variables
in bnxt_io_slot_reset().
Fixes:
b340dc680ed4 ("bnxt_en: Avoid sending firmware messages when AER error is detected.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Chulski [Wed, 23 Dec 2020 18:35:21 +0000 (20:35 +0200)]
net: mvpp2: fix pkt coalescing int-threshold configuration
[ Upstream commit
4f374d2c43a9e5e773f1dee56db63bd6b8a36276 ]
The packet coalescing interrupt threshold has separated registers
for different aggregated/cpu (sw-thread). The required value should
be loaded for every thread but not only for 1 current cpu.
Fixes:
213f428f5056 ("net: mvpp2: add support for TX interrupts and RX queue distribution modes")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608748521-11033-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Chan [Sun, 27 Dec 2020 19:18:18 +0000 (14:18 -0500)]
bnxt_en: Check TQM rings for maximum supported value.
[ Upstream commit
a029a2fef5d11bb85587433c3783615442abac96 ]
TQM rings are hardware resources that require host context memory
managed by the driver. The driver supports up to 9 TQM rings and
the number of rings to use is requested by firmware during run-time.
Cap this number to the maximum supported to prevent accessing beyond
the array. Future firmware may request more than 9 TQM rings. Define
macros to remove the magic number 9 from the C code.
Fixes:
ac3158cb0108 ("bnxt_en: Allocate TQM ring context memory according to fw specification.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mario Limonciello [Mon, 14 Dec 2020 19:29:35 +0000 (13:29 -0600)]
e1000e: Export S0ix flags to ethtool
[ Upstream commit
3c98cbf22a96c1b12f48c1b2a4680dfe5cb280f9 ]
This flag can be used by an end user to disable S0ix flows on a
buggy system or by an OEM for development purposes.
If you need this flag to be persisted across reboots, it's suggested
to use a udev rule to call adjust it until the kernel could have your
configuration in a disallow list.
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Yijun Shen <Yijun.shen@dell.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mario Limonciello [Mon, 14 Dec 2020 19:29:34 +0000 (13:29 -0600)]
Revert "e1000e: disable s0ix entry and exit flows for ME systems"
[ Upstream commit
6cecf02e77ab9bf97e9252f9fcb8f0738a6de12c ]
commit
e086ba2fccda ("e1000e: disable s0ix entry and exit flows for ME
systems") disabled s0ix flows for systems that have various incarnations of
the i219-LM ethernet controller. This changed caused power consumption
regressions on the following shipping Dell Comet Lake based laptops:
* Latitude 5310
* Latitude 5410
* Latitude 5410
* Latitude 5510
* Precision 3550
* Latitude 5411
* Latitude 5511
* Precision 3551
* Precision 7550
* Precision 7750
This commit was introduced because of some regressions on certain Thinkpad
laptops. This comment was potentially caused by an earlier
commit
632fbd5eb5b0e ("e1000e: fix S0ix flows for cable connected case").
or it was possibly caused by a system not meeting platform architectural
requirements for low power consumption. Other changes made in the driver
with extended timeouts are expected to make the driver more impervious to
platform firmware behavior.
Fixes:
e086ba2fccda ("e1000e: disable s0ix entry and exit flows for ME systems")
Reviewed-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Yijun Shen <Yijun.shen@dell.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mario Limonciello [Mon, 14 Dec 2020 19:29:33 +0000 (13:29 -0600)]
e1000e: bump up timeout to wait when ME un-configures ULP mode
[ Upstream commit
3cf31b1a9effd859bb3d6ff9f8b5b0d5e6cac952 ]
Per guidance from Intel ethernet architecture team, it may take
up to 1 second for unconfiguring ULP mode.
However in practice this seems to be taking up to 2 seconds on
some Lenovo machines. Detect scenarios that take more than 1 second
but less than 2.5 seconds and emit a warning on resume for those
scenarios.
Suggested-by: Aaron Ma <aaron.ma@canonical.com>
Suggested-by: Sasha Netfin <sasha.neftin@intel.com>
Suggested-by: Hans de Goede <hdegoede@redhat.com>
CC: Mark Pearson <markpearson@lenovo.com>
Fixes:
f15bb6dde738cc8fa0 ("e1000e: Add support for S0ix")
BugLink: https://bugs.launchpad.net/bugs/1865570
Link: https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20200323191639.48826-1-aaron.ma@canonical.com/
Link: https://lkml.org/lkml/2020/12/13/15
Link: https://lkml.org/lkml/2020/12/14/708
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Yijun Shen <Yijun.shen@dell.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mario Limonciello [Mon, 14 Dec 2020 19:29:32 +0000 (13:29 -0600)]
e1000e: Only run S0ix flows if shutdown succeeded
[ Upstream commit
808e0d8832cc81738f3e8df12dff0688352baf50 ]
If the shutdown failed, the part will be thawed and running
S0ix flows will put it into an undefined state.
Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
Reviewed-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Tested-by: Yijun Shen <Yijun.shen@dell.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yunjian Wang [Fri, 25 Dec 2020 02:52:16 +0000 (10:52 +0800)]
tun: fix return value when the number of iovs exceeds MAX_SKB_FRAGS
[ Upstream commit
950271d7cc0b4546af3549d8143c4132d6e1f138 ]
Currently the tun_napi_alloc_frags() function returns -ENOMEM when the
number of iovs exceeds MAX_SKB_FRAGS + 1. However this is inappropriate,
we should use -EMSGSIZE instead of -ENOMEM.
The following distinctions are matters:
1. the caller need to drop the bad packet when -EMSGSIZE is returned,
which means meeting a persistent failure.
2. the caller can try again when -ENOMEM is returned, which means
meeting a transient failure.
Fixes:
90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/1608864736-24332-1-git-send-email-wangyunjian@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Grygorii Strashko [Thu, 24 Dec 2020 16:24:05 +0000 (18:24 +0200)]
net: ethernet: ti: cpts: fix ethtool output when no ptp_clock registered
[ Upstream commit
4614792eebcbf81c60ad3604c1aeeb2b0899cea4 ]
The CPTS driver registers PTP PHC clock when first netif is going up and
unregister it when all netif are down. Now ethtool will show:
- PTP PHC clock index 0 after boot until first netif is up;
- the last assigned PTP PHC clock index even if PTP PHC clock is not
registered any more after all netifs are down.
This patch ensures that -1 is returned by ethtool when PTP PHC clock is not
registered any more.
Fixes:
8a2c9a5ab4b9 ("net: ethernet: ti: cpts: rework initialization/deinitialization")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20201224162405.28032-1-grygorii.strashko@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Antoine Tenart [Wed, 23 Dec 2020 21:23:23 +0000 (22:23 +0100)]
net-sysfs: take the rtnl lock when accessing xps_rxqs_map and num_tc
[ Upstream commit
4ae2bb81649dc03dfc95875f02126b14b773f7ab ]
Accesses to dev->xps_rxqs_map (when using dev->num_tc) should be
protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
see an actual bug being triggered, but let's be safe here and take the
rtnl lock while accessing the map in sysfs.
Fixes:
8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Antoine Tenart [Wed, 23 Dec 2020 21:23:22 +0000 (22:23 +0100)]
net-sysfs: take the rtnl lock when storing xps_rxqs
[ Upstream commit
2d57b4f142e0b03e854612b8e28978935414bced ]
Two race conditions can be triggered when storing xps rxqs, resulting in
various oops and invalid memory accesses:
1. Calling netdev_set_num_tc while netif_set_xps_queue:
- netif_set_xps_queue uses dev->tc_num as one of the parameters to
compute the size of new_dev_maps when allocating it. dev->tc_num is
also used to access the map, and the compiler may generate code to
retrieve this field multiple times in the function.
- netdev_set_num_tc sets dev->tc_num.
If new_dev_maps is allocated using dev->tc_num and then dev->tc_num
is set to a higher value through netdev_set_num_tc, later accesses to
new_dev_maps in netif_set_xps_queue could lead to accessing memory
outside of new_dev_maps; triggering an oops.
2. Calling netif_set_xps_queue while netdev_set_num_tc is running:
2.1. netdev_set_num_tc starts by resetting the xps queues,
dev->tc_num isn't updated yet.
2.2. netif_set_xps_queue is called, setting up the map with the
*old* dev->num_tc.
2.3. netdev_set_num_tc updates dev->tc_num.
2.4. Later accesses to the map lead to out of bound accesses and
oops.
A similar issue can be found with netdev_reset_tc.
One way of triggering this is to set an iface up (for which the driver
uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
xps_rxqs in a concurrent thread. With the right timing an oops is
triggered.
Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
and netdev_reset_tc should be mutually exclusive. We do that by taking
the rtnl lock in xps_rxqs_store.
Fixes:
8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Antoine Tenart [Wed, 23 Dec 2020 21:23:21 +0000 (22:23 +0100)]
net-sysfs: take the rtnl lock when accessing xps_cpus_map and num_tc
[ Upstream commit
fb25038586d0064123e393cadf1fadd70a9df97a ]
Accesses to dev->xps_cpus_map (when using dev->num_tc) should be
protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
see an actual bug being triggered, but let's be safe here and take the
rtnl lock while accessing the map in sysfs.
Fixes:
184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Antoine Tenart [Wed, 23 Dec 2020 21:23:20 +0000 (22:23 +0100)]
net-sysfs: take the rtnl lock when storing xps_cpus
[ Upstream commit
1ad58225dba3f2f598d2c6daed4323f24547168f ]
Two race conditions can be triggered when storing xps cpus, resulting in
various oops and invalid memory accesses:
1. Calling netdev_set_num_tc while netif_set_xps_queue:
- netif_set_xps_queue uses dev->tc_num as one of the parameters to
compute the size of new_dev_maps when allocating it. dev->tc_num is
also used to access the map, and the compiler may generate code to
retrieve this field multiple times in the function.
- netdev_set_num_tc sets dev->tc_num.
If new_dev_maps is allocated using dev->tc_num and then dev->tc_num
is set to a higher value through netdev_set_num_tc, later accesses to
new_dev_maps in netif_set_xps_queue could lead to accessing memory
outside of new_dev_maps; triggering an oops.
2. Calling netif_set_xps_queue while netdev_set_num_tc is running:
2.1. netdev_set_num_tc starts by resetting the xps queues,
dev->tc_num isn't updated yet.
2.2. netif_set_xps_queue is called, setting up the map with the
*old* dev->num_tc.
2.3. netdev_set_num_tc updates dev->tc_num.
2.4. Later accesses to the map lead to out of bound accesses and
oops.
A similar issue can be found with netdev_reset_tc.
One way of triggering this is to set an iface up (for which the driver
uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
xps_cpus in a concurrent thread. With the right timing an oops is
triggered.
Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
and netdev_reset_tc should be mutually exclusive. We do that by taking
the rtnl lock in xps_cpus_store.
Fixes:
184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dinghao Liu [Wed, 23 Dec 2020 11:06:12 +0000 (19:06 +0800)]
net: ethernet: Fix memleak in ethoc_probe
[ Upstream commit
5d41f9b7ee7a5a5138894f58846a4ffed601498a ]
When mdiobus_register() fails, priv->mdio allocated
by mdiobus_alloc() has not been freed, which leads
to memleak.
Fixes:
e7f4dc3536a4 ("mdio: Move allocation of interrupts into core")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201223110615.31389-1-dinghao.liu@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Wang [Wed, 23 Dec 2020 05:55:23 +0000 (13:55 +0800)]
net/ncsi: Use real net-device for response handler
[ Upstream commit
427c940558560bff2583d07fc119a21094675982 ]
When aggregating ncsi interfaces and dedicated interfaces to bond
interfaces, the ncsi response handler will use the wrong net device to
find ncsi_dev, so that the ncsi interface will not work properly.
Here, we use the original net device to fix it.
Fixes:
138635cc27c9 ("net/ncsi: NCSI response packet handler")
Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
Link: https://lore.kernel.org/r/20201223055523.2069-1-wangzhiqiang.bj@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jeff Dike [Wed, 23 Dec 2020 02:54:21 +0000 (21:54 -0500)]
virtio_net: Fix recursive call to cpus_read_lock()
[ Upstream commit
de33212f768c5d9e2fe791b008cb26f92f0aa31c ]
virtnet_set_channels can recursively call cpus_read_lock if CONFIG_XPS
and CONFIG_HOTPLUG are enabled.
The path is:
virtnet_set_channels - calls get_online_cpus(), which is a trivial
wrapper around cpus_read_lock()
netif_set_real_num_tx_queues
netif_reset_xps_queues_gt
netif_reset_xps_queues - calls cpus_read_lock()
This call chain and potential deadlock happens when the number of TX
queues is reduced.
This commit the removes netif_set_real_num_[tr]x_queues calls from
inside the get/put_online_cpus section, as they don't require that it
be held.
Fixes:
47be24796c13 ("virtio-net: fix the set affinity bug when CPU IDs are not consecutive")
Signed-off-by: Jeff Dike <jdike@akamai.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20201223025421.671-1-jdike@akamai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manish Chopra [Mon, 21 Dec 2020 14:55:30 +0000 (06:55 -0800)]
qede: fix offload for IPIP tunnel packets
[ Upstream commit
5d5647dad259bb416fd5d3d87012760386d97530 ]
IPIP tunnels packets are unknown to device,
hence these packets are incorrectly parsed and
caused the packet corruption, so disable offlods
for such packets at run time.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Sudarsana Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Link: https://lore.kernel.org/r/20201221145530.7771-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dinghao Liu [Sun, 20 Dec 2020 08:29:30 +0000 (16:29 +0800)]
net: ethernet: mvneta: Fix error handling in mvneta_probe
[ Upstream commit
58f60329a6be35a5653edb3fd2023ccef9eb9943 ]
When mvneta_port_power_up() fails, we should execute
cleanup functions after label err_netdev to avoid memleak.
Fixes:
41c2b6b4f0f80 ("net: ethernet: mvneta: Add back interface mode validation")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20201220082930.21623-1-dinghao.liu@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lijun Pan [Sat, 19 Dec 2020 21:40:34 +0000 (15:40 -0600)]
ibmvnic: continue fatal error reset after passive init
[ Upstream commit
1f45dc22066797479072978feeada0852502e180 ]
Commit
f9c6cea0b385 ("ibmvnic: Skip fatal error reset after passive init")
says "If the passive
CRQ initialization occurs before the FATAL reset task is processed,
the FATAL error reset task would try to access a CRQ message queue
that was freed, causing an oops. The problem may be most likely to
occur during DLPAR add vNIC with a non-default MTU, because the DLPAR
process will automatically issue a change MTU request.
Fix this by not processing fatal error reset if CRQ is passively
initialized after client-driven CRQ initialization fails."
Even with this commit, we still see similar kernel crashes. In order
to completely solve this problem, we'd better continue the fatal error
reset, capture the kernel crash, and try to fix it from that end.
Fixes:
f9c6cea0b385 ("ibmvnic: Skip fatal error reset after passive init")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20201219214034.21123-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lijun Pan [Sat, 19 Dec 2020 21:39:19 +0000 (15:39 -0600)]
ibmvnic: fix login buffer memory leak
[ Upstream commit
a0c8be56affa7d5ffbdec24c992223be54db3b6e ]
Commit
34f0f4e3f488 ("ibmvnic: Fix login buffer memory leaks") frees
login_rsp_buffer in release_resources() and send_login()
because handle_login_rsp() does not free it.
Commit
f3ae59c0c015 ("ibmvnic: store RX and TX subCRQ handle array in
ibmvnic_adapter struct") frees login_rsp_buffer in handle_login_rsp().
It seems unnecessary to free it in release_resources() and send_login().
There are chances that handle_login_rsp returns earlier without freeing
buffers. Double-checking the buffer is harmless since
release_login_buffer and release_login_rsp_buffer will
do nothing if buffer is already freed.
Fixes:
f3ae59c0c015 ("ibmvnic: store RX and TX subCRQ handle array in ibmvnic_adapter struct")
Fixes:
34f0f4e3f488 ("ibmvnic: Fix login buffer memory leaks")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20201219213919.21045-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martin Blumenstingl [Sat, 19 Dec 2020 13:50:36 +0000 (14:50 +0100)]
net: stmmac: dwmac-meson8b: ignore the second clock input
[ Upstream commit
f87777a3c30cf50c66a20e1d153f0e003bb30774 ]
The dwmac glue registers on Amlogic Meson8b and newer SoCs has two clock
inputs:
- Meson8b and Meson8m2: MPLL2 and MPLL2 (the same parent is wired to
both inputs)
- GXBB, GXL, GXM, AXG, G12A, G12B, SM1: FCLK_DIV2 and MPLL2
All known vendor kernels and u-boots are using the first input only. We
let the common clock framework automatically choose the "right" parent.
For some boards this causes a problem though, specificially with G12A and
newer SoCs. The clock input is used for generating the 125MHz RGMII TX
clock. For the two input clocks this means on G12A:
- FCLK_DIV2: 999999985Hz / 8 =
124999998.125Hz
- MPLL2: 499999993Hz / 4 =
124999998.25Hz
In theory MPLL2 is the "better" clock input because it's gets us 0.125Hz
closer to the requested frequency than FCLK_DIV2. In reality however
there is a resource conflict because MPLL2 is needed to generate some of
the audio clocks. dwmac-meson8b probes first and sets up the clock tree
with MPLL2. This works fine until the audio driver comes and "steals"
the MPLL2 clocks and configures it with it's own rate (294909637Hz). The
common clock framework happily changes the MPLL2 rate but does not
reconfigure our RGMII TX clock tree, which then ends up at 73727409Hz,
which is more than 40% off the requested 125MHz.
Don't use the second clock input for now to force the common clock
framework to always select the first parent. This mimics the behavior
from the vendor driver and fixes the clock resource conflict with the
audio driver on G12A boards. Once the common clock framework can handle
this situation this change can be reverted again.
Fixes:
566e8251625304 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
Reported-by: Thomas Graichen <thomas.graichen@gmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Tested-by: thomas graichen <thomas.graichen@gmail.com>
Link: https://lore.kernel.org/r/20201219135036.3216017-1-martin.blumenstingl@googlemail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Chulski [Sun, 20 Dec 2020 11:02:29 +0000 (13:02 +0200)]
net: mvpp2: Fix GoP port 3 Networking Complex Control configurations
[ Upstream commit
2575bc1aa9d52a62342b57a0b7d0a12146cf6aed ]
During GoP port 2 Networking Complex Control mode of operation configurations,
also GoP port 3 mode of operation was wrongly set.
Patch removes these configurations.
Fixes:
f84bf386f395 ("net: mvpp2: initialize the GoP")
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608462149-1702-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Sat, 19 Dec 2020 11:01:44 +0000 (14:01 +0300)]
atm: idt77252: call pci_disable_device() on error path
[ Upstream commit
8df66af5c1e5f80562fe728db5ec069b21810144 ]
This error path needs to disable the pci device before returning.
Fixes:
ede58ef28e10 ("atm: remove deprecated use of pci api")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/X93dmC4NX0vbTpGp@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Shannon Nelson [Fri, 18 Dec 2020 21:50:01 +0000 (13:50 -0800)]
ionic: account for vlan tag len in rx buffer len
[ Upstream commit
83469893204281ecf65d572bddf02de29a19787c ]
Let the FW know we have enough receive buffer space for the
vlan tag if it isn't stripped.
Fixes:
0f3154e6bcb3 ("ionic: Add Tx and Rx handling")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Link: https://lore.kernel.org/r/20201218215001.64696-1-snelson@pensando.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rasmus Villemoes [Fri, 18 Dec 2020 10:55:36 +0000 (11:55 +0100)]
ethernet: ucc_geth: set dev->max_mtu to 1518
[ Upstream commit
1385ae5c30f238f81bc6528d897c6d7a0816783f ]
All the buffers and registers are already set up appropriately for an
MTU slightly above 1500, so we just need to expose this to the
networking stack. AFAICT, there's no need to implement .ndo_change_mtu
when the receive buffers are always set up to support the max_mtu.
This fixes several warnings during boot on our mpc8309-board with an
embedded mv88e6250 switch:
mv88e6085 mdio@
e0102120:10: nonfatal error -34 setting MTU 1500 on port 0
...
mv88e6085 mdio@
e0102120:10: nonfatal error -34 setting MTU 1500 on port 4
ucc_geth
e0102000.ethernet eth1: error -22 setting MTU to 1504 to include DSA overhead
The last line explains what the DSA stack tries to do: achieving an MTU
of 1500 on-the-wire requires that the master netdevice connected to
the CPU port supports an MTU of 1500+the tagging overhead.
Fixes:
bfcb813203e6 ("net: dsa: configure the MTU for switch ports")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rasmus Villemoes [Fri, 18 Dec 2020 10:55:38 +0000 (11:55 +0100)]
ethernet: ucc_geth: fix use-after-free in ucc_geth_remove()
[ Upstream commit
e925e0cd2a705aaacb0b907bb3691fcac3a973a4 ]
ugeth is the netdiv_priv() part of the netdevice. Accessing the memory
pointed to by ugeth (such as done by ucc_geth_memclean() and the two
of_node_puts) after free_netdev() is thus use-after-free.
Fixes:
80a9fad8e89a ("ucc_geth: fix module removal")
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Fri, 18 Dec 2020 17:38:43 +0000 (09:38 -0800)]
net: systemport: set dev->max_mtu to UMAC_MAX_MTU_SIZE
[ Upstream commit
54ddbdb024882e226055cc4c3c246592ddde2ee5 ]
The driver is already allocating receive buffers of 2KiB and the
Ethernet MAC is configured to accept frames up to UMAC_MAX_MTU_SIZE.
Fixes:
bfcb813203e6 ("net: dsa: configure the MTU for switch ports")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20201218173843.141046-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Chulski [Thu, 17 Dec 2020 18:37:46 +0000 (20:37 +0200)]
net: mvpp2: prs: fix PPPoE with ipv6 packet parse
[ Upstream commit
fec6079b2eeab319d9e3d074f54d3b6f623e9701 ]
Current PPPoE+IPv6 entry is jumping to 'next-hdr'
field and not to 'DIP' field as done for IPv4.
Fixes:
3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Reported-by: Liron Himi <lironh@marvell.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608230266-22111-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Chulski [Thu, 17 Dec 2020 18:30:17 +0000 (20:30 +0200)]
net: mvpp2: Add TCAM entry to drop flow control pause frames
[ Upstream commit
3f48fab62bb81a7f9d01e9d43c40395fad011dd5 ]
Issue:
Flow control frame used to pause GoP(MAC) was delivered to the CPU
and created a load on the CPU. Since XOFF/XON frames are used only
by MAC, these frames should be dropped inside MAC.
Fix:
According to 802.3-2012 - IEEE Standard for Ethernet pause frame
has unique destination MAC address 01-80-C2-00-00-01.
Add TCAM parser entry to track and drop pause frames by destination MAC.
Fixes:
3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608229817-21951-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davide Caratti [Thu, 17 Dec 2020 21:29:46 +0000 (22:29 +0100)]
net/sched: sch_taprio: ensure to reset/destroy all child qdiscs
[ Upstream commit
698285da79f5b0b099db15a37ac661ac408c80eb ]
taprio_graft() can insert a NULL element in the array of child qdiscs. As
a consquence, taprio_reset() might not reset child qdiscs completely, and
taprio_destroy() might leak resources. Fix it by ensuring that loops that
iterate over q->qdiscs[] don't end when they find the first NULL item.
Fixes:
44d4775ca518 ("net/sched: sch_taprio: reset child qdiscs before freeing them")
Fixes:
5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/13edef6778fef03adc751582562fba4a13e06d6a.1608240532.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jakub Kicinski [Thu, 3 Dec 2020 02:18:06 +0000 (18:18 -0800)]
iavf: fix double-release of rtnl_lock
[ Upstream commit
f1340265726e0edf8a8cef28e665b28ad6302ce9 ]
This code does not jump to exit on an error in iavf_lan_add_device(),
so the rtnl_unlock() from the normal path will follow.
Fixes:
b66c7bc1cd4d ("iavf: Refactor init state machine")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sylwester Dziedziuch [Thu, 22 Oct 2020 10:39:36 +0000 (12:39 +0200)]
i40e: Fix Error I40E_AQ_RC_EINVAL when removing VFs
[ Upstream commit
3ac874fa84d1baaf0c0175f2a1499f5d88d528b2 ]
When removing VFs for PF added to bridge there was
an error I40E_AQ_RC_EINVAL. It was caused by not properly
resetting and reinitializing PF when adding/removing VFs.
Changed how reset is performed when adding/removing VFs
to properly reinitialize PFs VSI.
Fixes:
fc60861e9b00 ("i40e: start up in VEPA mode by default")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Sat, 9 Jan 2021 12:46:25 +0000 (13:46 +0100)]
Linux 5.10.6
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210107143052.392839477@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Zhang Xiaohui [Sun, 6 Dec 2020 08:48:01 +0000 (16:48 +0800)]
mwifiex: Fix possible buffer overflows in mwifiex_cmd_802_11_ad_hoc_start
[ Upstream commit
5c455c5ab332773464d02ba17015acdca198f03d ]
mwifiex_cmd_802_11_ad_hoc_start() calls memcpy() without checking
the destination size may trigger a buffer overflower,
which a local user could use to cause denial of service
or the execution of arbitrary code.
Fix it by putting the length check before calling memcpy().
Signed-off-by: Zhang Xiaohui <ruc_zhangxiaohui@163.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201206084801.26479-1-ruc_zhangxiaohui@163.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Eric W. Biederman [Thu, 3 Dec 2020 20:12:00 +0000 (14:12 -0600)]
exec: Transform exec_update_mutex into a rw_semaphore
[ Upstream commit
f7cfd871ae0c5008d94b6f66834e7845caa93c15 ]
Recently syzbot reported[0] that there is a deadlock amongst the users
of exec_update_mutex. The problematic lock ordering found by lockdep
was:
perf_event_open (exec_update_mutex -> ovl_i_mutex)
chown (ovl_i_mutex -> sb_writes)
sendfile (sb_writes -> p->lock)
by reading from a proc file and writing to overlayfs
proc_pid_syscall (p->lock -> exec_update_mutex)
While looking at possible solutions it occured to me that all of the
users and possible users involved only wanted to state of the given
process to remain the same. They are all readers. The only writer is
exec.
There is no reason for readers to block on each other. So fix
this deadlock by transforming exec_update_mutex into a rw_semaphore
named exec_update_lock that only exec takes for writing.
Cc: Jann Horn <jannh@google.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christopher Yeoh <cyeoh@au1.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Fixes:
eea9673250db ("exec: Add exec_update_mutex to replace cred_guard_mutex")
[0] https://lkml.kernel.org/r/
00000000000063640c05ade8e3de@google.com
Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/87ft4mbqen.fsf@x220.int.ebiederm.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Eric W. Biederman [Thu, 3 Dec 2020 20:11:13 +0000 (14:11 -0600)]
rwsem: Implement down_read_interruptible
[ Upstream commit
31784cff7ee073b34d6eddabb95e3be2880a425c ]
In preparation for converting exec_update_mutex to a rwsem so that
multiple readers can execute in parallel and not deadlock, add
down_read_interruptible. This is needed for perf_event_open to be
converted (with no semantic changes) from working on a mutex to
wroking on a rwsem.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/87k0tybqfy.fsf@x220.int.ebiederm.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Eric W. Biederman [Thu, 3 Dec 2020 20:10:32 +0000 (14:10 -0600)]
rwsem: Implement down_read_killable_nested
[ Upstream commit
0f9368b5bf6db0c04afc5454b1be79022a681615 ]
In preparation for converting exec_update_mutex to a rwsem so that
multiple readers can execute in parallel and not deadlock, add
down_read_killable_nested. This is needed so that kcmp_lock
can be converted from working on a mutexes to working on rw_semaphores.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/87o8jabqh3.fsf@x220.int.ebiederm.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
peterz@infradead.org [Fri, 28 Aug 2020 12:37:20 +0000 (14:37 +0200)]
perf: Break deadlock involving exec_update_mutex
[ Upstream commit
78af4dc949daaa37b3fcd5f348f373085b4e858f ]
Syzbot reported a lock inversion involving perf. The sore point being
perf holding exec_update_mutex() for a very long time, specifically
across a whole bunch of filesystem ops in pmu::event_init() (uprobes)
and anon_inode_getfile().
This then inverts against procfs code trying to take
exec_update_mutex.
Move the permission checks later, such that we need to hold the mutex
over less code.
Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Miklos Szeredi [Thu, 10 Dec 2020 14:33:14 +0000 (15:33 +0100)]
fuse: fix bad inode
[ Upstream commit
5d069dbe8aaf2a197142558b6fb2978189ba3454 ]
Jan Kara's analysis of the syzbot report (edited):
The reproducer opens a directory on FUSE filesystem, it then attaches
dnotify mark to the open directory. After that a fuse_do_getattr() call
finds that attributes returned by the server are inconsistent, and calls
make_bad_inode() which, among other things does:
inode->i_mode = S_IFREG;
This then confuses dnotify which doesn't tear down its structures
properly and eventually crashes.
Avoid calling make_bad_inode() on a live inode: switch to a private flag on
the fuse inode. Also add the test to ops which the bad_inode_ops would
have caught.
This bug goes back to the initial merge of fuse in 2.6.14...
Reported-by: syzbot+f427adf9324b92652ccc@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jason Gunthorpe [Fri, 6 Nov 2020 14:00:49 +0000 (10:00 -0400)]
RDMA/siw,rxe: Make emulated devices virtual in the device tree
[ Upstream commit
a9d2e9ae953f0ddd0327479c81a085adaa76d903 ]
This moves siw and rxe to be virtual devices in the device tree:
lrwxrwxrwx 1 root root 0 Nov 6 13:55 /sys/class/infiniband/rxe0 -> ../../devices/virtual/infiniband/rxe0/
Previously they were trying to parent themselves to the physical device of
their attached netdev, which doesn't make alot of sense.
My hope is this will solve some weird syzkaller hits related to sysfs as
it could be possible that the parent of a netdev is another netdev, eg
under bonding or some other syzkaller found netdev configuration.
Nesting a ib_device under anything but a physical device is going to cause
inconsistencies in sysfs during destructions.
Link: https://lore.kernel.org/r/0-v1-dcbfc68c4b4a+d6-virtual_dev_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Christoph Hellwig [Fri, 6 Nov 2020 18:19:38 +0000 (19:19 +0100)]
RDMA/core: remove use of dma_virt_ops
[ Upstream commit
5a7a9e038b032137ae9c45d5429f18a2ffdf7d42 ]
Use the ib_dma_* helpers to skip the DMA translation instead. This
removes the last user if dma_virt_ops and keeps the weird layering
violation inside the RDMA core instead of burderning the DMA mapping
subsystems with it. This also means the software RDMA drivers now don't
have to mess with DMA parameters that are not relevant to them at all, and
that in the future we can use PCI P2P transfers even for software RDMA, as
there is no first fake layer of DMA mapping that the P2P DMA support.
Link: https://lore.kernel.org/r/20201106181941.1878556-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Stanley Chu [Tue, 8 Dec 2020 13:56:34 +0000 (21:56 +0800)]
scsi: ufs: Re-enable WriteBooster after device reset
[ Upstream commit
bd14bf0e4a084514aa62d24d2109e0f09a93822f ]
UFS 3.1 specification mentions that the WriteBooster flags listed below
will be set to their default values, i.e. disabled, after power cycle or
any type of reset event. Thus we need to reset the flag variables kept in
struct hba to align with the device status and ensure that
WriteBooster-related functions are configured properly after device reset.
Without this fix, WriteBooster will not be enabled successfully after by
ufshcd_wb_ctrl() after device reset because hba->wb_enabled remains true.
Flags required to be reset to default values:
- fWriteBoosterEn: hba->wb_enabled
- fWriteBoosterBufferFlushEn: hba->wb_buf_flush_enabled
- fWriteBoosterBufferFlushDuringHibernate: No variable mapped
Link: https://lore.kernel.org/r/20201208135635.15326-2-stanley.chu@mediatek.com
Fixes:
3d17b9b5ab11 ("scsi: ufs: Add write booster feature support")
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Adrian Hunter [Tue, 3 Nov 2020 14:14:03 +0000 (16:14 +0200)]
scsi: ufs: Allow an error return value from ->device_reset()
[ Upstream commit
151f1b664ffbb847c7fbbce5a5b8580f1b9b1d98 ]
It is simpler for drivers to provide a ->device_reset() callback
irrespective of whether the GPIO, or firmware interface necessary to do the
reset, is discovered during probe.
Change ->device_reset() to return an error code. Drivers that provide the
callback, but do not do the reset operation should return -EOPNOTSUPP.
Link: https://lore.kernel.org/r/20201103141403.2142-3-adrian.hunter@intel.com
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Bean huo <beanhuo@micron.com>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Imre Deak [Sat, 3 Oct 2020 00:18:46 +0000 (03:18 +0300)]
drm/i915/tgl: Fix Combo PHY DPLL fractional divider for 38.4MHz ref clock
commit
0e2497e334de42dbaaee8e325241b5b5b34ede7e upstream.
Apply Display WA #
22010492432 for combo PHY PLLs too. This should fix a
problem where the PLL output frequency is slightly off with the current
PLL fractional divider value.
I haven't seen an actual case where this causes a problem, but let's
follow the spec. It's also needed on some EHL platforms, but for that we
also need a way to distinguish the affected EHL SKUs, so I leave that
for a follow-up.
v2:
- Apply the WA at one place when calculating the PLL dividers from the
frequency and the frequency from the dividers for all the combo PLL
use cases (DP, HDMI, TBT). (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201003001846.1271151-6-imre.deak@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Fri, 1 Jan 2021 08:38:52 +0000 (09:38 +0100)]
ALSA: hda/hdmi: Fix incorrect mutex unlock in silent_stream_disable()
commit
3d5c5fdcee0f9a94deb0472e594706018b00aa31 upstream.
The silent_stream_disable() function introduced by the commit
b1a5039759cb ("ALSA: hda/hdmi: fix silent stream for first playback to
DP") takes the per_pin->lock mutex, but it unlocks the wrong one,
spec->pcm_lock, which causes a deadlock. This patch corrects it.
Fixes:
b1a5039759cb ("ALSA: hda/hdmi: fix silent stream for first playback to DP")
Reported-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Cc: <stable@vger.kernel.org>
Acked-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://lore.kernel.org/r/20210101083852.12094-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kailang Yang [Wed, 23 Dec 2020 07:34:57 +0000 (15:34 +0800)]
ALSA: hda/realtek - Modify Dell platform name
commit
c1e8952395c1f44a6304c71401519d19ed2ac56a upstream.
Dell platform SSID:0x0a58 change platform name.
Use the generic name instead for avoiding confusion.
Fixes:
150927c3674d ("ALSA: hda/realtek - Supported Dell fixed type headset")
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/efe7c196158241aa817229df7835d645@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Edward Vear [Tue, 27 Oct 2020 07:02:03 +0000 (00:02 -0700)]
Bluetooth: Fix attempting to set RPA timeout when unsupported
commit
a31489d2a368d2f9225ed6a6f595c63bc7d10de8 upstream.
During controller initialization, an LE Set RPA Timeout command is sent
to the controller if supported. However, the value checked to determine
if the command is supported is incorrect. Page 1921 of the Bluetooth
Core Spec v5.2 shows that bit 2 of octet 35 of the Supported_Commands
field corresponds to the LE Set RPA Timeout command, but currently
bit 6 of octet 35 is checked. This patch checks the correct value
instead.
This issue led to the error seen in the following btmon output during
initialization of an adapter (rtl8761b) and prevented initialization
from completing.
< HCI Command: LE Set Resolvable Private Address Timeout (0x08|0x002e) plen 2
Timeout: 900 seconds
> HCI Event: Command Complete (0x0e) plen 4
LE Set Resolvable Private Address Timeout (0x08|0x002e) ncmd 2
Status: Unsupported Remote Feature / Unsupported LMP Feature (0x1a)
= Close Index: 00:E0:4C:6B:E5:03
The error did not appear when running with this patch.
Signed-off-by: Edward Vear <edwardvear@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Josh Poimboeuf [Tue, 29 Dec 2020 23:14:55 +0000 (15:14 -0800)]
kdev_t: always inline major/minor helper functions
commit
aa8c7db494d0a83ecae583aa193f1134ef25d506 upstream.
Silly GCC doesn't always inline these trivial functions.
Fixes the following warning:
arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled
Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> [build-tested]
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rasmus Villemoes [Fri, 18 Dec 2020 10:10:53 +0000 (11:10 +0100)]
dt-bindings: rtc: add reset-source property
commit
320d159e2d63a97a40f24cd6dfda5a57eec65b91 upstream.
Some RTCs, e.g. the pcf2127, can be used as a hardware watchdog. But
if the reset pin is not actually wired up, the driver exposes a
watchdog device that doesn't actually work.
Provide a standard binding that can be used to indicate that a given
RTC can perform a reset of the machine, similar to wakeup-source.
Suggested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201218101054.25416-2-rasmus.villemoes@prevas.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Uwe Kleine-König [Fri, 18 Dec 2020 10:10:54 +0000 (11:10 +0100)]
rtc: pcf2127: only use watchdog when explicitly available
commit
71ac13457d9d1007effde65b54818106b2c2b525 upstream.
Most boards using the pcf2127 chip (in my bubble) don't make use of the
watchdog functionality and the respective output is not connected. The
effect on such a board is that there is a watchdog device provided that
doesn't work.
So only register the watchdog if the device tree has a "reset-source"
property.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
[RV: s/has-watchdog/reset-source/]
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201218101054.25416-3-rasmus.villemoes@prevas.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Uwe Kleine-König [Thu, 24 Sep 2020 10:52:55 +0000 (12:52 +0200)]
rtc: pcf2127: move watchdog initialisation to a separate function
commit
5d78533a0c53af9659227c803df944ba27cd56e0 upstream.
The obvious advantages are:
- The linker can drop the watchdog functions if CONFIG_WATCHDOG is off.
- All watchdog stuff grouped together with only a single function call
left in generic code.
- Watchdog register is only read when it is actually used.
- Less #ifdefery
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20200924105256.18162-2-u.kleine-koenig@pengutronix.de
Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Felix Fietkau [Tue, 5 Jan 2021 10:18:21 +0000 (11:18 +0100)]
Revert "mtd: spinand: Fix OOB read"
This reverts stable commit
baad618d078c857f99cc286ea249e9629159901f.
This commit is adding lines to spinand_write_to_cache_op, wheras the upstream
commit
868cbe2a6dcee451bd8f87cbbb2a73cf463b57e5 that this was supposed to
backport was touching spinand_read_from_cache_op.
It causes a crash on writing OOB data by attempting to write to read-only
kernel memory.
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Deucher [Tue, 5 Jan 2021 16:45:45 +0000 (11:45 -0500)]
Revert "drm/amd/display: Fix memory leaks in S3 resume"
This reverts commit
a135a1b4c4db1f3b8cbed9676a40ede39feb3362.
This leads to blank screens on some boards after replugging a
display. Revert until we understand the root cause and can
fix both the leak and the blank screen after replug.
Cc: Stylon Wang <stylon.wang@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Andre Tomt <andre@tomt.net>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 6 Jan 2021 13:56:56 +0000 (14:56 +0100)]
Linux 5.10.5
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210104155708.800470590@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Williams [Sat, 19 Dec 2020 02:41:41 +0000 (18:41 -0800)]
device-dax: Fix range release
[ Upstream commit
6268d7da4d192af339f4d688942b9ccb45a65e04 ]
There are multiple locations that open-code the release of the last
range in a device-dax instance. Consolidate this into a new
dev_dax_trim_range() helper.
This also addresses a kmemleak report:
# cat /sys/kernel/debug/kmemleak
[..]
unreferenced object 0xffff976bd46f6240 (size 64):
comm "ndctl", pid 23556, jiffies
4299514316 (age 5406.733s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 20 c3 37 00 00 00 .......... .7...
ff ff ff 7f 38 00 00 00 00 00 00 00 00 00 00 00 ....8...........
backtrace:
[<
00000000064003cf>] __kmalloc_track_caller+0x136/0x379
[<
00000000d85e3c52>] krealloc+0x67/0x92
[<
00000000d7d3ba8a>] __alloc_dev_dax_range+0x73/0x25c
[<
0000000027d58626>] devm_create_dev_dax+0x27d/0x416
[<
00000000434abd43>] __dax_pmem_probe+0x1c9/0x1000 [dax_pmem_core]
[<
0000000083726c1c>] dax_pmem_probe+0x10/0x1f [dax_pmem]
[<
00000000b5f2319c>] nvdimm_bus_probe+0x9d/0x340 [libnvdimm]
[<
00000000c055e544>] really_probe+0x230/0x48d
[<
000000006cabd38e>] driver_probe_device+0x122/0x13b
[<
0000000029c7b95a>] device_driver_attach+0x5b/0x60
[<
0000000053e5659b>] bind_store+0xb7/0xc3
[<
00000000d3bdaadc>] drv_attr_store+0x27/0x31
[<
00000000949069c5>] sysfs_kf_write+0x4a/0x57
[<
000000004a8b5adf>] kernfs_fop_write+0x150/0x1e5
[<
00000000bded60f0>] __vfs_write+0x1b/0x34
[<
00000000b92900f0>] vfs_write+0xd8/0x1d1
Reported-by: Jane Chu <jane.chu@oracle.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/160834570161.1791850.14911670304441510419.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Chunguang Xu [Fri, 4 Dec 2020 03:05:43 +0000 (11:05 +0800)]
ext4: avoid s_mb_prefetch to be zero in individual scenarios
[ Upstream commit
82ef1370b0c1757ab4ce29f34c52b4e93839b0aa ]
Commit
cfd732377221 ("ext4: add prefetching for block allocation
bitmaps") introduced block bitmap prefetch, and expects to read block
bitmaps of flex_bg through an IO. However, it seems to ignore the
value range of s_log_groups_per_flex. In the scenario where the value
of s_log_groups_per_flex is greater than 27, s_mb_prefetch or
s_mb_prefetch_limit will overflow, cause a divide zero exception.
In addition, the logic of calculating nr is also flawed, because the
size of flexbg is fixed during a single mount, but s_mb_prefetch can
be modified, which causes nr to fail to meet the value condition of
[1, flexbg_size].
To solve this problem, we need to set the upper limit of
s_mb_prefetch. Since we expect to load block bitmaps of a flex_bg
through an IO, we can consider determining a reasonable upper limit
among the IO limit parameters. After consideration, we chose
BLK_MAX_SEGMENT_SIZE. This is a good choice to solve divide zero
problem and avoiding performance degradation.
[ Some minor code simplifications to make the changes easy to follow -- TYT ]
Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Reviewed-by: Samuel Liao <samuelliao@tencent.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/1607051143-24508-1-git-send-email-brookxu@tencent.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hyeongseok Kim [Thu, 3 Dec 2020 00:46:59 +0000 (09:46 +0900)]
dm verity: skip verity work if I/O error when system is shutting down
[ Upstream commit
252bd1256396cebc6fc3526127fdb0b317601318 ]
If emergency system shutdown is called, like by thermal shutdown,
a dm device could be alive when the block device couldn't process
I/O requests anymore. In this state, the handling of I/O errors
by new dm I/O requests or by those already in-flight can lead to
a verity corruption state, which is a misjudgment.
So, skip verity work in response to I/O error when system is shutting
down.
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Takashi Iwai [Fri, 18 Dec 2020 14:56:25 +0000 (15:56 +0100)]
ALSA: pcm: Clear the full allocated memory at hw_params
[ Upstream commit
618de0f4ef11acd8cf26902e65493d46cc20cc89 ]
The PCM hw_params core function tries to clear up the PCM buffer
before actually using for avoiding the information leak from the
previous usages or the usage before a new allocation. It performs the
memset() with runtime->dma_bytes, but this might still leave some
remaining bytes untouched; namely, the PCM buffer size is aligned in
page size for mmap, hence runtime->dma_bytes doesn't necessarily cover
all PCM buffer pages, and the remaining bytes are exposed via mmap.
This patch changes the memory clearance to cover the all buffer pages
if the stream is supposed to be mmap-ready (that guarantees that the
buffer size is aligned in page size).
Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20201218145625.2045-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Pavel Begunkov [Thu, 17 Dec 2020 00:24:36 +0000 (00:24 +0000)]
io_uring: remove racy overflow list fast checks
[ Upstream commit
9cd2be519d05ee78876d55e8e902b7125f78b74f ]
list_empty_careful() is not racy only if some conditions are met, i.e.
no re-adds after del_init. io_cqring_overflow_flush() does list_move(),
so it's actually racy.
Remove those checks, we have ->cq_check_overflow for the fast path.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Heiko Carstens [Fri, 4 Dec 2020 16:56:57 +0000 (17:56 +0100)]
s390: always clear kernel stack backchain before calling functions
[ Upstream commit
9365965db0c7ca7fc81eee27c21d8522d7102c32 ]
Clear the kernel stack backchain before potentially calling the
lockdep trace_hardirqs_off/on functions. Without this walking the
kernel backchain, e.g. during a panic, might stop too early.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Thomas Gleixner [Sun, 6 Dec 2020 21:12:55 +0000 (22:12 +0100)]
tick/sched: Remove bogus boot "safety" check
[ Upstream commit
ba8ea8e7dd6e1662e34e730eadfc52aa6816f9dd ]
can_stop_idle_tick() checks whether the do_timer() duty has been taken over
by a CPU on boot. That's silly because the boot CPU always takes over with
the initial clockevent device.
But even if no CPU would have installed a clockevent and taken over the
duty then the question whether the tick on the current CPU can be stopped
or not is moot. In that case the current CPU would have no clockevent
either, so there would be nothing to keep ticking.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20201206212002.725238293@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jake Wang [Thu, 3 Dec 2020 19:05:56 +0000 (14:05 -0500)]
drm/amd/display: updated wm table for Renoir
[ Upstream commit
410066d24cfc1071be25e402510367aca9db5cb6 ]
[Why]
For certain timings, Renoir may underflow due to sr exit
latency being too slow.
[How]
Updated wm table for renoir.
Signed-off-by: Jake Wang <haonan.wang2@amd.com>
Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jeff Layton [Thu, 12 Nov 2020 14:37:59 +0000 (09:37 -0500)]
ceph: fix inode refcount leak when ceph_fill_inode on non-I_NEW inode fails
[ Upstream commit
68cbb8056a4c24c6a38ad2b79e0a9764b235e8fa ]
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Trond Myklebust [Tue, 8 Dec 2020 12:51:29 +0000 (07:51 -0500)]
NFSv4.2: Don't error when exiting early on a READ_PLUS buffer overflow
[ Upstream commit
503b934a752f7e789a5f33217520e0a79f3096ac ]
Expanding the READ_PLUS extents can cause the read buffer to overflow.
If it does, then don't error, but just exit early.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Gabriel Krisman Bertazi [Sun, 22 Nov 2020 04:13:56 +0000 (23:13 -0500)]
um: ubd: Submit all data segments atomically
[ Upstream commit
fc6b6a872dcd48c6f39c7975836d75113db67d37 ]
Internally, UBD treats each physical IO segment as a separate command to
be submitted in the execution pipe. If the pipe returns a transient
error after a few segments have already been written, UBD will tell the
block layer to requeue the request, but there is no way to reclaim the
segments already submitted. When a new attempt to dispatch the request
is done, those segments already submitted will get duplicated, causing
the WARN_ON below in the best case, and potentially data corruption.
In my system, running a UML instance with 2GB of RAM and a 50M UBD disk,
I can reproduce the WARN_ON by simply running mkfs.fvat against the
disk on a freshly booted system.
There are a few ways to around this, like reducing the pressure on
the pipe by reducing the queue depth, which almost eliminates the
occurrence of the problem, increasing the pipe buffer size on the host
system, or by limiting the request to one physical segment, which causes
the block layer to submit way more requests to resolve a single
operation.
Instead, this patch modifies the format of a UBD command, such that all
segments are sent through a single element in the communication pipe,
turning the command submission atomic from the point of view of the
block layer. The new format has a variable size, depending on the
number of elements, and looks like this:
+------------+-----------+-----------+------------
| cmd_header | segment 0 | segment 1 | segment ...
+------------+-----------+-----------+------------
With this format, we push a pointer to cmd_header in the submission
pipe.
This has the advantage of reducing the memory footprint of executing a
single request, since it allow us to merge some fields in the header.
It is possible to reduce even further each segment memory footprint, by
merging bitmap_words and cow_offset, for instance, but this is not the
focus of this patch and is left as future work. One issue with the
patch is that for a big number of segments, we now perform one big
memory allocation instead of multiple small ones, but I wasn't able to
trigger any real issues or -ENOMEM because of this change, that wouldn't
be reproduced otherwise.
This was tested using fio with the verify-crc32 option, and by running
an ext4 filesystem over this UBD device.
The original WARN_ON was:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0x13f/0x141
refcount_t: underflow; use-after-free.
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc6-00002-g2a5bb2cf75c8 #346
Stack:
6084eed0 6063dc77 00000009 6084ef60
00000000 604b8d9f 6084eee0 6063dcbc
6084ef40 6006ab8d e013d780 1c00000000
Call Trace:
[<
600a0c1c>] ? printk+0x0/0x94
[<
6004a888>] show_stack+0x13b/0x155
[<
6063dc77>] ? dump_stack_print_info+0xdf/0xe8
[<
604b8d9f>] ? refcount_warn_saturate+0x13f/0x141
[<
6063dcbc>] dump_stack+0x2a/0x2c
[<
6006ab8d>] __warn+0x107/0x134
[<
6008da6c>] ? wake_up_process+0x17/0x19
[<
60487628>] ? blk_queue_max_discard_sectors+0x0/0xd
[<
6006b05f>] warn_slowpath_fmt+0xd1/0xdf
[<
6006af8e>] ? warn_slowpath_fmt+0x0/0xdf
[<
600acc14>] ? raw_read_seqcount_begin.constprop.0+0x0/0x15
[<
600619ae>] ? os_nsecs+0x1d/0x2b
[<
604b8d9f>] refcount_warn_saturate+0x13f/0x141
[<
6048bc8f>] refcount_sub_and_test.constprop.0+0x2f/0x37
[<
6048c8de>] blk_mq_free_request+0xf1/0x10d
[<
6048ca06>] __blk_mq_end_request+0x10c/0x114
[<
6005ac0f>] ubd_intr+0xb5/0x169
[<
600a1a37>] __handle_irq_event_percpu+0x6b/0x17e
[<
600a1b70>] handle_irq_event_percpu+0x26/0x69
[<
600a1bd9>] handle_irq_event+0x26/0x34
[<
600a1bb3>] ? handle_irq_event+0x0/0x34
[<
600a5186>] ? unmask_irq+0x0/0x37
[<
600a57e6>] handle_edge_irq+0xbc/0xd6
[<
600a131a>] generic_handle_irq+0x21/0x29
[<
60048f6e>] do_IRQ+0x39/0x54
[...]
---[ end trace
c6e7444e55386c0f ]---
Cc: Christopher Obbard <chris.obbard@collabora.com>
Reported-by: Martyn Welch <martyn@collabora.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Tested-by: Christopher Obbard <chris.obbard@collabora.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Christopher Obbard [Tue, 27 Oct 2020 15:30:22 +0000 (15:30 +0000)]
um: random: Register random as hwrng-core device
[ Upstream commit
72d3e093afae79611fa38f8f2cfab9a888fe66f2 ]
The UML random driver creates a dummy device under the guest,
/dev/hw_random. When this file is read from the guest, the driver
reads from the host machine's /dev/random, in-turn reading from
the host kernel's entropy pool. This entropy pool could have been
filled by a hardware random number generator or just the host
kernel's internal software entropy generator.
Currently the driver does not fill the guests kernel entropy pool,
this requires a userspace tool running inside the guest (like
rng-tools) to read from the dummy device provided by this driver,
which then would fill the guest's internal entropy pool.
This all seems quite pointless when we are already reading from an
entropy pool, so this patch aims to register the device as a hwrng
device using the hwrng-core framework. This not only improves and
cleans up the driver, but also fills the guest's entropy pool
without having to resort to using extra userspace tools in the guest.
This is typically a nuisance when booting a guest: the random pool
takes a long time (~200s) to build up enough entropy since the dummy
hwrng is not used to fill the guest's pool.
This port was originally attempted by Alexander Neville "dark" (in CC,
discussion in Link), but the conversation there stalled since the
handling of -EAGAIN errors were no removed and longer handled by the
driver. This patch attempts to use the existing method of error
handling but utilises the new hwrng core.
The issue can be noticed when booting a UML guest:
[ 2.560000] random: fast init done
[ 214.000000] random: crng init done
With the patch applied, filling the pool becomes a lot quicker:
[ 2.560000] random: fast init done
[ 12.000000] random: crng init done
Cc: Alexander Neville <dark@volatile.bz>
Link: https://lore.kernel.org/lkml/20190828204609.02a7ff70@TheDarkness/
Link: https://lore.kernel.org/lkml/20190829135001.6a5ff940@TheDarkness.local/
Cc: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Signed-off-by: Christopher Obbard <chris.obbard@collabora.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>