Martin Schiller [Wed, 21 Apr 2021 05:50:47 +0000 (07:50 +0200)]
net: phy: intel-xway: enable integrated led functions
The Intel xway phys offer the possibility to deactivate the integrated
LED function and to control the LEDs manually.
If this was set by the bootloader, it must be ensured that the
integrated LED function is enabled for all LEDs when loading the driver.
Before commit
6e2d85ec0559 ("net: phy: Stop with excessive soft reset")
the LEDs were enabled by a soft-reset of the PHY (using
genphy_soft_reset). Initialize the XWAY_MDIO_LED with it's default
value (which is applied during a soft reset) instead of adding back
the soft reset. This brings back the default LED configuration while
still preventing an excessive amount of soft resets.
Fixes:
6e2d85ec0559 ("net: phy: Stop with excessive soft reset")
Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yoshihiro Shimoda [Wed, 21 Apr 2021 04:52:46 +0000 (13:52 +0900)]
net: renesas: ravb: Fix a stuck issue when a lot of frames are received
When a lot of frames were received in the short term, the driver
caused a stuck of receiving until a new frame was received. For example,
the following command from other device could cause this issue.
$ sudo ping -f -l 1000 -c 1000 <this driver's ipaddress>
The previous code always cleared the interrupt flag of RX but checks
the interrupt flags in ravb_poll(). So, ravb_poll() could not call
ravb_rx() in the next time until a new RX frame was received if
ravb_rx() returned true. To fix the issue, always calls ravb_rx()
regardless the interrupt flags condition.
Fixes:
c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ong Boon Leong [Wed, 21 Apr 2021 09:11:49 +0000 (17:11 +0800)]
net: stmmac: fix TSO and TBS feature enabling during driver open
TSO and TBS cannot co-exist and current implementation requires two
fixes:
1) stmmac_open() does not need to call stmmac_enable_tbs() because
the MAC is reset in stmmac_init_dma_engine() anyway.
2) Inside stmmac_hw_setup(), we should call stmmac_enable_tso() for
TX Q that is _not_ configured for TBS.
Fixes:
579a25a854d4 ("net: stmmac: Initial support for TBS")
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yinjun Zhang [Wed, 21 Apr 2021 09:24:15 +0000 (11:24 +0200)]
nfp: devlink: initialize the devlink port attribute "lanes"
The number of lanes of devlink port should be correctly initialized
when registering the port, so that the input check when running
"devlink port split <port> count <N>" can pass.
Fixes:
a21cf0a8330b ("devlink: Add a new devlink port lanes attribute and pass to netlink")
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Apr 2021 17:18:09 +0000 (10:18 -0700)]
Merge tag 'wireless-drivers-2021-04-21' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for v5.12
As there was -rc8 release, one more important fix for v5.12.
iwlwifi
* fix spinlock warning in gen2 devices
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 20 Apr 2021 17:16:14 +0000 (18:16 +0100)]
net: davinci_emac: Fix incorrect masking of tx and rx error channel
The bit-masks used for the TXERRCH and RXERRCH (tx and rx error channels)
are incorrect and always lead to a zero result. The mask values are
currently the incorrect post-right shifted values, fix this by setting
them to the currect values.
(I double checked these against the TMS320TCI6482 data sheet, section
5.30, page 127 to ensure I had the correct mask values for the TXERRCH
and RXERRCH fields in the MACSTATUS register).
Addresses-Coverity: ("Operands don't affect result")
Fixes:
a6286ee630f6 ("net: Add TI DaVinci EMAC driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadym Kochan [Tue, 20 Apr 2021 13:31:51 +0000 (16:31 +0300)]
net: marvell: prestera: fix port event handling on init
For some reason there might be a crash during ports creation if port
events are handling at the same time because fw may send initial
port event with down state.
The crash points to cancel_delayed_work() which is called when port went
is down. Currently I did not find out the real cause of the issue, so
fixed it by cancel port stats work only if previous port's state was up
& runnig.
The following is the crash which can be triggered:
[ 28.311104] Unable to handle kernel paging request at virtual address
000071775f776600
[ 28.319097] Mem abort info:
[ 28.321914] ESR = 0x96000004
[ 28.324996] EC = 0x25: DABT (current EL), IL = 32 bits
[ 28.330350] SET = 0, FnV = 0
[ 28.333430] EA = 0, S1PTW = 0
[ 28.336597] Data abort info:
[ 28.339499] ISV = 0, ISS = 0x00000004
[ 28.343362] CM = 0, WnR = 0
[ 28.346354] user pgtable: 4k pages, 48-bit VAs, pgdp=
0000000100bf7000
[ 28.352842] [
000071775f776600] pgd=
0000000000000000,
p4d=
0000000000000000
[ 28.359695] Internal error: Oops:
96000004 [#1] PREEMPT SMP
[ 28.365310] Modules linked in: prestera_pci(+) prestera
uio_pdrv_genirq
[ 28.372005] CPU: 0 PID: 1291 Comm: kworker/0:1H Not tainted
5.11.0-rc4 #1
[ 28.378846] Hardware name: DNI AmazonGo1 A7040 board (DT)
[ 28.384283] Workqueue: prestera_fw_wq prestera_fw_evt_work_fn
[prestera_pci]
[ 28.391413] pstate:
60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
[ 28.397468] pc : get_work_pool+0x48/0x60
[ 28.401442] lr : try_to_grab_pending+0x6c/0x1b0
[ 28.406018] sp :
ffff80001391bc60
[ 28.409358] x29:
ffff80001391bc60 x28:
0000000000000000
[ 28.414725] x27:
ffff000104fc8b40 x26:
ffff80001127de88
[ 28.420089] x25:
0000000000000000 x24:
ffff000106119760
[ 28.425452] x23:
ffff00010775dd60 x22:
ffff00010567e000
[ 28.430814] x21:
0000000000000000 x20:
ffff80001391bcb0
[ 28.436175] x19:
ffff00010775deb8 x18:
00000000000000c0
[ 28.441537] x17:
0000000000000000 x16:
000000008d9b0e88
[ 28.446898] x15:
0000000000000001 x14:
00000000000002ba
[ 28.452261] x13:
80a3002c00000002 x12:
00000000000005f4
[ 28.457622] x11:
0000000000000030 x10:
000000000000000c
[ 28.462985] x9 :
000000000000000c x8 :
0000000000000030
[ 28.468346] x7 :
ffff800014400000 x6 :
ffff000106119758
[ 28.473708] x5 :
0000000000000003 x4 :
ffff00010775dc60
[ 28.479068] x3 :
0000000000000000 x2 :
0000000000000060
[ 28.484429] x1 :
000071775f776600 x0 :
ffff00010775deb8
[ 28.489791] Call trace:
[ 28.492259] get_work_pool+0x48/0x60
[ 28.495874] cancel_delayed_work+0x38/0xb0
[ 28.500011] prestera_port_handle_event+0x90/0xa0 [prestera]
[ 28.505743] prestera_evt_recv+0x98/0xe0 [prestera]
[ 28.510683] prestera_fw_evt_work_fn+0x180/0x228 [prestera_pci]
[ 28.516660] process_one_work+0x1e8/0x360
[ 28.520710] worker_thread+0x44/0x480
[ 28.524412] kthread+0x154/0x160
[ 28.527670] ret_from_fork+0x10/0x38
[ 28.531290] Code:
a8c17bfd d50323bf d65f03c0 9278dc21 (
f9400020)
[ 28.537429] ---[ end trace
5eced933df3a080b ]---
Fixes:
501ef3066c89 ("net: marvell: prestera: Add driver for Prestera family ASIC devices")
Signed-off-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Tue, 20 Apr 2021 11:07:27 +0000 (13:07 +0200)]
vsock/virtio: free queued packets when closing socket
As reported by syzbot [1], there is a memory leak while closing the
socket. We partially solved this issue with commit
ac03046ece2b
("vsock/virtio: free packets during the socket release"), but we
forgot to drain the RX queue when the socket is definitely closed by
the scheduled work.
To avoid future issues, let's use the new virtio_transport_remove_sock()
to drain the RX queue before removing the socket from the af_vsock lists
calling vsock_remove_sock().
[1] https://syzkaller.appspot.com/bug?extid=
24452624fc4c571eedd9
Fixes:
ac03046ece2b ("vsock/virtio: free packets during the socket release")
Reported-and-tested-by: syzbot+24452624fc4c571eedd9@syzkaller.appspotmail.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Apr 2021 00:03:53 +0000 (17:03 -0700)]
Merge branch 'sfc-txq-lookups'
Edward Cree says:
====================
sfc: fix TXQ lookups
The TXQ handling changes in
12804793b17c ("sfc: decouple TXQ type from label")
which were made as part of the support for encap offloads on EF10 caused some
breakage on Siena (5000- and 6000-series) NICs, which caused null-dereference
kernel panics.
This series fixes those issues, and also a similarly incorrect code-path on
EF10 which worked by chance.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Tue, 20 Apr 2021 12:29:35 +0000 (13:29 +0100)]
sfc: ef10: fix TX queue lookup in TX event handling
We're starting from a TXQ label, not a TXQ type, so
efx_channel_get_tx_queue() is inappropriate. This worked by chance,
because labels and types currently match on EF10, but we shouldn't
rely on that.
Fixes:
12804793b17c ("sfc: decouple TXQ type from label")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Tue, 20 Apr 2021 12:28:28 +0000 (13:28 +0100)]
sfc: farch: fix TX queue lookup in TX event handling
We're starting from a TXQ label, not a TXQ type, so
efx_channel_get_tx_queue() is inappropriate (and could return NULL,
leading to panics).
Fixes:
12804793b17c ("sfc: decouple TXQ type from label")
Cc: stable@vger.kernel.org
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Tue, 20 Apr 2021 12:27:22 +0000 (13:27 +0100)]
sfc: farch: fix TX queue lookup in TX flush done handling
We're starting from a TXQ instance number ('qid'), not a TXQ type, so
efx_get_tx_queue() is inappropriate (and could return NULL, leading
to panics).
Fixes:
12804793b17c ("sfc: decouple TXQ type from label")
Reported-by: Trevor Hemsley <themsley@voiceflex.com>
Cc: stable@vger.kernel.org
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lijun Pan [Mon, 19 Apr 2021 21:25:25 +0000 (16:25 -0500)]
MAINTAINERS: update
I am making this change again since I received the following instruction.
"As an IBM employee, you are not allowed to use your gmail account to work
in any way on VNIC. You are not allowed to use your personal email account
as a "hobby". You are an IBM employee 100% of the time.
Please remove yourself completely from the maintainers file.
I grant you a 1 time exception on contributions to VNIC to make this
change."
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Di Zhu [Mon, 19 Apr 2021 13:56:41 +0000 (21:56 +0800)]
net: fix a data race when get vlan device
We encountered a crash: in the packet receiving process, we got an
illegal VLAN device address, but the VLAN device address saved in vmcore
is correct. After checking the code, we found a possible data
competition:
CPU 0: CPU 1:
(RCU read lock) (RTNL lock)
vlan_do_receive() register_vlan_dev()
vlan_find_dev()
->__vlan_group_get_device() ->vlan_group_prealloc_vid()
In vlan_group_prealloc_vid(), We need to make sure that memset()
in kzalloc() is executed before assigning value to vlan devices array:
=================================
kzalloc()
->memset(object, 0, size)
smp_wmb()
vg->vlan_devices_arrays[pidx][vidx] = array;
==================================
Because __vlan_group_get_device() function depends on this order.
otherwise we may get a wrong address from the hardware cache on
another cpu.
So fix it by adding memory barrier instruction to ensure the order
of memory operations.
Signed-off-by: Di Zhu <zhudi21@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Lobakin [Mon, 19 Apr 2021 12:53:06 +0000 (12:53 +0000)]
gro: fix napi_gro_frags() Fast GRO breakage due to IP alignment check
Commit
38ec4944b593 ("gro: ensure frag0 meets IP header alignment")
did the right thing, but missed the fact that napi_gro_frags() logics
calls for skb_gro_reset_offset() *before* pulling Ethernet header
to the skb linear space.
That said, the introduced check for frag0 address being aligned to 4
always fails for it as Ethernet header is obviously 14 bytes long,
and in case with NET_IP_ALIGN its start is not aligned to 4.
Fix this by adding @nhoff argument to skb_gro_reset_offset() which
tells if an IP header is placed right at the start of frag0 or not.
This restores Fast GRO for napi_gro_frags() that became very slow
after the mentioned commit, and preserves the introduced check to
avoid silent unaligned accesses.
From v1 [0]:
- inline tiny skb_gro_reset_offset() to let the code be optimized
more efficively (esp. for the !NET_IP_ALIGN case) (Eric);
- pull in Reviewed-by from Eric.
[0] https://lore.kernel.org/netdev/
20210418114200.5839-1-alobakin@pm.me
Fixes:
38ec4944b593 ("gro: ensure frag0 meets IP header alignment")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Sun, 18 Apr 2021 18:28:53 +0000 (20:28 +0200)]
net: ethernet: ixp4xx: Set the DMA masks explicitly
The former fix only papered over the actual problem: the
ethernet core expects the netdev .dev member to have the
proper DMA masks set, or there will be BUG_ON() triggered
in kernel/dma/mapping.c.
Fix this by simply copying dma_mask and dma_mask_coherent
from the parent device.
Fixes:
e45d0fad4a5f ("net: ethernet: ixp4xx: Use parent dev for DMA pool")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Du Cheng [Fri, 16 Apr 2021 23:30:46 +0000 (07:30 +0800)]
net: sched: tapr: prevent cycle_time == 0 in parse_taprio_schedule
There is a reproducible sequence from the userland that will trigger a WARN_ON()
condition in taprio_get_start_time, which causes kernel to panic if configured
as "panic_on_warn". Catch this condition in parse_taprio_schedule to
prevent this condition.
Reported as bug on syzkaller:
https://syzkaller.appspot.com/bug?extid=
d50710fd0873a9c6b40c
Reported-by: syzbot+d50710fd0873a9c6b40c@syzkaller.appspotmail.com
Signed-off-by: Du Cheng <ducheng2@gmail.com>
Acked-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Fri, 16 Apr 2021 10:44:16 +0000 (12:44 +0200)]
vsock/vmci: log once the failed queue pair allocation
VMCI feature is not supported in conjunction with the vSphere Fault
Tolerance (FT) feature.
VMware Tools can repeatedly try to create a vsock connection. If FT is
enabled the kernel logs is flooded with the following messages:
qp_alloc_hypercall result = -20
Could not attach to queue pair with -20
"qp_alloc_hypercall result = -20" was hidden by commit
e8266c4c3307
("VMCI: Stop log spew when qp allocation isn't possible"), but "Could
not attach to queue pair with -20" is still there flooding the log.
Since the error message can be useful in some cases, print it only once.
Fixes:
d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Kosina [Sat, 17 Apr 2021 09:13:39 +0000 (11:13 +0200)]
iwlwifi: Fix softirq/hardirq disabling in iwl_pcie_gen2_enqueue_hcmd()
Analogically to what we did in
2800aadc18a6 ("iwlwifi: Fix softirq/hardirq
disabling in iwl_pcie_enqueue_hcmd()"), we must apply the same fix to
iwl_pcie_gen2_enqueue_hcmd(), as it's being called from exactly the same
contexts.
Reported-by: Heiner Kallweit <hkallweit1@gmail.com
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/nycvar.YFH.7.76.2104171112390.18270@cbobk.fhfr.pm
Linus Torvalds [Sat, 17 Apr 2021 16:57:15 +0000 (09:57 -0700)]
Merge tag 'net-5.12-rc8' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.12-rc8, including fixes from netfilter, and
bpf. BPF verifier changes stand out, otherwise things have slowed
down.
Current release - regressions:
- gro: ensure frag0 meets IP header alignment
- Revert "net: stmmac: re-init rx buffers when mac resume back"
- ethernet: macb: fix the restore of cmp registers
Previous releases - regressions:
- ixgbe: Fix NULL pointer dereference in ethtool loopback test
- ixgbe: fix unbalanced device enable/disable in suspend/resume
- phy: marvell: fix detection of PHY on Topaz switches
- make tcp_allowed_congestion_control readonly in non-init netns
- xen-netback: Check for hotplug-status existence before watching
Previous releases - always broken:
- bpf: mitigate a speculative oob read of up to map value size by
tightening the masking window
- sctp: fix race condition in sctp_destroy_sock
- sit, ip6_tunnel: Unregister catch-all devices
- netfilter: nftables: clone set element expression template
- netfilter: flowtable: fix NAT IPv6 offload mangling
- net: geneve: check skb is large enough for IPv4/IPv6 header
- netlink: don't call ->netlink_bind with table lock held"
* tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
netlink: don't call ->netlink_bind with table lock held
MAINTAINERS: update my email
bpf: Update selftests to reflect new error states
bpf: Tighten speculative pointer arithmetic mask
bpf: Move sanitize_val_alu out of op switch
bpf: Refactor and streamline bounds check into helper
bpf: Improve verifier error messages for users
bpf: Rework ptr_limit into alu_limit and add common error path
bpf: Ensure off_reg has no mixed signed bounds for all types
bpf: Move off_reg into sanitize_ptr_alu
bpf: Use correct permission flag for mixed signed bounds arithmetic
ch_ktls: do not send snd_una update to TCB in middle
ch_ktls: tcb close causes tls connection failure
ch_ktls: fix device connection close
ch_ktls: Fix kernel panic
i40e: fix the panic when running bpf in xdpdrv mode
net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta
net/mlx5e: Fix setting of RS FEC mode
net/mlx5: Fix setting of devlink traps in switchdev mode
Revert "net: stmmac: re-init rx buffers when mac resume back"
...
Linus Torvalds [Sat, 17 Apr 2021 16:40:44 +0000 (09:40 -0700)]
Merge tag 'libnvdimm-fixes-for-5.12-rc8' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"The largest change is for a regression that landed during -rc1 for
block-device read-only handling. Vaibhav found a new use for the
ability (originally introduced by virtio_pmem) to call back to the
platform to flush data, but also found an original bug in that
implementation. Lastly, Arnd cleans up some compile warnings in dax.
This has all appeared in -next with no reported issues.
Summary:
- Fix a regression of read-only handling in the pmem driver
- Fix a compile warning
- Fix support for platform cache flush commands on powerpc/papr"
* tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm/region: Fix nvdimm_has_flush() to handle ND_REGION_ASYNC
libnvdimm: Notify disk drivers to revalidate region read-only
dax: avoid -Wempty-body warnings
Linus Torvalds [Sat, 17 Apr 2021 16:30:58 +0000 (09:30 -0700)]
Merge tag 'cxl-fixes-for-5.12-rc8' of git://git./linux/kernel/git/cxl/cxl
Pull CXL memory class fixes from Dan Williams:
"A collection of fixes for the CXL memory class driver introduced in
this release cycle.
The driver was primarily developed on a work-in-progress QEMU
emulation of the interface and we have since found a couple places
where it hid spec compliance bugs in the driver, or had a spec
implementation bug itself.
The biggest change here is replacing a percpu_ref with an rwsem to
cleanup a couple bugs in the error unwind path during ioctl device
init. Lastly there were some minor cleanups to not export the
power-management sysfs-ABI for the ioctl device, use the proper sysfs
helper for emitting values, and prevent subtle bugs as new
administration commands are added to the supported list.
The bulk of it has appeared in -next save for the top commit which was
found today and validated on a fixed-up QEMU model.
Summary:
- Fix support for CXL memory devices with registers offset from the
BAR base.
- Fix the reporting of device capacity.
- Fix the driver commands list definition to be disconnected from the
UAPI command list.
- Replace percpu_ref with rwsem to fix initialization error path.
- Fix leaks in the driver initialization error path.
- Drop the power/ directory from CXL device sysfs.
- Use the recommended sysfs helper for attribute 'show'
implementations"
* tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/mem: Fix memory device capacity probing
cxl/mem: Fix register block offset calculation
cxl/mem: Force array size of mem_commands[] to CXL_MEM_COMMAND_ID_MAX
cxl/mem: Disable cxl device power management
cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures
cxl/mem: Fix synchronization mechanism for device removal vs ioctl operations
cxl/mem: Use sysfs_emit() for attribute show routines
Linus Torvalds [Sat, 17 Apr 2021 15:38:23 +0000 (08:38 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"12 patches.
Subsystems affected by this patch series: mm (documentation, kasan,
and pagemap), csky, ia64, gcov, and lib"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
lib: remove "expecting prototype" kernel-doc warnings
gcov: clang: fix clang-11+ build
mm: ptdump: fix build failure
mm/mapping_dirty_helpers: guard hugepage pud's usage
ia64: tools: remove duplicate definition of ia64_mf() on ia64
ia64: tools: remove inclusion of ia64-specific version of errno.h header
ia64: fix discontig.c section mismatches
ia64: remove duplicate entries in generic_defconfig
csky: change a Kconfig symbol name to fix e1000 build error
kasan: remove redundant config option
kasan: fix hwasan build for gcc
mm: eliminate "expecting prototype" kernel-doc warnings
Dan Williams [Sat, 17 Apr 2021 00:43:30 +0000 (17:43 -0700)]
cxl/mem: Fix memory device capacity probing
The CXL Identify Memory Device output payload emits capacity in 256MB
units. The driver is treating the capacity field as bytes. This was
missed because QEMU reports bytes when it should report bytes / 256MB.
Fixes:
8adaf747c9f0 ("cxl/mem: Find device capabilities")
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/161862021044.3259705.7008520073059739760.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Florian Westphal [Fri, 16 Apr 2021 19:29:13 +0000 (21:29 +0200)]
netlink: don't call ->netlink_bind with table lock held
When I added support to allow generic netlink multicast groups to be
restricted to subscribers with CAP_NET_ADMIN I was unaware that a
genl_bind implementation already existed in the past.
It was reverted due to ABBA deadlock:
1. ->netlink_bind gets called with the table lock held.
2. genetlink bind callback is invoked, it grabs the genl lock.
But when a new genl subsystem is (un)registered, these two locks are
taken in reverse order.
One solution would be to revert again and add a comment in genl
referring
1e82a62fec613, "genetlink: remove genl_bind").
This would need a second change in mptcp to not expose the raw token
value anymore, e.g. by hashing the token with a secret key so userspace
can still associate subflow events with the correct mptcp connection.
However, Paolo Abeni reminded me to double-check why the netlink table is
locked in the first place.
I can't find one. netlink_bind() is already called without this lock
when userspace joins a group via NETLINK_ADD_MEMBERSHIP setsockopt.
Same holds for the netlink_unbind operation.
Digging through the history, commit
f773608026ee1
("netlink: access nlk groups safely in netlink bind and getname")
expanded the lock scope.
commit
3a20773beeeeade ("net: netlink: cap max groups which will be considered in netlink_bind()")
... removed the nlk->ngroups access that the lock scope
extension was all about.
Reduce the lock scope again and always call ->netlink_bind without
the table lock.
The Fixes tag should be vs. the patch mentioned in the link below,
but that one got squash-merged into the patch that came earlier in the
series.
Fixes:
4d54cc32112d8d ("mptcp: avoid lock_fast usage in accept path")
Link: https://lore.kernel.org/mptcp/20210213000001.379332-8-mathew.j.martineau@linux.intel.com/T/#u
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Sean Tranchetti <stranche@codeaurora.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 16 Apr 2021 23:18:53 +0000 (16:18 -0700)]
Merge tag 'io_uring-5.12-2021-04-16' of git://git.kernel.dk/linux-block
Pull io_uring fix from Jens Axboe:
"Fix for a potential hang at exit with SQPOLL from Pavel"
* tag 'io_uring-5.12-2021-04-16' of git://git.kernel.dk/linux-block:
io_uring: fix early sqd_list removal sqpoll hangs
Randy Dunlap [Fri, 16 Apr 2021 22:46:26 +0000 (15:46 -0700)]
lib: remove "expecting prototype" kernel-doc warnings
Fix various kernel-doc warnings in lib/ due to missing or erroneous
function names.
Add kernel-doc for some function parameters that was missing. Use
kernel-doc "Return:" notation in earlycpio.c.
Quietens the following warnings:
lib/earlycpio.c:61: warning: expecting prototype for cpio_data find_cpio_data(). Prototype was for find_cpio_data() instead
lib/lru_cache.c:640: warning: expecting prototype for lc_dump(). Prototype was for lc_seq_dump_details() instead
lru_cache.c:90: warning: Function parameter or member 'cache' not described in 'lc_create'
lib/parman.c:368: warning: expecting prototype for parman_item_del(). Prototype was for parman_item_remove() instead
parman.c:309: warning: Excess function parameter 'prority' description in 'parman_prio_init'
lib/radix-tree.c:703: warning: expecting prototype for __radix_tree_insert(). Prototype was for radix_tree_insert() instead
radix-tree.c:180: warning: Excess function parameter 'addr' description in 'radix_tree_find_next_bit'
radix-tree.c:180: warning: Excess function parameter 'size' description in 'radix_tree_find_next_bit'
radix-tree.c:931: warning: Function parameter or member 'iter' not described in 'radix_tree_iter_replace'
Link: https://lkml.kernel.org/r/20210411221756.15461-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jiri Pirko <jiri@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Berg [Fri, 16 Apr 2021 22:46:23 +0000 (15:46 -0700)]
gcov: clang: fix clang-11+ build
With clang-11+, the code is broken due to my kvmalloc() conversion
(which predated the clang-11 support code) leaving one vmalloc() in
place. Fix that.
Link: https://lkml.kernel.org/r/20210412214210.6e1ecca9cdc5.I24459763acf0591d5e6b31c7e3a59890d802f79c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christophe Leroy [Fri, 16 Apr 2021 22:46:20 +0000 (15:46 -0700)]
mm: ptdump: fix build failure
READ_ONCE() cannot be used for reading PTEs. Use ptep_get() instead, to
avoid the following errors:
CC mm/ptdump.o
In file included from <command-line>:
mm/ptdump.c: In function 'ptdump_pte_entry':
include/linux/compiler_types.h:320:38: error: call to '__compiletime_assert_207' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
320 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:301:4: note: in definition of macro '__compiletime_assert'
301 | prefix ## suffix(); \
| ^~~~~~
include/linux/compiler_types.h:320:2: note: in expansion of macro '_compiletime_assert'
320 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:2: note: in expansion of macro 'compiletime_assert'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:49:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
49 | compiletime_assert_rwonce_type(x); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/ptdump.c:114:14: note: in expansion of macro 'READ_ONCE'
114 | pte_t val = READ_ONCE(*pte);
| ^~~~~~~~~
make[2]: *** [mm/ptdump.o] Error 1
See commit
481e980a7c19 ("mm: Allow arches to provide ptep_get()") and
commit
c0e1c8c22beb ("powerpc/8xx: Provide ptep_get() with 16k pages")
for details.
Link: https://lkml.kernel.org/r/912b349e2bcaa88939904815ca0af945740c6bd4.1618478922.git.christophe.leroy@csgroup.eu
Fixes:
30d621f6723b ("mm: add generic ptdump")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Steven Price <steven.price@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zack Rusin [Fri, 16 Apr 2021 22:46:18 +0000 (15:46 -0700)]
mm/mapping_dirty_helpers: guard hugepage pud's usage
Mapping dirty helpers have, so far, been only used on X86, but a port of
vmwgfx to ARM64 exposed a problem which results in a compilation error
on ARM64 systems:
mm/mapping_dirty_helpers.c: In function `wp_clean_pud_entry':
mm/mapping_dirty_helpers.c:172:32: error: implicit declaration of function `pud_dirty'; did you mean `pmd_dirty'? [-Werror=implicit-function-declaration]
This is due to the fact that mapping_dirty_helpers code assumes that
pud_dirty is always defined, which is not the case for architectures
that don't define CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
ARM64 arch is a little inconsistent when it comes to PUD hugepage
helpers, e.g. it defines pud_young but not pud_dirty but regardless of
that the core kernel code shouldn't assume that any of the PUD hugepage
helpers are available unless CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
is defined. This prevents compilation errors whenever one of the
drivers is ported to new architectures.
Link: https://lkml.kernel.org/r/20210409165151.694574-1-zackr@vmware.com
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Thomas Hellstrm (Intel) <thomas_os@shipmail.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
John Paul Adrian Glaubitz [Fri, 16 Apr 2021 22:46:15 +0000 (15:46 -0700)]
ia64: tools: remove duplicate definition of ia64_mf() on ia64
The ia64_mf() macro defined in tools/arch/ia64/include/asm/barrier.h is
already defined in <asm/gcc_intrin.h> on ia64 which causes libbpf
failing to build:
CC /usr/src/linux/tools/bpf/bpftool//libbpf/staticobjs/libbpf.o
In file included from /usr/src/linux/tools/include/asm/barrier.h:24,
from /usr/src/linux/tools/include/linux/ring_buffer.h:4,
from libbpf.c:37:
/usr/src/linux/tools/include/asm/../../arch/ia64/include/asm/barrier.h:43: error: "ia64_mf" redefined [-Werror]
43 | #define ia64_mf() asm volatile ("mf" ::: "memory")
|
In file included from /usr/include/ia64-linux-gnu/asm/intrinsics.h:20,
from /usr/include/ia64-linux-gnu/asm/swab.h:11,
from /usr/include/linux/swab.h:8,
from /usr/include/linux/byteorder/little_endian.h:13,
from /usr/include/ia64-linux-gnu/asm/byteorder.h:5,
from /usr/src/linux/tools/include/uapi/linux/perf_event.h:20,
from libbpf.c:36:
/usr/include/ia64-linux-gnu/asm/gcc_intrin.h:382: note: this is the location of the previous definition
382 | #define ia64_mf() __asm__ volatile ("mf" ::: "memory")
|
cc1: all warnings being treated as errors
Thus, remove the definition from tools/arch/ia64/include/asm/barrier.h.
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
John Paul Adrian Glaubitz [Fri, 16 Apr 2021 22:46:12 +0000 (15:46 -0700)]
ia64: tools: remove inclusion of ia64-specific version of errno.h header
There is no longer an ia64-specific version of the errno.h header below
arch/ia64/include/uapi/asm/, so trying to build tools/bpf fails with:
CC /usr/src/linux/tools/bpf/bpftool/btf_dumper.o
In file included from /usr/src/linux/tools/include/linux/err.h:8,
from btf_dumper.c:11:
/usr/src/linux/tools/include/uapi/asm/errno.h:13:10: fatal error: ../../../arch/ia64/include/uapi/asm/errno.h: No such file or directory
13 | #include "../../../arch/ia64/include/uapi/asm/errno.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
Thus, just remove the inclusion of the ia64-specific errno.h so that the
build will use the generic errno.h header on this target which was used
there anyway as the ia64-specific errno.h was just a wrapper for the
generic header.
Fixes:
c25f867ddd00 ("ia64: remove unneeded uapi asm-generic wrappers")
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Fri, 16 Apr 2021 22:46:09 +0000 (15:46 -0700)]
ia64: fix discontig.c section mismatches
Fix IA64 discontig.c Section mismatch warnings.
When CONFIG_SPARSEMEM=y and CONFIG_MEMORY_HOTPLUG=y, the functions
computer_pernodesize() and scatter_node_data() should not be marked as
__meminit because they are needed after init, on any memory hotplug
event. Also, early_nr_cpus_node() is called by compute_pernodesize(),
so early_nr_cpus_node() cannot be __meminit either.
WARNING: modpost: vmlinux.o(.text.unlikely+0x1612): Section mismatch in reference from the function arch_alloc_nodedata() to the function .meminit.text:compute_pernodesize()
The function arch_alloc_nodedata() references the function __meminit compute_pernodesize().
This is often because arch_alloc_nodedata lacks a __meminit annotation or the annotation of compute_pernodesize is wrong.
WARNING: modpost: vmlinux.o(.text.unlikely+0x1692): Section mismatch in reference from the function arch_refresh_nodedata() to the function .meminit.text:scatter_node_data()
The function arch_refresh_nodedata() references the function __meminit scatter_node_data().
This is often because arch_refresh_nodedata lacks a __meminit annotation or the annotation of scatter_node_data is wrong.
WARNING: modpost: vmlinux.o(.text.unlikely+0x1502): Section mismatch in reference from the function compute_pernodesize() to the function .meminit.text:early_nr_cpus_node()
The function compute_pernodesize() references the function __meminit early_nr_cpus_node().
This is often because compute_pernodesize lacks a __meminit annotation or the annotation of early_nr_cpus_node is wrong.
Link: https://lkml.kernel.org/r/20210411001201.3069-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Fri, 16 Apr 2021 22:46:06 +0000 (15:46 -0700)]
ia64: remove duplicate entries in generic_defconfig
Fix ia64 generic_defconfig duplicate entries, as warned by:
arch/ia64/configs/generic_defconfig: warning: override: reassigning to symbol ATA: => 58
arch/ia64/configs/generic_defconfig: warning: override: reassigning to symbol ATA_PIIX: => 59
These 2 symbols still have the same value as in the removed lines.
Link: https://lkml.kernel.org/r/20210411020255.18052-1-rdunlap@infradead.org
Fixes:
c331649e6371 ("ia64: Use libata instead of the legacy ide driver in defconfigs")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Fri, 16 Apr 2021 22:46:03 +0000 (15:46 -0700)]
csky: change a Kconfig symbol name to fix e1000 build error
e1000's #define of CONFIG_RAM_BASE conflicts with a Kconfig symbol in
arch/csky/Kconfig.
The symbol in e1000 has been around longer, so change arch/csky/ to use
DRAM_BASE instead of RAM_BASE to remove the conflict. (although e1000
is also a 2-line change)
Link: https://lkml.kernel.org/r/20210411055335.7111-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Guo Ren <guoren@kernel.org>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Walter Wu [Fri, 16 Apr 2021 22:46:00 +0000 (15:46 -0700)]
kasan: remove redundant config option
CONFIG_KASAN_STACK and CONFIG_KASAN_STACK_ENABLE both enable KASAN stack
instrumentation, but we should only need one config, so that we remove
CONFIG_KASAN_STACK_ENABLE and make CONFIG_KASAN_STACK workable. see [1].
When enable KASAN stack instrumentation, then for gcc we could do no
prompt and default value y, and for clang prompt and default value n.
This patch fixes the following compilation warning:
include/linux/kasan.h:333:30: warning: 'CONFIG_KASAN_STACK' is not defined, evaluates to 0 [-Wundef]
[akpm@linux-foundation.org: fix merge snafu]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210221
Link: https://lkml.kernel.org/r/20210226012531.29231-1-walter-zh.wu@mediatek.com
Fixes:
d9b571c885a8 ("kasan: fix KASAN_STACK dependency for HW_TAGS")
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 16 Apr 2021 22:45:57 +0000 (15:45 -0700)]
kasan: fix hwasan build for gcc
gcc-11 adds support for -fsanitize=kernel-hwaddress, so it becomes
possible to enable CONFIG_KASAN_SW_TAGS.
Unfortunately this fails to build at the moment, because the
corresponding command line arguments use llvm specific syntax.
Change it to use the cc-param macro instead, which works on both clang
and gcc.
[elver@google.com: fixup for "kasan: fix hwasan build for gcc"]
Link: https://lkml.kernel.org/r/YHQZVfVVLE/LDK2v@elver.google.com
Link: https://lkml.kernel.org/r/20210323124112.1229772-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Fri, 16 Apr 2021 22:45:54 +0000 (15:45 -0700)]
mm: eliminate "expecting prototype" kernel-doc warnings
Fix stray kernel-doc warnings in mm/ due to mis-typed or missing function
names.
Quietens these kernel-doc warnings:
mm/mmu_gather.c:264: warning: expecting prototype for tlb_gather_mmu(). Prototype was for __tlb_gather_mmu() instead
mm/oom_kill.c:180: warning: expecting prototype for Check whether unreclaimable slab amount is greater than(). Prototype was for should_dump_unreclaim_slab() instead
mm/shuffle.c:155: warning: expecting prototype for shuffle_free_memory(). Prototype was for __shuffle_free_memory() instead
Link: https://lkml.kernel.org/r/20210411210642.11362-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Fri, 16 Apr 2021 22:48:08 +0000 (15:48 -0700)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2021-04-17
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 9 day(s) which contain
a total of 8 files changed, 175 insertions(+), 111 deletions(-).
The main changes are:
1) Fix a potential NULL pointer dereference in libbpf's xsk
umem handling, from Ciara Loftus.
2) Mitigate a speculative oob read of up to map value size by
tightening the masking window, from Daniel Borkmann.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Lijun Pan [Fri, 16 Apr 2021 04:18:13 +0000 (23:18 -0500)]
MAINTAINERS: update my email
Update my email and change myself to Reviewer.
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 24 Mar 2021 13:52:31 +0000 (14:52 +0100)]
bpf: Update selftests to reflect new error states
Update various selftest error messages:
* The 'Rx tried to sub from different maps, paths, or prohibited types'
is reworked into more specific/differentiated error messages for better
guidance.
* The change into 'value -
4294967168 makes map_value pointer be out of
bounds' is due to moving the mixed bounds check into the speculation
handling and thus occuring slightly later than above mentioned sanity
check.
* The change into 'math between map_value pointer and register with
unbounded min value' is similarly due to register sanity check coming
before the mixed bounds check.
* The case of 'map access: known scalar += value_ptr from different maps'
now loads fine given masks are the same from the different paths (despite
max map value size being different).
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Wed, 24 Mar 2021 09:38:26 +0000 (10:38 +0100)]
bpf: Tighten speculative pointer arithmetic mask
This work tightens the offset mask we use for unprivileged pointer arithmetic
in order to mitigate a corner case reported by Piotr and Benedict where in
the speculative domain it is possible to advance, for example, the map value
pointer by up to value_size-1 out-of-bounds in order to leak kernel memory
via side-channel to user space.
Before this change, the computed ptr_limit for retrieve_ptr_limit() helper
represents largest valid distance when moving pointer to the right or left
which is then fed as aux->alu_limit to generate masking instructions against
the offset register. After the change, the derived aux->alu_limit represents
the largest potential value of the offset register which we mask against which
is just a narrower subset of the former limit.
For minimal complexity, we call sanitize_ptr_alu() from 2 observation points
in adjust_ptr_min_max_vals(), that is, before and after the simulated alu
operation. In the first step, we retieve the alu_state and alu_limit before
the operation as well as we branch-off a verifier path and push it to the
verification stack as we did before which checks the dst_reg under truncation,
in other words, when the speculative domain would attempt to move the pointer
out-of-bounds.
In the second step, we retrieve the new alu_limit and calculate the absolute
distance between both. Moreover, we commit the alu_state and final alu_limit
via update_alu_sanitation_state() to the env's instruction aux data, and bail
out from there if there is a mismatch due to coming from different verification
paths with different states.
Reported-by: Piotr Krysiuk <piotras@gmail.com>
Reported-by: Benedict Schlueter <benedict.schlueter@rub.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Benedict Schlueter <benedict.schlueter@rub.de>
Daniel Borkmann [Wed, 24 Mar 2021 10:25:39 +0000 (11:25 +0100)]
bpf: Move sanitize_val_alu out of op switch
Add a small sanitize_needed() helper function and move sanitize_val_alu()
out of the main opcode switch. In upcoming work, we'll move sanitize_ptr_alu()
as well out of its opcode switch so this helps to streamline both.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Tue, 23 Mar 2021 14:05:48 +0000 (15:05 +0100)]
bpf: Refactor and streamline bounds check into helper
Move the bounds check in adjust_ptr_min_max_vals() into a small helper named
sanitize_check_bounds() in order to simplify the former a bit.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Tue, 23 Mar 2021 08:30:01 +0000 (09:30 +0100)]
bpf: Improve verifier error messages for users
Consolidate all error handling and provide more user-friendly error messages
from sanitize_ptr_alu() and sanitize_val_alu().
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Tue, 23 Mar 2021 08:04:10 +0000 (09:04 +0100)]
bpf: Rework ptr_limit into alu_limit and add common error path
Small refactor with no semantic changes in order to consolidate the max
ptr_limit boundary check.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Tue, 23 Mar 2021 07:51:02 +0000 (08:51 +0100)]
bpf: Ensure off_reg has no mixed signed bounds for all types
The mixed signed bounds check really belongs into retrieve_ptr_limit()
instead of outside of it in adjust_ptr_min_max_vals(). The reason is
that this check is not tied to PTR_TO_MAP_VALUE only, but to all pointer
types that we handle in retrieve_ptr_limit() and given errors from the latter
propagate back to adjust_ptr_min_max_vals() and lead to rejection of the
program, it's a better place to reside to avoid anything slipping through
for future types. The reason why we must reject such off_reg is that we
otherwise would not be able to derive a mask, see details in
9d7eceede769
("bpf: restrict unknown scalars of mixed signed bounds for unprivileged").
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Mon, 22 Mar 2021 14:45:52 +0000 (15:45 +0100)]
bpf: Move off_reg into sanitize_ptr_alu
Small refactor to drag off_reg into sanitize_ptr_alu(), so we later on can
use off_reg for generalizing some of the checks for all pointer types.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Tue, 23 Mar 2021 07:32:59 +0000 (08:32 +0100)]
bpf: Use correct permission flag for mixed signed bounds arithmetic
We forbid adding unknown scalars with mixed signed bounds due to the
spectre v1 masking mitigation. Hence this also needs bypass_spec_v1
flag instead of allow_ptr_leaks.
Fixes:
2c78ee898d8f ("bpf: Implement CAP_BPF")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Linus Torvalds [Fri, 16 Apr 2021 17:05:42 +0000 (10:05 -0700)]
Merge tag 'riscv-for-linus-5.12-rc8' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
"A handful of fixes:
- a fix to properly select SPARSEMEM_STATIC on rv32
- a few fixes to kprobes
I don't generally like sending stuff this late, but these all seem
pretty safe"
* tag 'riscv-for-linus-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: keep interrupts disabled for BREAKPOINT exception
riscv: kprobes/ftrace: Add recursion protection to the ftrace callback
riscv: add do_page_fault and do_trap_break into the kprobes blacklist
riscv: Fix spelling mistake "SPARSEMEM" to "SPARSMEM"
Linus Torvalds [Fri, 16 Apr 2021 16:45:30 +0000 (09:45 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Fix kernel compilation when using the LLVM integrated assembly.
A recent commit (
2decad92f473, "arm64: mte: Ensure TIF_MTE_ASYNC_FAULT
is set atomically") broke the kernel build when using the LLVM
integrated assembly (only noticeable with clang-12 as MTE is not
supported by earlier versions and the code in question not compiled).
The Fixes: tag in the commit refers to the original patch introducing
subsections for the alternative code sequences"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: alternatives: Move length validation in alternative_{insn, endif}
Linus Torvalds [Fri, 16 Apr 2021 14:49:04 +0000 (07:49 -0700)]
Merge tag 'drm-fixes-2021-04-16' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Daniel Vetter:
"I pinged the usual suspects, only intel fixes pending"
* tag 'drm-fixes-2021-04-16' of git://anongit.freedesktop.org/drm/drm:
drm/i915/display/vlv_dsi: Do not skip panel_pwr_cycle_delay when disabling the panel
drm/i915: Don't zero out the Y plane's watermarks
drm/i915/dpcd_bl: Don't try vesa interface unless specified by VBT
Jisheng Zhang [Mon, 29 Mar 2021 18:16:24 +0000 (02:16 +0800)]
riscv: keep interrupts disabled for BREAKPOINT exception
Current riscv's kprobe handlers are run with both preemption and
interrupt enabled, this violates kprobe requirements. Fix this issue
by keeping interrupts disabled for BREAKPOINT exception.
Fixes:
c22b0bcb1dd0 ("riscv: Add kprobes supported")
Cc: stable@vger.kernel.org
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
[Palmer: add a comment]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Jisheng Zhang [Mon, 29 Mar 2021 18:14:40 +0000 (02:14 +0800)]
riscv: kprobes/ftrace: Add recursion protection to the ftrace callback
Currently, the riscv's kprobes(powerred by ftrace) handler is
preemptible. Futher check indicates we miss something similar as the
commit
c536aa1c5b17 ("kprobes/ftrace: Add recursion protection to the
ftrace callback"), so do similar modifications as the commit does.
Fixes:
829adda597fe ("riscv: Add KPROBES_ON_FTRACE supported")
Cc: stable@vger.kernel.org
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Jisheng Zhang [Mon, 29 Mar 2021 18:12:26 +0000 (02:12 +0800)]
riscv: add do_page_fault and do_trap_break into the kprobes blacklist
These two functions are used to implement the kprobes feature so they
can't be kprobed.
Fixes:
c22b0bcb1dd0 ("riscv: Add kprobes supported")
Cc: stable@vger.kernel.org
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Kefeng Wang [Mon, 29 Mar 2021 03:13:07 +0000 (11:13 +0800)]
riscv: Fix spelling mistake "SPARSEMEM" to "SPARSMEM"
There is a spelling mistake when SPARSEMEM Kconfig copy.
Fixes:
a5406a7ff56e ("riscv: Correct SPARSEMEM configuration")
Cc: stable@vger.kernel.org
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Ben Widawsky [Thu, 15 Apr 2021 23:26:08 +0000 (16:26 -0700)]
cxl/mem: Fix register block offset calculation
The "Register Offset Low" register of a "DVSEC Register Locator"
contains the 64K aligned offset for the registers along with the BAR
indicator and an id. The implementation was treating the "Register Block
Offset Low" field a value rather than as a pre-aligned component of the
64-bit offset. So, just mask, don't mask and shift (FIELD_GET).
The user visible result of this bug is that the driver fails to bind to
the device after none of the required blocks are found.
This was missed earlier because the primary development done in the QEMU
environment only uses 0 offsets, i.e. 0 shifted is still 0.
Fixes:
8adaf747c9f0 ("cxl/mem: Find device capabilities")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/20210415232610.603273-1-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
David S. Miller [Thu, 15 Apr 2021 23:55:49 +0000 (16:55 -0700)]
Merge branch 'ch_tlss-fixes'
Vinay Kumar Yadav says:
====================
chelsio/ch_ktls: chelsio inline tls driver bug fixes
This series of patches fix following bugs in Chelsio inline tls driver.
Patch1: kernel panic.
Patch2: connection close issue.
Patch3: tcb close call issue.
Patch4: unnecessary snd_una update.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vinay Kumar Yadav [Thu, 15 Apr 2021 07:47:48 +0000 (13:17 +0530)]
ch_ktls: do not send snd_una update to TCB in middle
snd_una update should not be done when the same skb is being
sent out.chcr_short_record_handler() sends it again even
though SND_UNA update is already sent for the skb in
chcr_ktls_xmit(), which causes mismatch in un-acked
TCP seq number, later causes problem in sending out
complete record.
Fixes:
429765a149f1 ("chcr: handle partial end part of a record")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vinay Kumar Yadav [Thu, 15 Apr 2021 07:47:47 +0000 (13:17 +0530)]
ch_ktls: tcb close causes tls connection failure
HW doesn't need marking TCB closed. This TCB state change
sometimes causes problem to the new connection which gets
the same tid.
Fixes:
34aba2c45024 ("cxgb4/chcr : Register to tls add and del callback")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vinay Kumar Yadav [Thu, 15 Apr 2021 07:47:46 +0000 (13:17 +0530)]
ch_ktls: fix device connection close
When sge queue is full and chcr_ktls_xmit_wr_complete()
returns failure, skb is not freed if it is not the last tls record in
this skb, causes refcount never gets freed and tls_dev_del()
never gets called on this connection.
Fixes:
5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vinay Kumar Yadav [Thu, 15 Apr 2021 07:47:45 +0000 (13:17 +0530)]
ch_ktls: Fix kernel panic
Taking page refcount is not ideal and causes kernel panic
sometimes. It's better to take tx_ctx lock for the complete
skb transmit, to avoid page cleanup if ACK received in middle.
Fixes:
5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 15 Apr 2021 23:43:29 +0000 (16:43 -0700)]
Merge tag 'mlx5-fixes-2021-04-14' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2021-04-14
This series provides 3 small fixes to mlx5 driver.
Please pull and let me know if there is any problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Xing [Wed, 14 Apr 2021 02:34:28 +0000 (10:34 +0800)]
i40e: fix the panic when running bpf in xdpdrv mode
Fix this panic by adding more rules to calculate the value of @rss_size_max
which could be used in allocating the queues when bpf is loaded, which,
however, could cause the failure and then trigger the NULL pointer of
vsi->rx_rings. Prio to this fix, the machine doesn't care about how many
cpus are online and then allocates 256 queues on the machine with 32 cpus
online actually.
Once the load of bpf begins, the log will go like this "failed to get
tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
failed".
Thus, I attach the key information of the crash-log here.
BUG: unable to handle kernel NULL pointer dereference at
0000000000000000
RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
Call Trace:
[2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
[2160294.717666] dev_xdp_install+0x4f/0x70
[2160294.718036] dev_change_xdp_fd+0x11f/0x230
[2160294.718380] ? dev_disable_lro+0xe0/0xe0
[2160294.718705] do_setlink+0xac7/0xe70
[2160294.719035] ? __nla_parse+0xed/0x120
[2160294.719365] rtnl_newlink+0x73b/0x860
Fixes:
41c445ff0f48 ("i40e: main driver core")
Co-developed-by: Shujin Li <lishujin@kuaishou.com>
Signed-off-by: Shujin Li <lishujin@kuaishou.com>
Signed-off-by: Jason Xing <xingwanli@kuaishou.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 15 Apr 2021 17:53:39 +0000 (10:53 -0700)]
Merge tag 'acpi-5.12-rc8' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Restore the initrd-based ACPI table override functionality broken by
one of the recent fixes"
* tag 'acpi-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: x86: Call acpi_boot_table_init() after acpi_table_upgrade()
Linus Torvalds [Thu, 15 Apr 2021 17:48:51 +0000 (10:48 -0700)]
Merge tag 'gpio-fixes-for-v5.12-rc8' of git://git./linux/kernel/git/brgl/linux
Pull gpio fix from Bartosz Golaszewski:
"A single fix for an older problem with the sysfs interface: do not
allow exporting GPIO lines which were marked invalid by the driver"
* tag 'gpio-fixes-for-v5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: sysfs: Obey valid_mask
Linus Torvalds [Thu, 15 Apr 2021 17:43:18 +0000 (10:43 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
"The changes are all device/driver specific fixes:
- EV_KEY and EV_ABS regression fix for Wacom from Ping Cheng
- BIOS-specific quirk to fix some of the AMD_SFH-based systems, from
Hans de Goede
- other small error handling fixes and device ID additions"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: wacom: set EV_KEY and EV_ABS only for non-HID_GENERIC type of devices
AMD_SFH: Add DMI quirk table for BIOS-es which don't set the activestatus bits
AMD_SFH: Add sensor_mask module parameter
AMD_SFH: Removed unused activecontrolstatus member from the amd_mp2_dev struct
HID: wacom: Assign boolean values to a bool variable
HID cp2112: fix support for multiple gpiochips
HID: alps: fix error return code in alps_input_configured()
HID: asus: Add support for 2021 ASUS N-Key keyboard
HID: google: add don USB id
Nathan Chancellor [Wed, 14 Apr 2021 00:08:04 +0000 (17:08 -0700)]
arm64: alternatives: Move length validation in alternative_{insn, endif}
After commit
2decad92f473 ("arm64: mte: Ensure TIF_MTE_ASYNC_FAULT is
set atomically"), LLVM's integrated assembler fails to build entry.S:
<instantiation>:5:7: error: expected assembly-time absolute expression
.org . - (664b-663b) + (662b-661b)
^
<instantiation>:6:7: error: expected assembly-time absolute expression
.org . - (662b-661b) + (664b-663b)
^
The root cause is LLVM's assembler has a one-pass design, meaning it
cannot figure out these instruction lengths when the .org directive is
outside of the subsection that they are in, which was changed by the
.arch_extension directive added in the above commit.
Apply the same fix from commit
966a0acce2fc ("arm64/alternatives: move
length validation inside the subsection") to the alternative_endif
macro, shuffling the .org directives so that the length validation
happen will always happen in the same subsections. alternative_insn has
not shown any issue yet but it appears that it could have the same issue
in the future so just preemptively change it.
Fixes:
f7b93d42945c ("arm64/alternatives: use subsections for replacement sequences")
Cc: <stable@vger.kernel.org> # 5.8.x
Link: https://github.com/ClangBuiltLinux/linux/issues/1347
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20210414000803.662534-1-nathan@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Thu, 15 Apr 2021 17:23:44 +0000 (10:23 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"Just a few driver fixes here"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elants_i2c - drop zero-checking of ABS_MT_TOUCH_MAJOR resolution
Input: elants_i2c - fix division by zero if firmware reports zero phys size
Input: nspire-keypad - enable interrupts only when opened
Input: i8042 - fix Pegatron C15B ID entry
Input: n64joy - fix return value check in n64joy_probe()
Input: s6sy761 - fix coordinate read bit shift
Daniel Vetter [Thu, 15 Apr 2021 13:24:17 +0000 (15:24 +0200)]
Merge tag 'drm-intel-fixes-2021-04-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
Display panel & power related fixes:
- Backlight fix (Lyude)
- Display watermark fix (Ville)
- VLV panel power fix (Hans)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YHg4nz/ndzDRmPjd@intel.com
wenxu [Fri, 9 Apr 2021 05:33:48 +0000 (13:33 +0800)]
net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta
In the nft_offload there is the mate flow_dissector with no
ingress_ifindex but with ingress_iftype that only be used
in the software. So if the mask of ingress_ifindex in meta is
0, this meta check should be bypass.
Fixes:
6d65bc64e232 ("net/mlx5e: Add mlx5e_flower_parse_meta support")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Aya Levin [Sun, 11 Apr 2021 06:33:12 +0000 (09:33 +0300)]
net/mlx5e: Fix setting of RS FEC mode
Change register setting from bit number to bit mask.
Fixes:
b5ede32d3329 ("net/mlx5e: Add support for FEC modes based on 50G per lane links")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Aya Levin [Mon, 12 Apr 2021 14:50:08 +0000 (17:50 +0300)]
net/mlx5: Fix setting of devlink traps in switchdev mode
Prevent setting of devlink traps on the uplink while in switchdev mode.
In this mode, it is the SW switch responsibility to handle both packets
with a mismatch in destination MAC or VLAN ID. Therefore, there are no
flow steering tables to trap undesirable packets and driver crashes upon
setting a trap.
Fixes:
241dc159391f ("net/mlx5: Notify on trap action by blocking event")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
David S. Miller [Wed, 14 Apr 2021 21:12:12 +0000 (14:12 -0700)]
Merge branch '10GbE' of git://git./linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-04-14
This series contains updates to ixgbe and ice drivers.
Alex Duyck fixes a NULL pointer dereference for ixgbe.
Yongxin Liu fixes an unbalanced enable/disable which was causing a call
trace with suspend for ixgbe.
Colin King fixes a potential infinite loop for ice.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Thierry Reding [Wed, 14 Apr 2021 15:10:07 +0000 (17:10 +0200)]
Revert "net: stmmac: re-init rx buffers when mac resume back"
This reverts commit
9c63faaa931e443e7abbbee9de0169f1d4710546, which
introduces a suspend/resume regression on Jetson TX2 boards that can be
reproduced every time. Given that the issue that this was supposed to
fix only occurs very sporadically the safest course of action is to
revert before v5.12 and then we can have another go at fixing the more
rare issue in the next release (and perhaps backport it if necessary).
The root cause of the observed problem seems to be that when the system
is suspended, some packets are still in transit. When the descriptors
for these buffers are cleared on resume, the descriptors become invalid
and cause a fatal bus error.
Link: https://lore.kernel.org/r/708edb92-a5df-ecc4-3126-5ab36707e275@nvidia.com/
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wan Jiabing [Wed, 14 Apr 2021 11:31:48 +0000 (19:31 +0800)]
cavium/liquidio: Fix duplicate argument
Fix the following coccicheck warning:
./drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h:413:6-28:
duplicated argument to & or |
The CN6XXX_INTR_M1UPB0_ERR here is duplicate.
Here should be CN6XXX_INTR_M1UNB0_ERR.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Beznea [Wed, 14 Apr 2021 11:20:29 +0000 (14:20 +0300)]
net: macb: fix the restore of cmp registers
Commit
a14d273ba159 ("net: macb: restore cmp registers on resume path")
introduces the restore of CMP registers on resume path. In case the IP
doesn't support type 2 screeners (zero on DCFG8 register) the
struct macb::rx_fs_list::list is not initialized and thus the
list_for_each_entry(item, &bp->rx_fs_list.list, list) loop introduced in
commit
a14d273ba159 ("net: macb: restore cmp registers on resume path")
will access an uninitialized list leading to crash. Thus, initialize
the struct macb::rx_fs_list::list without taking into account if the
IP supports type 2 screeners or not.
Fixes:
a14d273ba159 ("net: macb: restore cmp registers on resume path")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 14 Apr 2021 20:23:54 +0000 (13:23 -0700)]
Merge tag 'for-5.12/dm-fixes-3' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mike Snitzer:
"Fix DM verity target FEC support's RS roots IO to always be aligned.
This fixes a previous stable@ fix that overcorrected for a different
configuration that also resulted in misaligned roots IO"
* tag 'for-5.12/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm verity fec: fix misaligned RS roots IO
Nicolas Dichtel [Wed, 14 Apr 2021 10:03:25 +0000 (12:03 +0200)]
vrf: fix a comment about loopback device
This is a leftover of the below commit.
Fixes:
4f04256c983a ("net: vrf: Drop local rtable and rt6_info")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Wed, 14 Apr 2021 10:00:27 +0000 (12:00 +0200)]
doc: move seg6_flowlabel to seg6-sysctl.rst
Let's have all seg6 sysctl at the same place.
Fixes:
a6dc6670cd7e ("ipv6: sr: Add documentation for seg_flowlabel sysctl")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 14 Apr 2021 20:10:58 +0000 (13:10 -0700)]
Merge branch 'ibmvnic-napi-fixes'
Lijun Pan says:
====================
ibmvnic: correctly call NAPI APIs
This series correct some misuse of NAPI APIs in the driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Lijun Pan [Wed, 14 Apr 2021 07:46:16 +0000 (02:46 -0500)]
ibmvnic: remove duplicate napi_schedule call in open function
Remove the unnecessary napi_schedule() call in __ibmvnic_open() since
interrupt_rx() calls napi_schedule_prep/__napi_schedule during every
receive interrupt.
Fixes:
ed651a10875f ("ibmvnic: Updated reset handling")
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lijun Pan [Wed, 14 Apr 2021 07:46:15 +0000 (02:46 -0500)]
ibmvnic: remove duplicate napi_schedule call in do_reset function
During adapter reset, do_reset/do_hard_reset calls ibmvnic_open(),
which will calls napi_schedule if previous state is VNIC_CLOSED
(i.e, the reset case, and "ifconfig down" case). So there is no need
for do_reset to call napi_schedule again at the end of the function
though napi_schedule will neglect the request if napi is already
scheduled.
Fixes:
ed651a10875f ("ibmvnic: Updated reset handling")
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lijun Pan [Wed, 14 Apr 2021 07:46:14 +0000 (02:46 -0500)]
ibmvnic: avoid calling napi_disable() twice
__ibmvnic_open calls napi_disable without checking whether NAPI polling
has already been disabled or not. This could cause napi_disable
being called twice, which could generate deadlock. For example,
the first napi_disable will spin until NAPI_STATE_SCHED is cleared
by napi_complete_done, then set it again.
When napi_disable is called the second time, it will loop infinitely
because no dev->poll will be running to clear NAPI_STATE_SCHED.
To prevent above scenario from happening, call ibmvnic_napi_disable()
which checks if napi is disabled or not before calling napi_disable.
Fixes:
bfc32f297337 ("ibmvnic: Move resource initialization to its own routine")
Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Wed, 14 Apr 2021 08:47:10 +0000 (10:47 +0200)]
r8169: don't advertise pause in jumbo mode
It has been reported [0] that using pause frames in jumbo mode impacts
performance. There's no available chip documentation, but vendor
drivers r8168 and r8125 don't advertise pause in jumbo mode. So let's
do the same, according to Roman it fixes the issue.
[0] https://bugzilla.kernel.org/show_bug.cgi?id=212617
Fixes:
9cf9b84cc701 ("r8169: make use of phy_set_asym_pause")
Reported-by: Roman Mamedov <rm+bko@romanrm.net>
Tested-by: Roman Mamedov <rm+bko@romanrm.net>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 14 Apr 2021 03:46:14 +0000 (20:46 -0700)]
ethtool: pause: make sure we init driver stats
The intention was for pause statistics to not be reported
when driver does not have the relevant callback (only
report an empty netlink nest). What happens currently
we report all 0s instead. Make sure statistics are
initialized to "not set" (which is -1) so the dumping
code skips them.
Fixes:
9a27a33027f2 ("ethtool: add standard pause stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Tue, 13 Apr 2021 10:43:00 +0000 (11:43 +0100)]
io_uring: fix early sqd_list removal sqpoll hangs
[ 245.463317] INFO: task iou-sqp-1374:1377 blocked for more than 122 seconds.
[ 245.463334] task:iou-sqp-1374 state:D flags:0x00004000
[ 245.463345] Call Trace:
[ 245.463352] __schedule+0x36b/0x950
[ 245.463376] schedule+0x68/0xe0
[ 245.463385] __io_uring_cancel+0xfb/0x1a0
[ 245.463407] do_exit+0xc0/0xb40
[ 245.463423] io_sq_thread+0x49b/0x710
[ 245.463445] ret_from_fork+0x22/0x30
It happens when sqpoll forgot to run park_task_work and goes to exit,
then exiting user may remove ctx from sqd_list, and so corresponding
io_sq_thread() -> io_uring_cancel_sqpoll() won't be executed. Hopefully
it just stucks in do_exit() in this case.
Fixes:
dbe1bdbb39db ("io_uring: handle signals for IO threads like a normal thread")
Reported-by: Joakim Hassila <joj@mac.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jaegeuk Kim [Wed, 14 Apr 2021 15:28:28 +0000 (08:28 -0700)]
dm verity fec: fix misaligned RS roots IO
commit
df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to
block size") introduced the possibility for misaligned roots IO
relative to the underlying device's logical block size. E.g. Android's
default RS roots=2 results in dm_bufio->block_size=1024, which causes
the following EIO if the logical block size of the device is 4096,
given v->data_dev_block_bits=12:
E sd 0 : 0:0:0: [sda] tag#30 request not aligned to the logical block size
E blk_update_request: I/O error, dev sda, sector
10368424 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
E device-mapper: verity-fec: 254:8: FEC 9244672: parity read failed (block 18056): -5
Fix this by onlu using f->roots for dm_bufio blocksize IFF it is
aligned to v->data_dev_block_bits.
Fixes:
df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to block size")
Cc: stable@vger.kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Linus Torvalds [Wed, 14 Apr 2021 17:55:56 +0000 (10:55 -0700)]
Merge tag 's390-5.12-7' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- setup stack backchain properly in external and i/o interrupt handler
to fix stack unwinding. This broke when converting to generic entry
- save caller address of psw_idle to get a sane stacktrace
* tag 's390-5.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/entry: save the caller of psw_idle
s390/entry: avoid setting up backchain in ext|io handlers
Linus Torvalds [Wed, 14 Apr 2021 17:36:03 +0000 (10:36 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
- Fix incorrect asm constraint for load_unaligned_zeropad() fixup
- Fix thread flag update when setting TIF_MTE_ASYNC_FAULT
- Fix restored irq state when handling fault on kprobe
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kprobes: Restore local irqflag if kprobes is cancelled
arm64: mte: Ensure TIF_MTE_ASYNC_FAULT is set atomically
arm64: fix inline asm in load_unaligned_zeropad()
Linus Torvalds [Wed, 14 Apr 2021 16:36:54 +0000 (09:36 -0700)]
Merge tag 'dmaengine-fix-5.12' of git://git./linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
"A couple of dmaengine driver fixes for:
- race and descriptor issue for xilinx driver
- fix interrupt handling, wq state & cleanup, field sizes for
completion, msix permissions for idxd driver
- runtime pm fix for tegra driver
- double free fix in dma_async_device_register"
* tag 'dmaengine-fix-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: idxd: fix wq cleanup of WQCFG registers
dmaengine: idxd: clear MSIX permission entry on shutdown
dmaengine: plx_dma: add a missing put_device() on error path
dmaengine: tegra20: Fix runtime PM imbalance on error
dmaengine: Fix a double free in dma_async_device_register
dmaengine: dw: Make it dependent to HAS_IOMEM
dmaengine: idxd: fix wq size store permission state
dmaengine: idxd: fix opcap sysfs attribute output
dmaengine: idxd: fix delta_rec and crc size field for completion record
dmaengine: idxd: Fix clobbering of SWERR overflow bit on writeback
dmaengine: xilinx: dpdma: Fix race condition in done IRQ
dmaengine: xilinx: dpdma: Fix descriptor issuing on video group
Linus Torvalds [Wed, 14 Apr 2021 16:10:54 +0000 (09:10 -0700)]
Merge tag 'vfio-v5.12-rc8' of git://github.com/awilliam/linux-vfio
Pull VFIO fix from Alex Williamson:
"Verify mmap region within range (Christian A. Ehrhardt)"
* tag 'vfio-v5.12-rc8' of git://github.com/awilliam/linux-vfio:
vfio/pci: Add missing range check in vfio_pci_mmap
Linus Torvalds [Wed, 14 Apr 2021 15:50:46 +0000 (08:50 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fix from Paolo Bonzini:
"Fix for a possible out-of-bounds access"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: VMX: Don't use vcpu->run->internal.ndata as an array index
Linus Torvalds [Wed, 14 Apr 2021 01:40:00 +0000 (18:40 -0700)]
Merge tag 'trace-v5.12-rc7' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Fix a memory link in dyn_event_release().
An error path exited the function before freeing the allocated 'argv'
variable"
* tag 'trace-v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/dynevent: Fix a memory leak in an error handling path
Michael Brown [Tue, 13 Apr 2021 15:25:12 +0000 (16:25 +0100)]
xen-netback: Check for hotplug-status existence before watching
The logic in connect() is currently written with the assumption that
xenbus_watch_pathfmt() will return an error for a node that does not
exist. This assumption is incorrect: xenstore does allow a watch to
be registered for a nonexistent node (and will send notifications
should the node be subsequently created).
As of commit
1f2565780 ("xen-netback: remove 'hotplug-status' once it
has served its purpose"), this leads to a failure when a domU
transitions into XenbusStateConnected more than once. On the first
domU transition into Connected state, the "hotplug-status" node will
be deleted by the hotplug_status_changed() callback in dom0. On the
second or subsequent domU transition into Connected state, the
hotplug_status_changed() callback will therefore never be invoked, and
so the backend will remain stuck in InitWait.
This failure prevents scenarios such as reloading the xen-netfront
module within a domU, or booting a domU via iPXE. There is
unfortunately no way for the domU to work around this dom0 bug.
Fix by explicitly checking for existence of the "hotplug-status" node,
thereby creating the behaviour that was previously assumed to exist.
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reiji Watanabe [Tue, 13 Apr 2021 15:47:40 +0000 (15:47 +0000)]
KVM: VMX: Don't use vcpu->run->internal.ndata as an array index
__vmx_handle_exit() uses vcpu->run->internal.ndata as an index for
an array access. Since vcpu->run is (can be) mapped to a user address
space with a writer permission, the 'ndata' could be updated by the
user process at anytime (the user process can set it to outside the
bounds of the array).
So, it is not safe that __vmx_handle_exit() uses the 'ndata' that way.
Fixes:
1aa561b1a4c0 ("kvm: x86: Add "last CPU" to some KVM_EXIT information")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <
20210413154739.490299-1-reijiw@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Eric Dumazet [Tue, 13 Apr 2021 12:41:35 +0000 (05:41 -0700)]
gro: ensure frag0 meets IP header alignment
After commit
0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Guenter Roeck reported one failure in his tests using sh architecture.
After much debugging, we have been able to spot silent unaligned accesses
in inet_gro_receive()
The issue at hand is that upper networking stacks assume their header
is word-aligned. Low level drivers are supposed to reserve NET_IP_ALIGN
bytes before the Ethernet header to make that happen.
This patch hardens skb_gro_reset_offset() to not allow frag0 fast-path
if the fragment is not properly aligned.
Some arches like x86, arm64 and powerpc do not care and define NET_IP_ALIGN
as 0, this extra check will be a NOP for them.
Note that if frag0 is not used, GRO will call pskb_may_pull()
as many times as needed to pull network and transport headers.
Fixes:
0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Fixes:
78a478d0efd9 ("gro: Inline skb_gro_header and cache frag0 virtual address")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Cohen [Tue, 13 Apr 2021 18:10:31 +0000 (21:10 +0300)]
net/sctp: fix race condition in sctp_destroy_sock
If sctp_destroy_sock is called without sock_net(sk)->sctp.addr_wq_lock
held and sp->do_auto_asconf is true, then an element is removed
from the auto_asconf_splist without any proper locking.
This can happen in the following functions:
1. In sctp_accept, if sctp_sock_migrate fails.
2. In inet_create or inet6_create, if there is a bpf program
attached to BPF_CGROUP_INET_SOCK_CREATE which denies
creation of the sctp socket.
The bug is fixed by acquiring addr_wq_lock in sctp_destroy_sock
instead of sctp_close.
This addresses CVE-2021-23133.
Reported-by: Or Cohen <orcohen@paloaltonetworks.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Fixes:
610236587600 ("bpf: Add new cgroup attach type to enable sock modifications")
Signed-off-by: Or Cohen <orcohen@paloaltonetworks.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lijun Pan [Tue, 13 Apr 2021 08:33:25 +0000 (03:33 -0500)]
ibmvnic: correctly use dev_consume/free_skb_irq
It is more correct to use dev_kfree_skb_irq when packets are dropped,
and to use dev_consume_skb_irq when packets are consumed.
Fixes:
0d973388185d ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls")
Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jonathon Reinhart [Tue, 13 Apr 2021 07:08:48 +0000 (03:08 -0400)]
net: Make tcp_allowed_congestion_control readonly in non-init netns
Currently, tcp_allowed_congestion_control is global and writable;
writing to it in any net namespace will leak into all other net
namespaces.
tcp_available_congestion_control and tcp_allowed_congestion_control are
the only sysctls in ipv4_net_table (the per-netns sysctl table) with a
NULL data pointer; their handlers (proc_tcp_available_congestion_control
and proc_allowed_congestion_control) have no other way of referencing a
struct net. Thus, they operate globally.
Because ipv4_net_table does not use designated initializers, there is no
easy way to fix up this one "bad" table entry. However, the data pointer
updating logic shouldn't be applied to NULL pointers anyway, so we
instead force these entries to be read-only.
These sysctls used to exist in ipv4_table (init-net only), but they were
moved to the per-net ipv4_net_table, presumably without realizing that
tcp_allowed_congestion_control was writable and thus introduced a leak.
Because the intent of that commit was only to know (i.e. read) "which
congestion algorithms are available or allowed", this read-only solution
should be sufficient.
The logic added in recent commit
31c4d2f160eb: ("net: Ensure net namespace isolation of sysctls")
does not and cannot check for NULL data pointers, because
other table entries (e.g. /proc/sys/net/netfilter/nf_log/) have
.data=NULL but use other methods (.extra2) to access the struct net.
Fixes:
9cb8e048e5d9 ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns")
Signed-off-by: Jonathon Reinhart <jonathon.reinhart@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>