Lijun Pan [Wed, 14 Apr 2021 07:46:16 +0000 (02:46 -0500)]
ibmvnic: remove duplicate napi_schedule call in open function
Remove the unnecessary napi_schedule() call in __ibmvnic_open() since
interrupt_rx() calls napi_schedule_prep/__napi_schedule during every
receive interrupt.
Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lijun Pan [Wed, 14 Apr 2021 07:46:15 +0000 (02:46 -0500)]
ibmvnic: remove duplicate napi_schedule call in do_reset function
During adapter reset, do_reset/do_hard_reset calls ibmvnic_open(),
which will calls napi_schedule if previous state is VNIC_CLOSED
(i.e, the reset case, and "ifconfig down" case). So there is no need
for do_reset to call napi_schedule again at the end of the function
though napi_schedule will neglect the request if napi is already
scheduled.
Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lijun Pan [Wed, 14 Apr 2021 07:46:14 +0000 (02:46 -0500)]
ibmvnic: avoid calling napi_disable() twice
__ibmvnic_open calls napi_disable without checking whether NAPI polling
has already been disabled or not. This could cause napi_disable
being called twice, which could generate deadlock. For example,
the first napi_disable will spin until NAPI_STATE_SCHED is cleared
by napi_complete_done, then set it again.
When napi_disable is called the second time, it will loop infinitely
because no dev->poll will be running to clear NAPI_STATE_SCHED.
To prevent above scenario from happening, call ibmvnic_napi_disable()
which checks if napi is disabled or not before calling napi_disable.
Fixes: bfc32f297337 ("ibmvnic: Move resource initialization to its own routine")
Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Wed, 14 Apr 2021 08:47:10 +0000 (10:47 +0200)]
r8169: don't advertise pause in jumbo mode
It has been reported [0] that using pause frames in jumbo mode impacts
performance. There's no available chip documentation, but vendor
drivers r8168 and r8125 don't advertise pause in jumbo mode. So let's
do the same, according to Roman it fixes the issue.
[0] https://bugzilla.kernel.org/show_bug.cgi?id=212617
Fixes: 9cf9b84cc701 ("r8169: make use of phy_set_asym_pause")
Reported-by: Roman Mamedov <rm+bko@romanrm.net>
Tested-by: Roman Mamedov <rm+bko@romanrm.net>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 14 Apr 2021 03:46:14 +0000 (20:46 -0700)]
ethtool: pause: make sure we init driver stats
The intention was for pause statistics to not be reported
when driver does not have the relevant callback (only
report an empty netlink nest). What happens currently
we report all 0s instead. Make sure statistics are
initialized to "not set" (which is -1) so the dumping
code skips them.
Fixes: 9a27a33027f2 ("ethtool: add standard pause stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Brown [Tue, 13 Apr 2021 15:25:12 +0000 (16:25 +0100)]
xen-netback: Check for hotplug-status existence before watching
The logic in connect() is currently written with the assumption that
xenbus_watch_pathfmt() will return an error for a node that does not
exist. This assumption is incorrect: xenstore does allow a watch to
be registered for a nonexistent node (and will send notifications
should the node be subsequently created).
As of commit
1f2565780 ("xen-netback: remove 'hotplug-status' once it
has served its purpose"), this leads to a failure when a domU
transitions into XenbusStateConnected more than once. On the first
domU transition into Connected state, the "hotplug-status" node will
be deleted by the hotplug_status_changed() callback in dom0. On the
second or subsequent domU transition into Connected state, the
hotplug_status_changed() callback will therefore never be invoked, and
so the backend will remain stuck in InitWait.
This failure prevents scenarios such as reloading the xen-netfront
module within a domU, or booting a domU via iPXE. There is
unfortunately no way for the domU to work around this dom0 bug.
Fix by explicitly checking for existence of the "hotplug-status" node,
thereby creating the behaviour that was previously assumed to exist.
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 13 Apr 2021 12:41:35 +0000 (05:41 -0700)]
gro: ensure frag0 meets IP header alignment
After commit
0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Guenter Roeck reported one failure in his tests using sh architecture.
After much debugging, we have been able to spot silent unaligned accesses
in inet_gro_receive()
The issue at hand is that upper networking stacks assume their header
is word-aligned. Low level drivers are supposed to reserve NET_IP_ALIGN
bytes before the Ethernet header to make that happen.
This patch hardens skb_gro_reset_offset() to not allow frag0 fast-path
if the fragment is not properly aligned.
Some arches like x86, arm64 and powerpc do not care and define NET_IP_ALIGN
as 0, this extra check will be a NOP for them.
Note that if frag0 is not used, GRO will call pskb_may_pull()
as many times as needed to pull network and transport headers.
Fixes: 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Fixes: 78a478d0efd9 ("gro: Inline skb_gro_header and cache frag0 virtual address")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Cohen [Tue, 13 Apr 2021 18:10:31 +0000 (21:10 +0300)]
net/sctp: fix race condition in sctp_destroy_sock
If sctp_destroy_sock is called without sock_net(sk)->sctp.addr_wq_lock
held and sp->do_auto_asconf is true, then an element is removed
from the auto_asconf_splist without any proper locking.
This can happen in the following functions:
1. In sctp_accept, if sctp_sock_migrate fails.
2. In inet_create or inet6_create, if there is a bpf program
attached to BPF_CGROUP_INET_SOCK_CREATE which denies
creation of the sctp socket.
The bug is fixed by acquiring addr_wq_lock in sctp_destroy_sock
instead of sctp_close.
This addresses CVE-2021-23133.
Reported-by: Or Cohen <orcohen@paloaltonetworks.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Fixes: 610236587600 ("bpf: Add new cgroup attach type to enable sock modifications")
Signed-off-by: Or Cohen <orcohen@paloaltonetworks.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lijun Pan [Tue, 13 Apr 2021 08:33:25 +0000 (03:33 -0500)]
ibmvnic: correctly use dev_consume/free_skb_irq
It is more correct to use dev_kfree_skb_irq when packets are dropped,
and to use dev_consume_skb_irq when packets are consumed.
Fixes: 0d973388185d ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls")
Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jonathon Reinhart [Tue, 13 Apr 2021 07:08:48 +0000 (03:08 -0400)]
net: Make tcp_allowed_congestion_control readonly in non-init netns
Currently, tcp_allowed_congestion_control is global and writable;
writing to it in any net namespace will leak into all other net
namespaces.
tcp_available_congestion_control and tcp_allowed_congestion_control are
the only sysctls in ipv4_net_table (the per-netns sysctl table) with a
NULL data pointer; their handlers (proc_tcp_available_congestion_control
and proc_allowed_congestion_control) have no other way of referencing a
struct net. Thus, they operate globally.
Because ipv4_net_table does not use designated initializers, there is no
easy way to fix up this one "bad" table entry. However, the data pointer
updating logic shouldn't be applied to NULL pointers anyway, so we
instead force these entries to be read-only.
These sysctls used to exist in ipv4_table (init-net only), but they were
moved to the per-net ipv4_net_table, presumably without realizing that
tcp_allowed_congestion_control was writable and thus introduced a leak.
Because the intent of that commit was only to know (i.e. read) "which
congestion algorithms are available or allowed", this read-only solution
should be sufficient.
The logic added in recent commit
31c4d2f160eb: ("net: Ensure net namespace isolation of sysctls")
does not and cannot check for NULL data pointers, because
other table entries (e.g. /proc/sys/net/netfilter/nf_log/) have
.data=NULL but use other methods (.extra2) to access the struct net.
Fixes: 9cb8e048e5d9 ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns")
Signed-off-by: Jonathon Reinhart <jonathon.reinhart@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 13 Apr 2021 21:31:52 +0000 (14:31 -0700)]
Merge branch 'catch-all-devices'
Hristo Venev says:
====================
net: Fix two use-after-free bugs
The two patches fix two use-after-free bugs related to cleaning up
network namespaces, one in sit and one in ip6_tunnel. They are easy to
trigger if the user has the ability to create network namespaces.
The bugs can be used to trigger null pointer dereferences. I am not
sure if they can be exploited further, but I would guess that they
can. I am not sending them to the mailing list without confirmation
that doing so would be OK.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hristo Venev [Mon, 12 Apr 2021 17:41:17 +0000 (20:41 +0300)]
net: ip6_tunnel: Unregister catch-all devices
Similarly to the sit case, we need to remove the tunnels with no
addresses that have been moved to another network namespace.
Fixes: 0bd8762824e73 ("ip6tnl: add x-netns support")
Signed-off-by: Hristo Venev <hristo@venev.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hristo Venev [Mon, 12 Apr 2021 17:41:16 +0000 (20:41 +0300)]
net: sit: Unregister catch-all devices
A sit interface created without a local or a remote address is linked
into the `sit_net::tunnels_wc` list of its original namespace. When
deleting a network namespace, delete the devices that have been moved.
The following script triggers a null pointer dereference if devices
linked in a deleted `sit_net` remain:
for i in `seq 1 30`; do
ip netns add ns-test
ip netns exec ns-test ip link add dev veth0 type veth peer veth1
ip netns exec ns-test ip link add dev sit$i type sit dev veth0
ip netns exec ns-test ip link set dev sit$i netns $$
ip netns del ns-test
done
for i in `seq 1 30`; do
ip link del dev sit$i
done
Fixes: 5e6700b3bf98f ("sit: add support of x-netns")
Signed-off-by: Hristo Venev <hristo@venev.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 12 Apr 2021 23:17:50 +0000 (16:17 -0700)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Fix NAT IPv6 offload in the flowtable.
2) icmpv6 is printed as unknown in /proc/net/nf_conntrack.
3) Use div64_u64() in nft_limit, from Eric Dumazet.
4) Use pre_exit to unregister ebtables and arptables hooks,
from Florian Westphal.
5) Fix out-of-bound memset in x_tables compat match/target,
also from Florian.
6) Clone set elements expression to ensure proper initialization.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso [Sat, 10 Apr 2021 19:29:38 +0000 (21:29 +0200)]
netfilter: nftables: clone set element expression template
memcpy() breaks when using connlimit in set elements. Use
nft_expr_clone() to initialize the connlimit expression list, otherwise
connlimit garbage collector crashes when walking on the list head copy.
[ 493.064656] Workqueue: events_power_efficient nft_rhash_gc [nf_tables]
[ 493.064685] RIP: 0010:find_or_evict+0x5a/0x90 [nf_conncount]
[ 493.064694] Code: 2b 43 40 83 f8 01 77 0d 48 c7 c0 f5 ff ff ff 44 39 63 3c 75 df 83 6d 18 01 48 8b 43 08 48 89 de 48 8b 13 48 8b 3d ee 2f 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 03 48 83
[ 493.064699] RSP: 0018:
ffffc90000417dc0 EFLAGS:
00010297
[ 493.064704] RAX:
0000000000000000 RBX:
ffff888134f38410 RCX:
0000000000000000
[ 493.064708] RDX:
0000000000000000 RSI:
ffff888134f38410 RDI:
ffff888100060cc0
[ 493.064711] RBP:
ffff88812ce594a8 R08:
ffff888134f38438 R09:
00000000ebb9025c
[ 493.064714] R10:
ffffffff8219f838 R11:
0000000000000017 R12:
0000000000000001
[ 493.064718] R13:
ffffffff82146740 R14:
ffff888134f38410 R15:
0000000000000000
[ 493.064721] FS:
0000000000000000(0000) GS:
ffff88840e440000(0000) knlGS:
0000000000000000
[ 493.064725] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 493.064729] CR2:
0000000000000008 CR3:
00000001330aa002 CR4:
00000000001706e0
[ 493.064733] Call Trace:
[ 493.064737] nf_conncount_gc_list+0x8f/0x150 [nf_conncount]
[ 493.064746] nft_rhash_gc+0x106/0x390 [nf_tables]
Reported-by: Laura Garcia Liebana <nevola@gmail.com>
Fixes: 409444522976 ("netfilter: nf_tables: add elements with stateful expressions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Wed, 7 Apr 2021 19:38:57 +0000 (21:38 +0200)]
netfilter: x_tables: fix compat match/target pad out-of-bound write
xt_compat_match/target_from_user doesn't check that zeroing the area
to start of next rule won't write past end of allocated ruleset blob.
Remove this code and zero the entire blob beforehand.
Reported-by: syzbot+cfc0247ac173f597aaaa@syzkaller.appspotmail.com
Reported-by: Andy Nguyen <theflow@google.com>
Fixes: 9fa492cdc160c ("[NETFILTER]: x_tables: simplify compat API")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jakub Kicinski [Mon, 12 Apr 2021 18:47:07 +0000 (11:47 -0700)]
ethtool: fix kdoc attr name
Add missing 't' in attrtype.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pali Rohár [Mon, 12 Apr 2021 16:57:39 +0000 (18:57 +0200)]
net: phy: marvell: fix detection of PHY on Topaz switches
Since commit
fee2d546414d ("net: phy: marvell: mv88e6390 temperature
sensor reading"), Linux reports the temperature of Topaz hwmon as
constant -75°C.
This is because switches from the Topaz family (
88E6141 /
88E6341) have
the address of the temperature sensor register different from Peridot.
This address is instead compatible with
88E1510 PHYs, as was used for
Topaz before the above mentioned commit.
Create a new mapping table between switch family and PHY ID for families
which don't have a model number. And define PHY IDs for Topaz and Peridot
families.
Create a new PHY ID and a new PHY driver for Topaz's internal PHY.
The only difference from Peridot's PHY driver is the HWMON probing
method.
Prior this change Topaz's internal PHY is detected by kernel as:
PHY [...] driver [Marvell
88E6390] (irq=63)
And afterwards as:
PHY [...] driver [Marvell
88E6341 Family] (irq=63)
Signed-off-by: Pali Rohár <pali@kernel.org>
BugLink: https://github.com/globalscaletechnologies/linux/issues/1
Fixes: fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading")
Reviewed-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phillip Potter [Sun, 11 Apr 2021 11:28:24 +0000 (12:28 +0100)]
net: geneve: check skb is large enough for IPv4/IPv6 header
Check within geneve_xmit_skb/geneve6_xmit_skb that sk_buff structure
is large enough to include IPv4 or IPv6 header, and reject if not. The
geneve_xmit_skb portion and overall idea was contributed by Eric Dumazet.
Fixes a KMSAN-found uninit-value bug reported by syzbot at:
https://syzkaller.appspot.com/bug?id=
abe95dc3e3e9667fc23b8d81f29ecad95c6f106f
Suggested-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+2e406a9ac75bb71d4b7a@syzkaller.appspotmail.com
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe JAILLET [Sun, 11 Apr 2021 09:02:08 +0000 (11:02 +0200)]
net: davicom: Fix regulator not turned off on failed probe
When the probe fails, we must disable the regulator that was previously
enabled.
This patch is a follow-up to commit
ac88c531a5b3
("net: davicom: Fix regulator not turned off on failed probe") which missed
one case.
Fixes: 7994fe55a4a2 ("dm9000: Add regulator and reset support to dm9000")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joakim Zhang [Fri, 9 Apr 2021 09:11:45 +0000 (17:11 +0800)]
MAINTAINERS: update maintainer entry for freescale fec driver
Update maintainer entry for freescale fec driver.
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Wed, 7 Apr 2021 19:43:40 +0000 (21:43 +0200)]
netfilter: arp_tables: add pre_exit hook for table unregister
Same problem that also existed in iptables/ip(6)tables, when
arptable_filter is removed there is no longer a wait period before the
table/ruleset is free'd.
Unregister the hook in pre_exit, then remove the table in the exit
function.
This used to work correctly because the old nf_hook_unregister API
did unconditional synchronize_net.
The per-net hook unregister function uses call_rcu instead.
Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Wed, 7 Apr 2021 19:43:39 +0000 (21:43 +0200)]
netfilter: bridge: add pre_exit hooks for ebtable unregistration
Just like ip/ip6/arptables, the hooks have to be removed, then
synchronize_rcu() has to be called to make sure no more packets are being
processed before the ruleset data is released.
Place the hook unregistration in the pre_exit hook, then call the new
ebtables pre_exit function from there.
Years ago, when first netns support got added for netfilter+ebtables,
this used an older (now removed) netfilter hook unregister API, that did
a unconditional synchronize_rcu().
Now that all is done with call_rcu, ebtable_{filter,nat,broute} pernet exit
handlers may free the ebtable ruleset while packets are still in flight.
This can only happens on module removal, not during netns exit.
The new function expects the table name, not the table struct.
This is because upcoming patch set (targeting -next) will remove all
net->xt.{nat,filter,broute}_table instances, this makes it necessary
to avoid external references to those member variables.
The existing APIs will be converted, so follow the upcoming scheme of
passing name + hook type instead.
Fixes: aee12a0a3727e ("ebtables: remove nf_hook_register usage")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Dumazet [Fri, 9 Apr 2021 15:49:39 +0000 (08:49 -0700)]
netfilter: nft_limit: avoid possible divide error in nft_limit_init
div_u64() divides u64 by u32.
nft_limit_init() wants to divide u64 by u64, use the appropriate
math function (div64_u64)
divide error: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 8390 Comm: syz-executor188 Not tainted 5.12.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:div_u64_rem include/linux/math64.h:28 [inline]
RIP: 0010:div_u64 include/linux/math64.h:127 [inline]
RIP: 0010:nft_limit_init+0x2a2/0x5e0 net/netfilter/nft_limit.c:85
Code: ef 4c 01 eb 41 0f 92 c7 48 89 de e8 38 a5 22 fa 4d 85 ff 0f 85 97 02 00 00 e8 ea 9e 22 fa 4c 0f af f3 45 89 ed 31 d2 4c 89 f0 <49> f7 f5 49 89 c6 e8 d3 9e 22 fa 48 8d 7d 48 48 b8 00 00 00 00 00
RSP: 0018:
ffffc90009447198 EFLAGS:
00010246
RAX:
0000000000000000 RBX:
0000200000000000 RCX:
0000000000000000
RDX:
0000000000000000 RSI:
ffffffff875152e6 RDI:
0000000000000003
RBP:
ffff888020f80908 R08:
0000200000000000 R09:
0000000000000000
R10:
ffffffff875152d8 R11:
0000000000000000 R12:
ffffc90009447270
R13:
0000000000000000 R14:
0000000000000000 R15:
0000000000000000
FS:
000000000097a300(0000) GS:
ffff8880b9d00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000200001c4 CR3:
0000000026a52000 CR4:
00000000001506e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
nf_tables_newexpr net/netfilter/nf_tables_api.c:2675 [inline]
nft_expr_init+0x145/0x2d0 net/netfilter/nf_tables_api.c:2713
nft_set_elem_expr_alloc+0x27/0x280 net/netfilter/nf_tables_api.c:5160
nf_tables_newset+0x1997/0x3150 net/netfilter/nf_tables_api.c:4321
nfnetlink_rcv_batch+0x85a/0x21b0 net/netfilter/nfnetlink.c:456
nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:580 [inline]
nfnetlink_rcv+0x3af/0x420 net/netfilter/nfnetlink.c:598
netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
___sys_sendmsg+0xf3/0x170 net/socket.c:2404
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes: c26844eda9d4 ("netfilter: nf_tables: Fix nft limit burst handling")
Fixes: 3e0f64b7dd31 ("netfilter: nft_limit: fix packet ratelimiting")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Diagnosed-by: Luigi Rizzo <lrizzo@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Linus Torvalds [Fri, 9 Apr 2021 22:26:51 +0000 (15:26 -0700)]
Merge tag 'net-5.12-rc7' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.12-rc7, including fixes from can, ipsec,
mac80211, wireless, and bpf trees.
No scary regressions here or in the works, but small fixes for 5.12
changes keep coming.
Current release - regressions:
- virtio: do not pull payload in skb->head
- virtio: ensure mac header is set in virtio_net_hdr_to_skb()
- Revert "net: correct sk_acceptq_is_full()"
- mptcp: revert "mptcp: provide subflow aware release function"
- ethernet: lan743x: fix ethernet frame cutoff issue
- dsa: fix type was not set for devlink port
- ethtool: remove link_mode param and derive link params from driver
- sched: htb: fix null pointer dereference on a null new_q
- wireless: iwlwifi: Fix softirq/hardirq disabling in
iwl_pcie_enqueue_hcmd()
- wireless: iwlwifi: fw: fix notification wait locking
- wireless: brcmfmac: p2p: Fix deadlock introduced by avoiding the
rtnl dependency
Current release - new code bugs:
- napi: fix hangup on napi_disable for threaded napi
- bpf: take module reference for trampoline in module
- wireless: mt76: mt7921: fix airtime reporting and related tx hangs
- wireless: iwlwifi: mvm: rfi: don't lock mvm->mutex when sending
config command
Previous releases - regressions:
- rfkill: revert back to old userspace API by default
- nfc: fix infinite loop, refcount & memory leaks in LLCP sockets
- let skb_orphan_partial wake-up waiters
- xfrm/compat: Cleanup WARN()s that can be user-triggered
- vxlan, geneve: do not modify the shared tunnel info when PMTU
triggers an ICMP reply
- can: fix msg_namelen values depending on CAN_REQUIRED_SIZE
- can: uapi: mark union inside struct can_frame packed
- sched: cls: fix action overwrite reference counting
- sched: cls: fix err handler in tcf_action_init()
- ethernet: mlxsw: fix ECN marking in tunnel decapsulation
- ethernet: nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx
- ethernet: i40e: fix receiving of single packets in xsk zero-copy
mode
- ethernet: cxgb4: avoid collecting SGE_QBASE regs during traffic
Previous releases - always broken:
- bpf: Refuse non-O_RDWR flags in BPF_OBJ_GET
- bpf: Refcount task stack in bpf_get_task_stack
- bpf, x86: Validate computation of branch displacements
- ieee802154: fix many similar syzbot-found bugs
- fix NULL dereferences in netlink attribute handling
- reject unsupported operations on monitor interfaces
- fix error handling in llsec_key_alloc()
- xfrm: make ipv4 pmtu check honor ip header df
- xfrm: make hash generation lock per network namespace
- xfrm: esp: delete NETIF_F_SCTP_CRC bit from features for esp
offload
- ethtool: fix incorrect datatype in set_eee ops
- xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory
model
- openvswitch: fix send of uninitialized stack memory in ct limit
reply
Misc:
- udp: add get handling for UDP_GRO sockopt"
* tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (182 commits)
net: fix hangup on napi_disable for threaded napi
net: hns3: Trivial spell fix in hns3 driver
lan743x: fix ethernet frame cutoff issue
net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh
net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits
net: dsa: lantiq_gswip: Don't use PHY auto polling
net: sched: sch_teql: fix null-pointer dereference
ipv6: report errors for iftoken via netlink extack
net: sched: fix err handler in tcf_action_init()
net: sched: fix action overwrite reference counting
Revert "net: sched: bump refcount for new action in ACT replace mode"
ice: fix memory leak of aRFS after resuming from suspend
i40e: Fix sparse warning: missing error code 'err'
i40e: Fix sparse error: 'vsi->netdev' could be null
i40e: Fix sparse error: uninitialized symbol 'ring'
i40e: Fix sparse errors in i40e_txrx.c
i40e: Fix parameters in aq_get_phy_register()
nl80211: fix beacon head validation
bpf, x86: Validate computation of branch displacements for x86-32
bpf, x86: Validate computation of branch displacements for x86-64
...
Linus Torvalds [Fri, 9 Apr 2021 22:06:52 +0000 (15:06 -0700)]
Merge tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Two minor fixups for the reissue logic, and one for making sure that
unbounded work is canceled on io-wq exit"
* tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block:
io-wq: cancel unbounded works on io-wq destroy
io_uring: fix rw req completion
io_uring: clear F_REISSUE right after getting it
Linus Torvalds [Fri, 9 Apr 2021 20:01:48 +0000 (13:01 -0700)]
Merge tag 'devicetree-fixes-for-5.12-2' of git://git./linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fix fw_devlink failure with ".*,nr-gpios" properties
- Doc link reference fixes from Mauro
- Fixes for unaligned FDT handling found on OpenRisc. First, avoid
crash with better error handling when unflattening an unaligned FDT.
Second, fix memory allocations for FDTs to ensure alignment.
* tag 'devicetree-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: property: fw_devlink: do not link ".*,nr-gpios"
dt-bindings:iio:adc: update motorola,cpcap-adc.yaml reference
dt-bindings: fix references for iio-bindings.txt
dt-bindings: don't use ../dir for doc references
of: unittest: overlay: ensure proper alignment of copied FDT
of: properly check for error returned by fdt_get_name()
Linus Torvalds [Fri, 9 Apr 2021 19:56:10 +0000 (12:56 -0700)]
Merge tag 'drm-fixes-2021-04-10' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Was relatively quiet this week, but still a few pulls came in, pretty
much small fixes across the board, a couple of regression fixes in the
amdgpu/radeon code, msm has a few minor fixes across the board, a
panel regression fix also.
amdgpu:
- DCN3 fix
- Fix CAC setting regression for TOPAZ
- Fix ttm regression
radeon:
- Fix ttm regression
msm:
- a5xx/a6xx timestamp fix
- microcode version check
- fail path fix
- block programming fix
- error removal fix
i915:
- Fix invalid access to ACPI _DSM objects
xen:
- Fix use-after-free in xen
- minor duplicate defintion cleanup
vc4:
- Reduce fifo threshold on hvs4 to fix a fifo full error
- minor redunantant assignment cleanup
panel:
- Disable TE support for Droid4 and N950"
* tag 'drm-fixes-2021-04-10' of git://anongit.freedesktop.org/drm/drm:
drm/vc4: crtc: Reduce PV fifo threshold on hvs4
drm/vc4: plane: Remove redundant assignment
drm/amdgpu/smu7: fix CAC setting on TOPAZ
drm/radeon: Fix size overflow
drm/amdgpu: Fix size overflow
drm/i915: Fix invalid access to ACPI _DSM objects
drm/amd/display: Add missing mask for DCN3
drm/panel: panel-dsi-cm: disable TE for now
drm/msm/disp/dpu1: program 3d_merge only if block is attached
drm/msm: a6xx: fix version check for the A650 SQE microcode
drm/msm: Fix a5xx/a6xx timestamps
drm/msm: Fix removal of valid error case when checking speed_bin
drm/msm: Set drvdata to NULL when msm_drm_init() fails
drivers: gpu: drm: xen_drm_front_drm_info is declared twice
gpu/xen: Fix a use after free in xen_drm_drv_init
Paolo Abeni [Fri, 9 Apr 2021 15:24:17 +0000 (17:24 +0200)]
net: fix hangup on napi_disable for threaded napi
napi_disable() is subject to an hangup, when the threaded
mode is enabled and the napi is under heavy traffic.
If the relevant napi has been scheduled and the napi_disable()
kicks in before the next napi_threaded_wait() completes - so
that the latter quits due to the napi_disable_pending() condition,
the existing code leaves the NAPI_STATE_SCHED bit set and the
napi_disable() loop waiting for such bit will hang.
This patch addresses the issue by dropping the NAPI_STATE_DISABLE
bit test in napi_thread_wait(). The later napi_threaded_poll()
iteration will take care of clearing the NAPI_STATE_SCHED.
This also addresses a related problem reported by Jakub:
before this patch a napi_disable()/napi_enable() pair killed
the napi thread, effectively disabling the threaded mode.
On the patched kernel napi_disable() simply stops scheduling
the relevant thread.
v1 -> v2:
- let the main napi_thread_poll() loop clear the SCHED bit
Reported-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 29863d41bb6e ("net: implement threaded-able napi poll loop support")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/883923fa22745a9589e8610962b7dc59df09fb1f.1617981844.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Salil Mehta [Fri, 9 Apr 2021 07:42:23 +0000 (08:42 +0100)]
net: hns3: Trivial spell fix in hns3 driver
Some trivial spelling mistakes which caught my eye during the
review of the code.
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Link: https://lore.kernel.org/r/20210409074223.32480-1-salil.mehta@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sven Van Asbroeck [Fri, 9 Apr 2021 00:39:04 +0000 (20:39 -0400)]
lan743x: fix ethernet frame cutoff issue
The ethernet frame length is calculated incorrectly. Depending on
the value of RX_HEAD_PADDING, this may result in ethernet frames
that are too short (cut off at the end), or too long (garbage added
to the end).
Fix by calculating the ethernet frame length correctly. For added
clarity, use the ETH_FCS_LEN constant in the calculation.
Many thanks to Heiner Kallweit for suggesting this solution.
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
Fixes: 3e21a10fdea3 ("lan743x: trim all 4 bytes of the FCS; not just 2")
Link: https://lore.kernel.org/lkml/20210408172353.21143-1-TheSven73@gmail.com/
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: George McCollister <george.mccollister@gmail.com>
Tested-by: George McCollister <george.mccollister@gmail.com>
Link: https://lore.kernel.org/r/20210409003904.8957-1-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ilya Lipnitskiy [Mon, 5 Apr 2021 22:25:40 +0000 (15:25 -0700)]
of: property: fw_devlink: do not link ".*,nr-gpios"
[<vendor>,]nr-gpios property is used by some GPIO drivers[0] to indicate
the number of GPIOs present on a system, not define a GPIO. nr-gpios is
not configured by #gpio-cells and can't be parsed along with other
"*-gpios" properties.
nr-gpios without the "<vendor>," prefix is not allowed by the DT
spec[1], so only add exception for the ",nr-gpios" suffix and let the
error message continue being printed for non-compliant implementations.
[0] nr-gpios is referenced in Documentation/devicetree/bindings/gpio:
- gpio-adnp.txt
- gpio-xgene-sb.txt
- gpio-xlp.txt
- snps,dw-apb-gpio.yaml
Link: https://github.com/devicetree-org/dt-schema/blob/cb53a16a1eb3e2169ce170c071e47940845ec26e/schemas/gpio/gpio-consumer.yaml#L20
Fixes errors such as:
OF: /palmbus@300000/gpio@600: could not find phandle
Fixes: 7f00be96f125 ("of: property: Add device link support for interrupt-parent, dmas and -gpio(s)")
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Cc: Saravana Kannan <saravanak@google.com>
Cc: stable@vger.kernel.org # v5.5+
Link: https://lore.kernel.org/r/20210405222540.18145-1-ilya.lipnitskiy@gmail.com
Signed-off-by: Rob Herring <robh@kernel.org>
Mauro Carvalho Chehab [Fri, 9 Apr 2021 12:47:47 +0000 (14:47 +0200)]
dt-bindings:iio:adc: update motorola,cpcap-adc.yaml reference
Changeset
1ca9d1b1342d ("dt-bindings:iio:adc:motorola,cpcap-adc yaml conversion")
renamed: Documentation/devicetree/bindings/iio/adc/cpcap-adc.txt
to: Documentation/devicetree/bindings/iio/adc/motorola,cpcap-adc.yaml.
Update its cross-reference accordingly.
Fixes: 1ca9d1b1342d ("dt-bindings:iio:adc:motorola,cpcap-adc yaml conversion")
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/3e205e5fa701e4bc15d39d6ac1f57717df2bb4c6.1617972339.git.mchehab+huawei@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Mauro Carvalho Chehab [Fri, 9 Apr 2021 12:47:46 +0000 (14:47 +0200)]
dt-bindings: fix references for iio-bindings.txt
The iio-bindings.txt was converted into two files and merged
at the dt-schema git tree at:
https://github.com/devicetree-org/dt-schema
Yet, some documents still refer to the old file. Fix their
references, in order to point to the right URL.
Fixes: dba91f82d580 ("dt-bindings:iio:iio-binding.txt Drop file as content now in dt-schema")
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/4efd81eca266ca0875d3bf9d1672097444146c69.1617972339.git.mchehab+huawei@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Mauro Carvalho Chehab [Fri, 9 Apr 2021 12:47:45 +0000 (14:47 +0200)]
dt-bindings: don't use ../dir for doc references
As documents have been renamed and moved around, their
references will break, but this will be unnoticed, as the
script which checks for it won't handle "../" references.
So, replace them by the full patch.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/68d3a1244119d1f2829c375b0ef554cf348bc89f.1617972339.git.mchehab+huawei@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Dave Airlie [Fri, 9 Apr 2021 19:18:31 +0000 (05:18 +1000)]
Merge tag 'drm-intel-fixes-2021-04-09' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix invalid access to ACPI _DSM objects (Takashi)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YHAW6NInrybUoat6@intel.com
Dave Airlie [Fri, 9 Apr 2021 19:15:35 +0000 (05:15 +1000)]
Merge tag 'drm-misc-fixes-2021-04-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.12-rc7:
- Fix use-after-free in xen.
- Reduce fifo threshold on hvs4 to fix a fifo full error.
- Disable TE support for Droid4 and N950.
- Small compiler fixes.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e7647dd9-60c3-9dfd-a377-89d717212e13@linux.intel.com
Linus Torvalds [Fri, 9 Apr 2021 18:51:06 +0000 (11:51 -0700)]
Merge tag 'selinux-pr-
20210409' of git://git./linux/kernel/git/pcmoore/selinux
Pull selinux fixes from Paul Moore:
"Three SELinux fixes.
These fix known problems relating to (re)loading SELinux policy or
changing the policy booleans, and pass our test suite without problem"
* tag 'selinux-pr-
20210409' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix race between old and new sidtab
selinux: fix cond_list corruption when changing booleans
selinux: make nslot handling in avtab more robust
Linus Torvalds [Fri, 9 Apr 2021 17:09:51 +0000 (10:09 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull vdpa/mlx5 fixes from Michael Tsirkin:
"Last minute fixes.
These all look like something we are better off having
than not ..."
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa/mlx5: Fix suspend/resume index restoration
vdpa/mlx5: Fix wrong use of bit numbers
vdpa/mlx5: Retrieve BAR address suitable any function
vdpa/mlx5: Use the correct dma device when registering memory
vdpa/mlx5: should exclude header length and fcs from mtu
Linus Torvalds [Fri, 9 Apr 2021 17:05:25 +0000 (10:05 -0700)]
Merge tag 'rproc-v5.12-fixes' of git://git./linux/kernel/git/andersson/remoteproc
Pull remoteproc fixes from Bjorn Andersson:
"This fixes an issue with firmware loading on the TI K3 PRU, fixes
compatibility with GNU binutils for the same and resolves link error
due to a 64-bit division in the Qualcomm PIL info.
It also recognizes Mathieu Poirier as co-maintainer of the remoteproc
and rpmsg subsystems"
* tag 'rproc-v5.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc:
remoteproc: pru: Fix firmware loading crashes on K3 SoCs
remoteproc: pru: Fix loading of GNU Binutils ELF
MAINTAINERS: Add co-maintainer for remoteproc/RPMSG subsystems
remoteproc: qcom: pil_info: avoid 64-bit division
Linus Torvalds [Fri, 9 Apr 2021 16:58:42 +0000 (09:58 -0700)]
Merge tag 'for-linus-5.12b-rc7-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A single fix of a 5.12 patch for the rather uncommon problem of
running as a Xen guest with a real time kernel config"
* tag 'for-linus-5.12b-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/evtchn: Change irq_info lock to raw_spinlock_t
Linus Torvalds [Fri, 9 Apr 2021 16:25:31 +0000 (09:25 -0700)]
Merge tag 'acpi-5.12-rc7' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Fix a build issue introduced by a previous fix in the ACPI processor
driver (Vitaly Kuznetsov)"
* tag 'acpi-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m
Eli Cohen [Thu, 8 Apr 2021 09:10:47 +0000 (12:10 +0300)]
vdpa/mlx5: Fix suspend/resume index restoration
When we suspend the VM, the VDPA interface will be reset. When the VM is
resumed again, clear_virtqueues() will clear the available and used
indices resulting in hardware virqtqueue objects becoming out of sync.
We can avoid this function alltogether since qemu will clear them if
required, e.g. when the VM went through a reboot.
Moreover, since the hw available and used indices should always be
identical on query and should be restored to the same value same value
for virtqueues that complete in order, we set the single value provided
by set_vq_state(). In get_vq_state() we return the value of hardware
used index.
Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210408091047.4269-6-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Eli Cohen [Thu, 8 Apr 2021 09:10:46 +0000 (12:10 +0300)]
vdpa/mlx5: Fix wrong use of bit numbers
VIRTIO_F_VERSION_1 is a bit number. Use BIT_ULL() with mask
conditionals.
Also, in mlx5_vdpa_is_little_endian() use BIT_ULL for consistency with
the rest of the code.
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210408091047.4269-5-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Eli Cohen [Thu, 8 Apr 2021 09:10:45 +0000 (12:10 +0300)]
vdpa/mlx5: Retrieve BAR address suitable any function
struct mlx5_core_dev has a bar_addr field that contains the correct bar
address for the function regardless of whether it is pci function or sub
function. Use it.
Fixes: 1958fc2f0712 ("net/mlx5: SF, Add auxiliary device driver")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Link: https://lore.kernel.org/r/20210408091047.4269-4-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Eli Cohen [Thu, 8 Apr 2021 09:10:44 +0000 (12:10 +0300)]
vdpa/mlx5: Use the correct dma device when registering memory
In cases where the vdpa instance uses a SF (sub function), the DMA
device is the parent device. Use a function to retrieve the correct DMA
device.
Fixes: 1958fc2f0712 ("net/mlx5: SF, Add auxiliary device driver")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Link: https://lore.kernel.org/r/20210408091047.4269-3-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Si-Wei Liu [Thu, 8 Apr 2021 09:10:43 +0000 (12:10 +0300)]
vdpa/mlx5: should exclude header length and fcs from mtu
When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa,
22 extra bytes worth of MTU length is shown in guest.
This is because the mlx5_query_port_max_mtu API returns
the "hardware" MTU value, which does not just contain the
Ethernet payload, but includes extra lengths starting
from the Ethernet header up to the FCS altogether.
Fix the MTU so packets won't get dropped silently.
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210408091047.4269-2-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Hans de Goede [Fri, 9 Apr 2021 13:58:50 +0000 (15:58 +0200)]
Bluetooth: btusb: Revert Fix the autosuspend enable and disable
drivers/usb/core/hub.c: usb_new_device() contains the following:
/* By default, forbid autosuspend for all devices. It will be
* allowed for hubs during binding.
*/
usb_disable_autosuspend(udev);
So for anything which is not a hub, such as btusb devices, autosuspend is
disabled by default and we must call usb_enable_autosuspend(udev) to
enable it.
This means that the "Fix the autosuspend enable and disable" commit,
which drops the usb_enable_autosuspend() call when the enable_autosuspend
module option is true, is completely wrong, revert it.
This reverts commit
7bd9fb058d77213130e4b3e594115c028b708e7e.
Cc: Hui Wang <hui.wang@canonical.com>
Fixes: 7bd9fb058d77 ("Bluetooth: btusb: Fix the autosuspend enable and disable")
Acked-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 9 Apr 2021 01:57:47 +0000 (18:57 -0700)]
Merge tag '5.12-rc6-smb3' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three cifs/smb3 fixes, two for stable: a reconnect fix and a fix for
display of devnames with special characters"
* tag '5.12-rc6-smb3' of git://git.samba.org/sfrench/cifs-2.6:
cifs: escape spaces in share names
fs: cifs: Remove unnecessary struct declaration
cifs: On cifs_reconnect, resolve the hostname again.
Dave Airlie [Fri, 9 Apr 2021 00:33:11 +0000 (10:33 +1000)]
Merge tag 'drm-msm-fixes-2021-04-02' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
some more minor fixes:
- a5xx/a6xx timestamp fix
- microcode version check
- fail path fix
- block programming fix
- error removal fix.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGsMj7Nv3vVaVWMxPy8Y=Z_SnZmVKhKgKDxDYTr9rGN_+w@mail.gmail.com
Muhammad Usama Anjum [Thu, 8 Apr 2021 22:01:29 +0000 (03:01 +0500)]
net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh
nlh is being checked for validtity two times when it is dereferenced in
this function. Check for validity again when updating the flags through
nlh pointer to make the dereferencing safe.
CC: <stable@vger.kernel.org>
Addresses-Coverity: ("NULL pointer dereference")
Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 8 Apr 2021 23:38:23 +0000 (16:38 -0700)]
Merge branch 'lantiq-GSWIP-fixes'
Martin Blumenstingl says:
====================
lantiq: GSWIP: two more fixes
after my last patch got accepted and is now in net as commit
3e6fdeb28f4c33 ("net: dsa: lantiq_gswip: Let GSWIP automatically set
the xMII clock") [0] some more people from the OpenWrt community
(many thanks to everyone involved) helped test the GSWIP driver: [1]
It turns out that the previous fix does not work for all boards.
There's no regression, but it doesn't fix as many problems as I
thought. This is why two more fixes are needed:
- the first one solves many (four known but probably there are
a few extra hidden ones) reported bugs with the GSWIP where no
traffic would flow. Not all circumstances are fully understood
but testing shows that switching away from PHY auto polling
solves all of them
- while investigating the different problems which are addressed
by the first patch some small issues with the existing code were
found. These are addressed by the second patch
Changes since v1 at [0]:
- Don't configure the link parameters in gswip_phylink_mac_config
(as we're using the "modern" way in gswip_phylink_mac_link_up).
Thanks to Andrew for the hint with the phylink documentation.
- Clarify that GSWIP_MII_CFG_RMII_CLK is ignored by the hardware in
the description of the second patch as suggested by Hauke
- Don't set GSWIP_MII_CFG_RGMII_IBS in the second patch as we don't
have any hardware available for testing this. The patch
description now also reflects this.
- Added Andrew's Reviewed-by to the first patch (thank you!)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Blumenstingl [Thu, 8 Apr 2021 18:38:28 +0000 (20:38 +0200)]
net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits
There are a few more bits in the GSWIP_MII_CFG register for which we
did rely on the boot-loader (or the hardware defaults) to set them up
properly.
For some external RMII PHYs we need to select the GSWIP_MII_CFG_RMII_CLK
bit and also we should un-set it for non-RMII PHYs. The
GSWIP_MII_CFG_RMII_CLK bit is ignored for other PHY connection modes.
The GSWIP IP also supports in-band auto-negotiation for RGMII PHYs when
the GSWIP_MII_CFG_RGMII_IBS bit is set. Clear this bit always as there's
no known hardware which uses this (so it is not tested yet).
Clear the xMII isolation bit when set at initialization time if it was
previously set by the bootloader. Not doing so could lead to no traffic
(neither RX nor TX) on a port with this bit set.
While here, also add the GSWIP_MII_CFG_RESET bit. We don't need to
manage it because this bit is self-clearning when set. We still add it
here to get a better overview of the GSWIP_MII_CFG register.
Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Cc: stable@vger.kernel.org
Suggested-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Blumenstingl [Thu, 8 Apr 2021 18:38:27 +0000 (20:38 +0200)]
net: dsa: lantiq_gswip: Don't use PHY auto polling
PHY auto polling on the GSWIP hardware can be used so link changes
(speed, link up/down, etc.) can be detected automatically. Internally
GSWIP reads the PHY's registers for this functionality. Based on this
automatic detection GSWIP can also automatically re-configure it's port
settings. Unfortunately this auto polling (and configuration) mechanism
seems to cause various issues observed by different people on different
devices:
- FritzBox 7360v2: the two Gbit/s ports (connected to the two internal
PHY11G instances) are working fine but the two Fast Ethernet ports
(using an AR8030 RMII PHY) are completely dead (neither RX nor TX are
received). It turns out that the AR8030 PHY sets the BMSR_ESTATEN bit
as well as the ESTATUS_1000_TFULL and ESTATUS_1000_XFULL bits. This
makes the PHY auto polling state machine (rightfully?) think that the
established link speed (when the other side is Gbit/s capable) is
1Gbit/s.
- None of the Ethernet ports on the Zyxel P-2812HNU-F1 (two are
connected to the internal PHY11G GPHYs while the other three are
external RGMII PHYs) are working. Neither RX nor TX traffic was
observed. It is not clear which part of the PHY auto polling state-
machine caused this.
- FritzBox 7412 (only one LAN port which is connected to one of the
internal GPHYs running in PHY22F / Fast Ethernet mode) was seeing
random disconnects (link down events could be seen). Sometimes all
traffic would stop after such disconnect. It is not clear which part
of the PHY auto polling state-machine cauased this.
- TP-Link TD-W9980 (two ports are connected to the internal GPHYs
running in PHY11G / Gbit/s mode, the other two are external RGMII
PHYs) was affected by similar issues as the FritzBox 7412 just without
the "link down" events
Switch to software based configuration instead of PHY auto polling (and
letting the GSWIP hardware configure the ports automatically) for the
following link parameters:
- link up/down
- link speed
- full/half duplex
- flow control (RX / TX pause)
After a big round of manual testing by various people (who helped test
this on OpenWrt) it turns out that this fixes all reported issues.
Additionally it can be considered more future proof because any
"quirk" which is implemented for a PHY on the driver side can now be
used with the GSWIP hardware as well because Linux is in control of the
link parameters.
As a nice side-effect this also solves a problem where fixed-links were
not supported previously because we were relying on the PHY auto polling
mechanism, which cannot work for fixed-links as there's no PHY from
where it can read the registers. Configuring the link settings on the
GSWIP ports means that we now use the settings from device-tree also for
ports with fixed-links.
Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Fixes: 3e6fdeb28f4c33 ("net: dsa: lantiq_gswip: Let GSWIP automatically set the xMII clock")
Cc: stable@vger.kernel.org
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 8 Apr 2021 22:51:11 +0000 (15:51 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Nothing very exciting here, just a few small bug fixes. No red flags
for this release have shown up.
- Regression from the last pull request in cxgb4 related to the ipv6
fixes
- KASAN crasher in rtrs
- oops in hfi1 related to a buggy BIOS
- Userspace could oops qedr's XRC support
- Uninitialized memory when parsing a LS_NLA_TYPE_DGID netlink
message"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/addr: Be strict with gid size
RDMA/qedr: Fix kernel panic when trying to access recv_cq
IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS
RDMA/cxgb4: check for ipv6 address properly while destroying listener
RDMA/rtrs-clt: Close rtrs client conn before destroying rtrs clt session files
Frank Rowand [Thu, 8 Apr 2021 20:45:08 +0000 (15:45 -0500)]
of: unittest: overlay: ensure proper alignment of copied FDT
The Devicetree standard specifies an 8 byte alignment of the FDT.
Code in libfdt expects this alignment for an FDT image in memory.
kmemdup() returns 4 byte alignment on openrisc. Replace kmemdup()
with kmalloc(), align pointer, memcpy() to get proper alignment.
The 4 byte alignment exposed a related bug which triggered a crash
on openrisc with:
commit
79edff12060f ("scripts/dtc: Update to upstream version
v1.6.0-51-g183df9e9c2b9")
as reported in:
https://lore.kernel.org/lkml/
20210327224116.69309-1-linux@roeck-us.net/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Link: https://lore.kernel.org/r/20210408204508.2276230-1-frowand.list@gmail.com
Signed-off-by: Rob Herring <robh@kernel.org>
David S. Miller [Thu, 8 Apr 2021 21:21:40 +0000 (14:21 -0700)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-04-08
This series contains updates to i40e and ice drivers.
Grzegorz fixes the ordering of parameters to i40e_aq_get_phy_register()
which is causing incorrect information to be reported.
Arkadiusz fixes various sparse issues reported on the i40e driver.
Yongxin Liu fixes a memory leak with aRFS following resume from suspend
for ice driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Tikhomirov [Thu, 8 Apr 2021 15:14:31 +0000 (18:14 +0300)]
net: sched: sch_teql: fix null-pointer dereference
Reproduce:
modprobe sch_teql
tc qdisc add dev teql0 root teql0
This leads to (for instance in Centos 7 VM) OOPS:
[ 532.366633] BUG: unable to handle kernel NULL pointer dereference at
00000000000000a8
[ 532.366733] IP: [<
ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql]
[ 532.366825] PGD
80000001376d5067 PUD
137e37067 PMD 0
[ 532.366906] Oops: 0000 [#1] SMP
[ 532.366987] Modules linked in: sch_teql ...
[ 532.367945] CPU: 1 PID: 3026 Comm: tc Kdump: loaded Tainted: G ------------ T 3.10.0-1062.7.1.el7.x86_64 #1
[ 532.368041] Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.2 04/01/2014
[ 532.368125] task:
ffff8b7d37d31070 ti:
ffff8b7c9fdbc000 task.ti:
ffff8b7c9fdbc000
[ 532.368224] RIP: 0010:[<
ffffffffc06124a8>] [<
ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql]
[ 532.368320] RSP: 0018:
ffff8b7c9fdbf8e0 EFLAGS:
00010286
[ 532.368394] RAX:
ffffffffc0612490 RBX:
ffff8b7cb1565e00 RCX:
ffff8b7d35ba2000
[ 532.368476] RDX:
ffff8b7d35ba2000 RSI:
0000000000000000 RDI:
ffff8b7cb1565e00
[ 532.368557] RBP:
ffff8b7c9fdbf8f8 R08:
ffff8b7d3fd1f140 R09:
ffff8b7d3b001600
[ 532.368638] R10:
ffff8b7d3b001600 R11:
ffffffff84c7d65b R12:
00000000ffffffd8
[ 532.368719] R13:
0000000000008000 R14:
ffff8b7d35ba2000 R15:
ffff8b7c9fdbf9a8
[ 532.368800] FS:
00007f6a4e872740(0000) GS:
ffff8b7d3fd00000(0000) knlGS:
0000000000000000
[ 532.368885] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 532.368961] CR2:
00000000000000a8 CR3:
00000001396ee000 CR4:
00000000000206e0
[ 532.369046] Call Trace:
[ 532.369159] [<
ffffffff84c8192e>] qdisc_create+0x36e/0x450
[ 532.369268] [<
ffffffff846a9b49>] ? ns_capable+0x29/0x50
[ 532.369366] [<
ffffffff849afde2>] ? nla_parse+0x32/0x120
[ 532.369442] [<
ffffffff84c81b4c>] tc_modify_qdisc+0x13c/0x610
[ 532.371508] [<
ffffffff84c693e7>] rtnetlink_rcv_msg+0xa7/0x260
[ 532.372668] [<
ffffffff84907b65>] ? sock_has_perm+0x75/0x90
[ 532.373790] [<
ffffffff84c69340>] ? rtnl_newlink+0x890/0x890
[ 532.374914] [<
ffffffff84c8da7b>] netlink_rcv_skb+0xab/0xc0
[ 532.376055] [<
ffffffff84c63708>] rtnetlink_rcv+0x28/0x30
[ 532.377204] [<
ffffffff84c8d400>] netlink_unicast+0x170/0x210
[ 532.378333] [<
ffffffff84c8d7a8>] netlink_sendmsg+0x308/0x420
[ 532.379465] [<
ffffffff84c2f3a6>] sock_sendmsg+0xb6/0xf0
[ 532.380710] [<
ffffffffc034a56e>] ? __xfs_filemap_fault+0x8e/0x1d0 [xfs]
[ 532.381868] [<
ffffffffc034a75c>] ? xfs_filemap_fault+0x2c/0x30 [xfs]
[ 532.383037] [<
ffffffff847ec23a>] ? __do_fault.isra.61+0x8a/0x100
[ 532.384144] [<
ffffffff84c30269>] ___sys_sendmsg+0x3e9/0x400
[ 532.385268] [<
ffffffff847f3fad>] ? handle_mm_fault+0x39d/0x9b0
[ 532.386387] [<
ffffffff84d88678>] ? __do_page_fault+0x238/0x500
[ 532.387472] [<
ffffffff84c31921>] __sys_sendmsg+0x51/0x90
[ 532.388560] [<
ffffffff84c31972>] SyS_sendmsg+0x12/0x20
[ 532.389636] [<
ffffffff84d8dede>] system_call_fastpath+0x25/0x2a
[ 532.390704] [<
ffffffff84d8de21>] ? system_call_after_swapgs+0xae/0x146
[ 532.391753] Code: 00 00 00 00 00 00 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b b7 48 01 00 00 48 89 fb <48> 8b 8e a8 00 00 00 48 85 c9 74 43 48 89 ca eb 0f 0f 1f 80 00
[ 532.394036] RIP [<
ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql]
[ 532.395127] RSP <
ffff8b7c9fdbf8e0>
[ 532.396179] CR2:
00000000000000a8
Null pointer dereference happens on master->slaves dereference in
teql_destroy() as master is null-pointer.
When qdisc_create() calls teql_qdisc_init() it imediately fails after
check "if (m->dev == dev)" because both devices are teql0, and it does
not set qdisc_priv(sch)->m leaving it zero on error path, then
qdisc_create() imediately calls teql_destroy() which does not expect
zero master pointer and we get OOPS.
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 8 Apr 2021 21:10:53 +0000 (14:10 -0700)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2021-04-08
The following pull-request contains BPF updates for your *net* tree.
We've added 4 non-merge commits during the last 2 day(s) which contain
a total of 4 files changed, 31 insertions(+), 10 deletions(-).
The main changes are:
1) Validate and reject invalid JIT branch displacements, from Piotr Krysiuk.
2) Fix incorrect unhash restore as well as fwd_alloc memory accounting in
sock map, from John Fastabend.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 8 Apr 2021 21:08:37 +0000 (14:08 -0700)]
Merge tag 'mac80211-for-net-2021-04-08.2' of git://git./linux/kernel/git/jberg/mac80211
Johannes berg says:
====================
Various small fixes:
* S1G beacon validation
* potential leak in nl80211
* fast-RX confusion with 4-addr mode
* erroneous WARN_ON that userspace can trigger
* wrong time units in virt_wifi
* rfkill userspace API breakage
* TXQ AC confusing that led to traffic stopped forever
* connection monitoring time after/before confusion
* netlink beacon head validation buffer overrun
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Wed, 7 Apr 2021 15:59:12 +0000 (08:59 -0700)]
ipv6: report errors for iftoken via netlink extack
Setting iftoken can fail for several different reasons but there
and there was no report to user as to the cause. Add netlink
extended errors to the processing of the request.
This requires adding additional argument through rtnl_af_ops
set_link_af callback.
Reported-by: Hongren Zheng <li@zenithal.me>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 8 Apr 2021 20:47:34 +0000 (13:47 -0700)]
Merge branch 'net-sched-action-init-fixes'
Vlad Buslov says:
====================
Action initalization fixes
This series fixes reference counting of action instances and modules in
several parts of action init code. The first patch reverts previous fix
that didn't properly account for rollback from a failure in the middle of
the loop in tcf_action_init() which is properly fixed by the following
patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Wed, 7 Apr 2021 15:36:04 +0000 (18:36 +0300)]
net: sched: fix err handler in tcf_action_init()
With recent changes that separated action module load from action
initialization tcf_action_init() function error handling code was modified
to manually release the loaded modules if loading/initialization of any
further action in same batch failed. For the case when all modules
successfully loaded and some of the actions were initialized before one of
them failed in init handler. In this case for all previous actions the
module will be released twice by the error handler: First time by the loop
that manually calls module_put() for all ops, and second time by the action
destroy code that puts the module after destroying the action.
Reproduction:
$ sudo tc actions add action simple sdata \"2\" index 2
$ sudo tc actions add action simple sdata \"1\" index 1 \
action simple sdata \"2\" index 2
RTNETLINK answers: File exists
We have an error talking to the kernel
$ sudo tc actions ls action simple
total acts 1
action order 0: Simple <"2">
index 2 ref 1 bind 0
$ sudo tc actions flush action simple
$ sudo tc actions ls action simple
$ sudo tc actions add action simple sdata \"2\" index 2
Error: Failed to load TC action module.
We have an error talking to the kernel
$ lsmod | grep simple
act_simple 20480 -1
Fix the issue by modifying module reference counting handling in action
initialization code:
- Get module reference in tcf_idr_create() and put it in tcf_idr_release()
instead of taking over the reference held by the caller.
- Modify users of tcf_action_init_1() to always release the module
reference which they obtain before calling init function instead of
assuming that created action takes over the reference.
- Finally, modify tcf_action_init_1() to not release the module reference
when overwriting existing action as this is no longer necessary since both
upper and lower layers obtain and manage their own module references
independently.
Fixes: d349f9976868 ("net_sched: fix RTNL deadlock again caused by request_module()")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Wed, 7 Apr 2021 15:36:03 +0000 (18:36 +0300)]
net: sched: fix action overwrite reference counting
Action init code increments reference counter when it changes an action.
This is the desired behavior for cls API which needs to obtain action
reference for every classifier that points to action. However, act API just
needs to change the action and releases the reference before returning.
This sequence breaks when the requested action doesn't exist, which causes
act API init code to create new action with specified index, but action is
still released before returning and is deleted (unless it was referenced
concurrently by cls API).
Reproduction:
$ sudo tc actions ls action gact
$ sudo tc actions change action gact drop index 1
$ sudo tc actions ls action gact
Extend tcf_action_init() to accept 'init_res' array and initialize it with
action->ops->init() result. In tcf_action_add() remove pointers to created
actions from actions array before passing it to tcf_action_put_many().
Fixes: cae422f379f3 ("net: sched: use reference counting action init")
Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Wed, 7 Apr 2021 15:36:02 +0000 (18:36 +0300)]
Revert "net: sched: bump refcount for new action in ACT replace mode"
This reverts commit
6855e8213e06efcaf7c02a15e12b1ae64b9a7149.
Following commit in series fixes the issue without introducing regression
in error rollback of tcf_action_destroy().
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Thu, 8 Apr 2021 00:54:42 +0000 (01:54 +0100)]
io-wq: cancel unbounded works on io-wq destroy
WARNING: CPU: 5 PID: 227 at fs/io_uring.c:8578 io_ring_exit_work+0xe6/0x470
RIP: 0010:io_ring_exit_work+0xe6/0x470
Call Trace:
process_one_work+0x206/0x400
worker_thread+0x4a/0x3d0
kthread+0x129/0x170
ret_from_fork+0x22/0x30
INFO: task lfs-openat:2359 blocked for more than 245 seconds.
task:lfs-openat state:D stack: 0 pid: 2359 ppid: 1 flags:0x00000004
Call Trace:
...
wait_for_completion+0x8b/0xf0
io_wq_destroy_manager+0x24/0x60
io_wq_put_and_exit+0x18/0x30
io_uring_clean_tctx+0x76/0xa0
__io_uring_files_cancel+0x1b9/0x2e0
do_exit+0xc0/0xb40
...
Even after io-wq destroy has been issued io-wq worker threads will
continue executing all left work items as usual, and may hang waiting
for I/O that won't ever complete (aka unbounded).
[<0>] pipe_read+0x306/0x450
[<0>] io_iter_do_read+0x1e/0x40
[<0>] io_read+0xd5/0x330
[<0>] io_issue_sqe+0xd21/0x18a0
[<0>] io_wq_submit_work+0x6c/0x140
[<0>] io_worker_handle_work+0x17d/0x400
[<0>] io_wqe_worker+0x2c0/0x330
[<0>] ret_from_fork+0x22/0x30
Cancel all unbounded I/O instead of executing them. This changes the
user visible behaviour, but that's inevitable as io-wq is not per task.
Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/cd4b543154154cba055cf86f351441c2174d7f71.1617842918.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 8 Apr 2021 18:28:03 +0000 (19:28 +0100)]
io_uring: fix rw req completion
WARNING: at fs/io_uring.c:8578 io_ring_exit_work.cold+0x0/0x18
As reissuing is now passed back by REQ_F_REISSUE and kiocb_done()
internally uses __io_complete_rw(), it may stop after setting the flag
so leaving a dangling request.
There are tricky edge cases, e.g. reading beyound file, boundary, so
the easiest way is to hand code reissue in kiocb_done() as
__io_complete_rw() was doing for us before.
Fixes: 230d50d448ac ("io_uring: move reissue into regular IO path")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f602250d292f8a84cca9a01d747744d1e797be26.1617842918.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Leon Romanovsky [Mon, 5 Apr 2021 07:44:34 +0000 (10:44 +0300)]
RDMA/addr: Be strict with gid size
The nla_len() is less than or equal to 16. If it's less than 16 then end
of the "gid" buffer is uninitialized.
Fixes: ae43f8286730 ("IB/core: Add IP to GID netlink offload")
Link: https://lore.kernel.org/r/20210405074434.264221-1-leon@kernel.org
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Linus Torvalds [Thu, 8 Apr 2021 18:09:25 +0000 (11:09 -0700)]
Merge tag 's390-5.12-6' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- fix incorrect dereference of the ext_params2 external interrupt
parameter, which leads to an instant kernel crash if a pfault
interrupt occurs.
- add forgotten stack unwinder support, and fix memory leak for the
new machine check handler stack.
- fix inline assembly register clobbering due to KASAN code
instrumentation.
* tag 's390-5.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/setup: use memblock_free_late() to free old stack
s390/irq: fix reading of ext_params2 field from lowcore
s390/unwind: add machine check handler stack
s390/cpcmd: fix inline assembly register clobbering
Yongxin Liu [Thu, 1 Apr 2021 18:59:15 +0000 (11:59 -0700)]
ice: fix memory leak of aRFS after resuming from suspend
In ice_suspend(), ice_clear_interrupt_scheme() is called, and then
irq_free_descs() will be eventually called to free irq and its descriptor.
In ice_resume(), ice_init_interrupt_scheme() is called to allocate new
irqs. However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap
maybe cannot be freed, if the irqs that released in ice_suspend() were
reassigned to other devices, which makes irq descriptor's affinity_notify
lost.
So call ice_free_cpu_rx_rmap() before ice_clear_interrupt_scheme(), which
can make sure all irq_glue and cpu_rmap can be correctly released before
corresponding irq and descriptor are released.
Fix the following memory leak.
unreferenced object 0xffff95bd951afc00 (size 512):
comm "kworker/0:1", pid 134, jiffies
4294684283 (age 13051.958s)
hex dump (first 32 bytes):
18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff ........p.......
00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff ................
backtrace:
[<
0000000072e4b914>] __kmalloc+0x336/0x540
[<
0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0
[<
00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice]
[<
000000002370a632>] ice_probe+0x941/0x1180 [ice]
[<
00000000d692edba>] local_pci_probe+0x47/0xa0
[<
00000000503934f0>] work_for_cpu_fn+0x1a/0x30
[<
00000000555a9e4a>] process_one_work+0x1dd/0x410
[<
000000002c4b414a>] worker_thread+0x221/0x3f0
[<
00000000bb2b556b>] kthread+0x14c/0x170
[<
00000000ad2cf1cd>] ret_from_fork+0x1f/0x30
unreferenced object 0xffff95bd81b0a2a0 (size 96):
comm "kworker/0:1", pid 134, jiffies
4294684283 (age 13051.958s)
hex dump (first 32 bytes):
38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 8...............
b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff ................
backtrace:
[<
00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0
[<
000000002659850d>] irq_cpu_rmap_add+0x25/0xe0
[<
00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice]
[<
000000002370a632>] ice_probe+0x941/0x1180 [ice]
[<
00000000d692edba>] local_pci_probe+0x47/0xa0
[<
00000000503934f0>] work_for_cpu_fn+0x1a/0x30
[<
00000000555a9e4a>] process_one_work+0x1dd/0x410
[<
000000002c4b414a>] worker_thread+0x221/0x3f0
[<
00000000bb2b556b>] kthread+0x14c/0x170
[<
00000000ad2cf1cd>] ret_from_fork+0x1f/0x30
Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL")
Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Arkadiusz Kubalewski [Fri, 26 Mar 2021 18:43:43 +0000 (19:43 +0100)]
i40e: Fix sparse warning: missing error code 'err'
Set proper return values inside error checking if-statements.
Previously following warning was produced when compiling against sparse.
i40e_main.c:15162 i40e_init_recovery_mode() warn: missing error code 'err'
Fixes: 4ff0ee1af0169 ("i40e: Introduce recovery mode support")
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Arkadiusz Kubalewski [Fri, 26 Mar 2021 18:43:42 +0000 (19:43 +0100)]
i40e: Fix sparse error: 'vsi->netdev' could be null
Remove vsi->netdev->name from the trace.
This is redundant information. With the devinfo trace, the adapter
is already identifiable.
Previously following error was produced when compiling against sparse.
i40e_main.c:2571 i40e_sync_vsi_filters() error:
we previously assumed 'vsi->netdev' could be null (see line 2323)
Fixes: b603f9dc20af ("i40e: Log info when PF is entering and leaving Allmulti mode.")
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Arkadiusz Kubalewski [Fri, 26 Mar 2021 18:43:41 +0000 (19:43 +0100)]
i40e: Fix sparse error: uninitialized symbol 'ring'
Init pointer with NULL in default switch case statement.
Previously the error was produced when compiling against sparse.
i40e_debugfs.c:582 i40e_dbg_dump_desc() error: uninitialized symbol 'ring'.
Fixes: 44ea803e2fa7 ("i40e: introduce new dump desc XDP command")
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Arkadiusz Kubalewski [Fri, 26 Mar 2021 18:43:40 +0000 (19:43 +0100)]
i40e: Fix sparse errors in i40e_txrx.c
Remove error handling through pointers. Instead use plain int
to return value from i40e_run_xdp(...).
Previously:
- sparse errors were produced during compilation:
i40e_txrx.c:2338 i40e_run_xdp() error: (-
2147483647) too low for ERR_PTR
i40e_txrx.c:2558 i40e_clean_rx_irq() error: 'skb' dereferencing possible ERR_PTR()
- sk_buff* was used to return value, but it has never had valid
pointer to sk_buff. Returned value was always int handled as
a pointer.
Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
Fixes: 2e6893123830 ("i40e: split XDP_TX tail and XDP_REDIRECT map flushing")
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Siwik [Wed, 24 Mar 2021 08:58:27 +0000 (09:58 +0100)]
i40e: Fix parameters in aq_get_phy_register()
Change parameters order in aq_get_phy_register() due to wrong
statistics in PHY reported by ethtool. Previously all PHY statistics were
exactly the same for all interfaces
Now statistics are reported correctly - different for different interfaces
Fixes: 0514db37dd78 ("i40e: Extend PHY access with page change flag")
Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Linus Torvalds [Thu, 8 Apr 2021 16:01:30 +0000 (09:01 -0700)]
Merge tag 'sound-5.12-rc7' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This batch became unexpectedly bigger due to the pending ASoC patches,
but all look small and fine device-specific fixes.
Many of the commits are for ASoC Intel drivers, while the rest are for
ASoC small codec/platform fixes and HD-audio quirks"
* tag 'sound-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
ALSA: hda/realtek: Fix speaker amp setup on Acer Aspire E1
ALSA: aloop: Fix initialization of controls
ALSA: hda/conexant: Apply quirk for another HP ZBook G5 model
ASoC: fsl_esai: Fix TDM slot setup for I2S mode
ASoC: codecs: lpass-rx-macro: set npl clock rate correctly
ASoC: codecs: lpass-tx-macro: set npl clock rate correctly
ASoC: sunxi: sun4i-codec: fill ASoC card owner
ASoC: cygnus: fix for_each_child.cocci warnings
ASoC: max98373: Added 30ms turn on/off time delay
ASoC: max98373: Changed amp shutdown register as volatile
ASoC: intel: atom: Remove 44100 sample-rate from the media and deep-buffer DAI descriptions
ASoC: intel: atom: Stop advertising non working S24LE support
ASoC: wm8960: Fix wrong bclk and lrclk with pll enabled for some chips
ASoC: SOF: Intel: move ELH chip info
ASoC: SOF: Intel: APL: set shutdown callback to hda_dsp_shutdown
ASoC: SOF: Intel: CNL: set shutdown callback to hda_dsp_shutdown
ASoC: SOF: Intel: ICL: set shutdown callback to hda_dsp_shutdown
ASoC: SOF: Intel: TGL: set shutdown callback to hda_dsp_shutdown
ASoC: SOF: Intel: TGL: fix EHL ops
ASoC: SOF: core: harden shutdown helper
...
Linus Torvalds [Thu, 8 Apr 2021 15:54:26 +0000 (08:54 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fix from Paolo Bonzini:
"A lone x86 patch, for a bug found while developing a backport to
stable versions"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/mmu: preserve pending TLB flush across calls to kvm_tdp_mmu_zap_sp
Linus Torvalds [Thu, 8 Apr 2021 15:46:53 +0000 (08:46 -0700)]
Merge tag 'for-linus-2021-04-08' of git://git./linux/kernel/git/brauner/linux
Pull close_range() fix from Christian Brauner:
"Syzbot reported a bug in close_range.
Debugging this showed we didn't recalculate the current maximum fd
number for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC after we unshared
the file descriptors table. As a result, max_fd could exceed the
current fdtable maximum causing us to set excessive bits.
As a concrete example, let's say the user requested everything from fd
4 to ~0UL to be closed and their current fdtable size is 256 with
their highest open fd being 4. With CLOSE_RANGE_UNSHARE the caller
will end up with a new fdtable which has room for 64 file descriptors
since that is the lowest fdtable size we accept. But now max_fd will
still point to 255 and needs to be adjusted. Fix this by retrieving
the correct maximum fd value in __range_cloexec().
I've carried this fix for a little while but since there was no
linux-next release over easter I waited until now.
With this change close_range() can be further simplified but imho we
are in no hurry to do that and so I'll defer this for the 5.13 merge
window"
* tag 'for-linus-2021-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
file: fix close_range() for unshare+cloexec
Linus Torvalds [Thu, 8 Apr 2021 15:26:06 +0000 (08:26 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs
Pull umount fix from Al Viro:
"Brown paperbag time: dumb braino in the series that went into 5.7
broke the 'don't step into ->d_weak_revalidate() when umount(2) looks
the victim up' behaviour.
Spotted only now - saw
if (!err && unlikely(nd->flags & LOOKUP_MOUNTPOINT)) {
err = handle_lookup_down(nd);
nd->flags &= ~LOOKUP_JUMPED; // no d_weak_revalidate(), please...
}
and went "why do we clear that flag here - nothing below that point is
going to check it anyway" / "wait a minute, what is it doing *after*
complete_walk() (which is where we check that flag and call
->d_weak_revalidate())" / "how could that possibly _not_ break?",
followed by reproducing the breakage and verifying that the obvious
fix of that braino does, indeed, fix it.
The reproducer is (assuming that $DIR exists and is exported r/w to
localhost)
mkdir $DIR/a
mkdir /tmp/foo
mount --bind /tmp/foo /tmp/foo
mkdir /tmp/foo/a
mkdir /tmp/foo/b
mount -t nfs4 localhost:$DIR/a /tmp/foo/a
mount -t nfs4 localhost:$DIR /tmp/foo/b
rmdir /tmp/foo/b/a
umount /tmp/foo/b
umount /tmp/foo/a
umount -l /tmp/foo # will get everything under /tmp/foo, no matter what
Correct behaviour is successful umount; broken kernels (5.7-rc1 and
later) get
umount.nfs4: /tmp/foo/a: Stale file handle
Note that bind mount is there to be able to recover - on broken
kernels we'd get stuck with impossible-to-umount filesystem if not for
that.
FWIW, that braino had been posted for review back then, at least
twice. Unfortunately, the call of complete_walk() was outside of diff
context, so the bogosity hadn't been immediately obvious from the
patch alone ;-/"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
LOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late
Johannes Berg [Thu, 8 Apr 2021 13:45:20 +0000 (15:45 +0200)]
nl80211: fix beacon head validation
If the beacon head attribute (NL80211_ATTR_BEACON_HEAD)
is too short to even contain the frame control field,
we access uninitialized data beyond the buffer. Fix this
by checking the minimal required size first. We used to
do this until S1G support was added, where the fixed
data portion has a different size.
Reported-and-tested-by: syzbot+72b99dcf4607e8c770f3@syzkaller.appspotmail.com
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 1d47f1198d58 ("nl80211: correctly validate S1G beacon head")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20210408154518.d9b06d39b4ee.Iff908997b2a4067e8d456b3cb96cab9771d252b8@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Piotr Krysiuk [Tue, 6 Apr 2021 20:59:39 +0000 (21:59 +0100)]
bpf, x86: Validate computation of branch displacements for x86-32
The branch displacement logic in the BPF JIT compilers for x86 assumes
that, for any generated branch instruction, the distance cannot
increase between optimization passes.
But this assumption can be violated due to how the distances are
computed. Specifically, whenever a backward branch is processed in
do_jit(), the distance is computed by subtracting the positions in the
machine code from different optimization passes. This is because part
of addrs[] is already updated for the current optimization pass, before
the branch instruction is visited.
And so the optimizer can expand blocks of machine code in some cases.
This can confuse the optimizer logic, where it assumes that a fixed
point has been reached for all machine code blocks once the total
program size stops changing. And then the JIT compiler can output
abnormal machine code containing incorrect branch displacements.
To mitigate this issue, we assert that a fixed point is reached while
populating the output image. This rejects any problematic programs.
The issue affects both x86-32 and x86-64. We mitigate separately to
ease backporting.
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Piotr Krysiuk [Mon, 5 Apr 2021 21:52:15 +0000 (22:52 +0100)]
bpf, x86: Validate computation of branch displacements for x86-64
The branch displacement logic in the BPF JIT compilers for x86 assumes
that, for any generated branch instruction, the distance cannot
increase between optimization passes.
But this assumption can be violated due to how the distances are
computed. Specifically, whenever a backward branch is processed in
do_jit(), the distance is computed by subtracting the positions in the
machine code from different optimization passes. This is because part
of addrs[] is already updated for the current optimization pass, before
the branch instruction is visited.
And so the optimizer can expand blocks of machine code in some cases.
This can confuse the optimizer logic, where it assumes that a fixed
point has been reached for all machine code blocks once the total
program size stops changing. And then the JIT compiler can output
abnormal machine code containing incorrect branch displacements.
To mitigate this issue, we assert that a fixed point is reached while
populating the output image. This rejects any problematic programs.
The issue affects both x86-32 and x86-64. We mitigate separately to
ease backporting.
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Dom Cobley [Thu, 18 Mar 2021 16:13:28 +0000 (17:13 +0100)]
drm/vc4: crtc: Reduce PV fifo threshold on hvs4
Experimentally have found PV on hvs4 reports fifo full
error with expected settings and does not with one less
This appears as:
[drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:82:crtc-3] flip_done timed out
with bit 10 of PV_STAT set "HVS driving pixels when the PV FIFO is full"
Fixes: c8b75bca92cb ("drm/vc4: Add KMS support for Raspberry Pi.")
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210318161328.1471556-3-maxime@cerno.tech
Maxime Ripard [Thu, 18 Mar 2021 16:13:27 +0000 (17:13 +0100)]
drm/vc4: plane: Remove redundant assignment
The vc4_plane_atomic_async_update function assigns twice in a row the
src_h field in the drm_plane_state structure to the same value. Remove
the second one.
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210318161328.1471556-2-maxime@cerno.tech
Johannes Berg [Thu, 8 Apr 2021 12:28:34 +0000 (14:28 +0200)]
nl80211: fix potential leak of ACL params
In case nl80211_parse_unsol_bcast_probe_resp() results in an
error, need to "goto out" instead of just returning to free
possibly allocated data.
Fixes: 7443dcd1f171 ("nl80211: Unsolicited broadcast probe response support")
Link: https://lore.kernel.org/r/20210408142833.d8bc2e2e454a.If290b1ba85789726a671ff0b237726d4851b5b0f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Thu, 8 Apr 2021 12:28:27 +0000 (14:28 +0200)]
cfg80211: check S1G beacon compat element length
We need to check the length of this element so that we don't
access data beyond its end. Fix that.
Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results")
Link: https://lore.kernel.org/r/20210408142826.f6f4525012de.I9fdeff0afdc683a6024e5ea49d2daa3cd2459d11@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Paolo Bonzini [Tue, 6 Apr 2021 15:08:51 +0000 (11:08 -0400)]
KVM: x86/mmu: preserve pending TLB flush across calls to kvm_tdp_mmu_zap_sp
Right now, if a call to kvm_tdp_mmu_zap_sp returns false, the caller
will skip the TLB flush, which is wrong. There are two ways to fix
it:
- since kvm_tdp_mmu_zap_sp will not yield and therefore will not flush
the TLB itself, we could change the call to kvm_tdp_mmu_zap_sp to
use "flush |= ..."
- or we can chain the flush argument through kvm_tdp_mmu_zap_sp down
to __kvm_tdp_mmu_zap_gfn_range. Note that kvm_tdp_mmu_zap_sp will
neither yield nor flush, so flush would never go from true to
false.
This patch does the former to simplify application to stable kernels,
and to make it further clearer that kvm_tdp_mmu_zap_sp will not flush.
Cc: seanjc@google.com
Fixes: 048f49809c526 ("KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping")
Cc: <stable@vger.kernel.org> # 5.10.x: 048f49809c: KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping
Cc: <stable@vger.kernel.org> # 5.10.x: 33a3164161: KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages
Cc: <stable@vger.kernel.org>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A. Cody Schuffelen [Thu, 18 Mar 2021 20:04:19 +0000 (13:04 -0700)]
virt_wifi: Return micros for BSS TSF values
cfg80211_inform_bss expects to receive a TSF value, but is given the
time since boot in nanoseconds. TSF values are expected to be at
microsecond scale rather than nanosecond scale.
Signed-off-by: A. Cody Schuffelen <schuffelen@google.com>
Link: https://lore.kernel.org/r/20210318200419.1421034-1-schuffelen@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Du Cheng [Wed, 7 Apr 2021 16:27:56 +0000 (00:27 +0800)]
cfg80211: remove WARN_ON() in cfg80211_sme_connect
A WARN_ON(wdev->conn) would trigger in cfg80211_sme_connect(), if multiple
send_msg(NL80211_CMD_CONNECT) system calls are made from the userland, which
should be anticipated and handled by the wireless driver. Remove this WARN_ON()
to prevent kernel panic if kernel is configured to "panic_on_warn".
Bug reported by syzbot.
Reported-by: syzbot+5f9392825de654244975@syzkaller.appspotmail.com
Signed-off-by: Du Cheng <ducheng2@gmail.com>
Link: https://lore.kernel.org/r/20210407162756.6101-1-ducheng2@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ben Greear [Tue, 30 Mar 2021 23:07:49 +0000 (16:07 -0700)]
mac80211: fix time-is-after bug in mlme
The incorrect timeout check caused probing to happen when it did
not need to happen. This in turn caused tx performance drop
for around 5 seconds in ath10k-ct driver. Possibly that tx drop
is due to a secondary issue, but fixing the probe to not happen
when traffic is running fixes the symptom.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Fixes: 9abf4e49830d ("mac80211: optimize station connection monitor")
Acked-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210330230749.14097-1-greearb@candelatech.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Tue, 23 Mar 2021 20:05:01 +0000 (21:05 +0100)]
mac80211: fix TXQ AC confusion
Normally, TXQs have
txq->tid = tid;
txq->ac = ieee80211_ac_from_tid(tid);
However, the special management TXQ actually has
txq->tid = IEEE80211_NUM_TIDS; // 16
txq->ac = IEEE80211_AC_VO;
This makes sense, but ieee80211_ac_from_tid(16) is the same
as ieee80211_ac_from_tid(0) which is just IEEE80211_AC_BE.
Now, normally this is fine. However, if the netdev queues
were stopped, then the code in ieee80211_tx_dequeue() will
propagate the stop from the interface (vif->txqs_stopped[])
if the AC 2 (ieee80211_ac_from_tid(txq->tid)) is marked as
stopped. On wake, however, __ieee80211_wake_txqs() will wake
the TXQ if AC 0 (txq->ac) is woken up.
If a driver stops all queues with ieee80211_stop_tx_queues()
and then wakes them again with ieee80211_wake_tx_queues(),
the ieee80211_wake_txqs() tasklet will run to resync queue
and TXQ state. If all queues were woken, then what'll happen
is that _ieee80211_wake_txqs() will run in order of HW queues
0-3, typically (and certainly for iwlwifi) corresponding to
ACs 0-3, so it'll call __ieee80211_wake_txqs() for each AC in
order 0-3.
When __ieee80211_wake_txqs() is called for AC 0 (VO) that'll
wake up the management TXQ (remember its tid is 16), and the
driver's wake_tx_queue() will be called. That tries to get a
frame, which will immediately *stop* the TXQ again, because
now we check against AC 2, and AC 2 hasn't yet been marked as
woken up again in sdata->vif.txqs_stopped[] since we're only
in the __ieee80211_wake_txqs() call for AC 0.
Thus, the management TXQ will never be started again.
Fix this by checking txq->ac directly instead of calculating
the AC as ieee80211_ac_from_tid(txq->tid).
Fixes: adf8ed01e4fd ("mac80211: add an optional TXQ for other PS-buffered frames")
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20210323210500.bf4d50afea4a.I136ffde910486301f8818f5442e3c9bf8670a9c4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Fri, 19 Mar 2021 22:25:11 +0000 (23:25 +0100)]
rfkill: revert back to old userspace API by default
Recompiling with the new extended version of struct rfkill_event
broke systemd in *two* ways:
- It used "sizeof(struct rfkill_event)" to read the event, but
then complained if it actually got something != 8, this broke
it on new kernels (that include the updated API);
- It used sizeof(struct rfkill_event) to write a command, but
didn't implement the intended expansion protocol where the
kernel returns only how many bytes it accepted, and errored
out due to the unexpected smaller size on kernels that didn't
include the updated API.
Even though systemd has now been fixed, that fix may not be always
deployed, and other applications could potentially have similar
issues.
As such, in the interest of avoiding regressions, revert the
default API "struct rfkill_event" back to the original size.
Instead, add a new "struct rfkill_event_ext" that extends it by
the new field, and even more clearly document that applications
should be prepared for extensions in two ways:
* write might only accept fewer bytes on older kernels, and
will return how many to let userspace know which data may
have been ignored;
* read might return anything between 8 (the original size) and
whatever size the application sized its buffer at, indicating
how much event data was supported by the kernel.
Perhaps that will help avoid such issues in the future and we
won't have to come up with another version of the struct if we
ever need to extend it again.
Applications that want to take advantage of the new field will
have to be modified to use struct rfkill_event_ext instead now,
which comes with the danger of them having already been updated
to use it from 'struct rfkill_event', but I found no evidence
of that, and it's still relatively new.
Cc: stable@vger.kernel.org # 5.11
Reported-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v12.0.0-r4 (x86-64)
Link: https://lore.kernel.org/r/20210319232510.f1a139cfdd9c.Ic5c7c9d1d28972059e132ea653a21a427c326678@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Seevalamuthu Mariappan [Fri, 19 Mar 2021 14:18:52 +0000 (19:48 +0530)]
mac80211: clear sta->fast_rx when STA removed from 4-addr VLAN
In some race conditions, with more clients and traffic configuration,
below crash is seen when making the interface down. sta->fast_rx wasn't
cleared when STA gets removed from 4-addr AP_VLAN interface. The crash is
due to try accessing 4-addr AP_VLAN interface's net_device (fast_rx->dev)
which has been deleted already.
Resolve this by clearing sta->fast_rx pointer when STA removes
from a 4-addr VLAN.
[ 239.449529] Unable to handle kernel NULL pointer dereference at virtual address
00000004
[ 239.449531] pgd =
80204000
...
[ 239.481496] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.60 #227
[ 239.481591] Hardware name: Generic DT based system
[ 239.487665] task:
be05b700 ti:
be08e000 task.ti:
be08e000
[ 239.492360] PC is at get_rps_cpu+0x2d4/0x31c
[ 239.497823] LR is at 0xbe08fc54
...
[ 239.778574] [<
80739740>] (get_rps_cpu) from [<
8073cb10>] (netif_receive_skb_internal+0x8c/0xac)
[ 239.786722] [<
8073cb10>] (netif_receive_skb_internal) from [<
8073d578>] (napi_gro_receive+0x48/0xc4)
[ 239.795267] [<
8073d578>] (napi_gro_receive) from [<
c7b83e8c>] (ieee80211_mark_rx_ba_filtered_frames+0xbcc/0x12d4 [mac80211])
[ 239.804776] [<
c7b83e8c>] (ieee80211_mark_rx_ba_filtered_frames [mac80211]) from [<
c7b84d4c>] (ieee80211_rx_napi+0x7b8/0x8c8 [mac8
0211])
[ 239.815857] [<
c7b84d4c>] (ieee80211_rx_napi [mac80211]) from [<
c7f63d7c>] (ath11k_dp_process_rx+0x7bc/0x8c8 [ath11k])
[ 239.827757] [<
c7f63d7c>] (ath11k_dp_process_rx [ath11k]) from [<
c7f5b6c4>] (ath11k_dp_service_srng+0x2c0/0x2e0 [ath11k])
[ 239.838484] [<
c7f5b6c4>] (ath11k_dp_service_srng [ath11k]) from [<
7f55b7dc>] (ath11k_ahb_ext_grp_napi_poll+0x20/0x84 [ath11k_ahb]
)
[ 239.849419] [<
7f55b7dc>] (ath11k_ahb_ext_grp_napi_poll [ath11k_ahb]) from [<
8073ce1c>] (net_rx_action+0xe0/0x28c)
[ 239.860945] [<
8073ce1c>] (net_rx_action) from [<
80324868>] (__do_softirq+0xe4/0x228)
[ 239.871269] [<
80324868>] (__do_softirq) from [<
80324c48>] (irq_exit+0x98/0x108)
[ 239.879080] [<
80324c48>] (irq_exit) from [<
8035c59c>] (__handle_domain_irq+0x90/0xb4)
[ 239.886114] [<
8035c59c>] (__handle_domain_irq) from [<
8030137c>] (gic_handle_irq+0x50/0x94)
[ 239.894100] [<
8030137c>] (gic_handle_irq) from [<
803024c0>] (__irq_svc+0x40/0x74)
Signed-off-by: Seevalamuthu Mariappan <seevalam@codeaurora.org>
Link: https://lore.kernel.org/r/1616163532-3881-1-git-send-email-seevalam@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Takashi Iwai [Wed, 7 Apr 2021 09:57:30 +0000 (11:57 +0200)]
ALSA: hda/realtek: Fix speaker amp setup on Acer Aspire E1
We've got a report about Acer Aspire E1 (PCI SSID 1025:0840) that
loses the speaker output after resume. With the comparison of COEF
dumps, it was identified that the COEF 0x0d bits 0x6000 corresponds to
the speaker amp.
This patch adds the specific quirk for the device to restore the COEF
bits at the codec (re-)initialization.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1183869
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210407095730.12560-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Dave Airlie [Thu, 8 Apr 2021 07:11:09 +0000 (17:11 +1000)]
Merge tag 'amd-drm-fixes-5.12-2021-04-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.12-2021-04-08:
amdgpu:
- DCN3 fix
- Fix CAC setting regression for TOPAZ
- Fix ttm regression
radeon:
- Fix ttm regression
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210408045512.3879-1-alexander.deucher@amd.com
Alex Deucher [Wed, 7 Apr 2021 13:28:23 +0000 (09:28 -0400)]
drm/amdgpu/smu7: fix CAC setting on TOPAZ
We need to enable MC CAC for mclk switching to work.
Fixes: d765129a719f ("drm/amd/pm: correct sclk/mclk dpm enablement")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1561
Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
xinhui pan [Wed, 7 Apr 2021 12:57:50 +0000 (20:57 +0800)]
drm/radeon: Fix size overflow
ttm->num_pages is uint32. Hit overflow when << PAGE_SHIFT directly
Fixes: 230c079fdcf4 ("drm/ttm: make num_pages uint32_t")
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
xinhui pan [Wed, 7 Apr 2021 11:29:39 +0000 (19:29 +0800)]
drm/amdgpu: Fix size overflow
ttm->num_pages is uint32. Hit overflow when << PAGE_SHIFT directly
Fixes: 230c079fdcf4 ("drm/ttm: make num_pages uint32_t")
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Pavel Begunkov [Thu, 8 Apr 2021 00:54:39 +0000 (01:54 +0100)]
io_uring: clear F_REISSUE right after getting it
There are lots of ways r/w request may continue its path after getting
REQ_F_REISSUE, it's not necessarily io-wq and can be, e.g. apoll,
and submitted via io_async_task_func() -> __io_req_task_submit()
Clear the flag right after getting it, so the next attempt is well
prepared regardless how the request will be executed.
Fixes: 230d50d448ac ("io_uring: move reissue into regular IO path")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/11dcead939343f4e27cab0074d34afcab771bfa4.1617842918.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Maciek Borzecki [Tue, 6 Apr 2021 15:02:29 +0000 (17:02 +0200)]
cifs: escape spaces in share names
Commit
653a5efb849a ("cifs: update super_operations to show_devname")
introduced the display of devname for cifs mounts. However, when mounting
a share which has a whitespace in the name, that exact share name is also
displayed in mountinfo. Make sure that all whitespace is escaped.
Signed-off-by: Maciek Borzecki <maciek.borzecki@gmail.com>
CC: <stable@vger.kernel.org> # 5.11+
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>