Linus Torvalds [Thu, 12 Mar 2020 23:19:19 +0000 (16:19 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
"It looks like a decent sized set of fixes, but a lot of these are one
liner off-by-one and similar type changes:
1) Fix netlink header pointer to calcular bad attribute offset
reported to user. From Pablo Neira Ayuso.
2) Don't double clear PHY interrupts when ->did_interrupt is set,
from Heiner Kallweit.
3) Add missing validation of various (devlink, nl802154, fib, etc.)
attributes, from Jakub Kicinski.
4) Missing *pos increments in various netfilter seq_next ops, from
Vasily Averin.
5) Missing break in of_mdiobus_register() loop, from Dajun Jin.
6) Don't double bump tx_dropped in veth driver, from Jiang Lidong.
7) Work around FMAN erratum
A050385, from Madalin Bucur.
8) Make sure ARP header is pulled early enough in bonding driver,
from Eric Dumazet.
9) Do a cond_resched() during multicast processing of ipvlan and
macvlan, from Mahesh Bandewar.
10) Don't attach cgroups to unrelated sockets when in interrupt
context, from Shakeel Butt.
11) Fix tpacket ring state management when encountering unknown GSO
types. From Willem de Bruijn.
12) Fix MDIO bus PHY resume by checking mdio_bus_phy_may_suspend()
only in the suspend context. From Heiner Kallweit"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (112 commits)
net: systemport: fix index check to avoid an array out of bounds access
tc-testing: add ETS scheduler to tdc build configuration
net: phy: fix MDIO bus PM PHY resuming
net: hns3: clear port base VLAN when unload PF
net: hns3: fix RMW issue for VLAN filter switch
net: hns3: fix VF VLAN table entries inconsistent issue
net: hns3: fix "tc qdisc del" failed issue
taprio: Fix sending packets without dequeueing them
net: mvmdio: avoid error message for optional IRQ
net: dsa: mv88e6xxx: Add missing mask of ATU occupancy register
net: memcg: fix lockdep splat in inet_csk_accept()
s390/qeth: implement smarter resizing of the RX buffer pool
s390/qeth: refactor buffer pool code
s390/qeth: use page pointers to manage RX buffer pool
seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number
net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed
net/packet: tpacket_rcv: do not increment ring index on drop
sxgbe: Fix off by one in samsung driver strncpy size arg
net: caif: Add lockdep expression to RCU traversal primitive
MAINTAINERS: remove Sathya Perla as Emulex NIC maintainer
...
Linus Torvalds [Thu, 12 Mar 2020 22:51:26 +0000 (15:51 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"A couple of fixes for old crap in ->atomic_open() instances"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
cifs_atomic_open(): fix double-put on late allocation failure
gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache
Colin Ian King [Thu, 12 Mar 2020 15:04:30 +0000 (15:04 +0000)]
net: systemport: fix index check to avoid an array out of bounds access
Currently the bounds check on index is off by one and can lead to
an out of bounds access on array priv->filters_loc when index is
RXCHK_BRCM_TAG_MAX.
Fixes: bb9051a2b230 ("net: systemport: Add support for WAKE_FILTER")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Thu, 12 Mar 2020 16:51:45 +0000 (17:51 +0100)]
tc-testing: add ETS scheduler to tdc build configuration
add CONFIG_NET_SCH_ETS to 'config', otherwise test suites using this file
to perform a full tdc run will encounter the following warning:
ok 645 e90e - Add ETS qdisc using bands # skipped - "-----> teardown stage" did not complete successfully
Fixes: 82c664b69c8b ("selftests: qdiscs: Add test coverage for ETS Qdisc")
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Thu, 12 Mar 2020 21:25:20 +0000 (22:25 +0100)]
net: phy: fix MDIO bus PM PHY resuming
So far we have the unfortunate situation that mdio_bus_phy_may_suspend()
is called in suspend AND resume path, assuming that function result is
the same. After the original change this is no longer the case,
resulting in broken resume as reported by Geert.
To fix this call mdio_bus_phy_may_suspend() in the suspend path only,
and let the phy_device store the info whether it was suspended by
MDIO bus PM.
Fixes: 503ba7c69610 ("net: phy: Avoid multiple suspends")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Thu, 12 Mar 2020 22:25:20 +0000 (18:25 -0400)]
cifs_atomic_open(): fix double-put on late allocation failure
several iterations of ->atomic_open() calling conventions ago, we
used to need fput() if ->atomic_open() failed at some point after
successful finish_open(). Now (since 2016) it's not needed -
struct file carries enough state to make fput() work regardless
of the point in struct file lifecycle and discarding it on
failure exits in open() got unified. Unfortunately, I'd missed
the fact that we had an instance of ->atomic_open() (cifs one)
that used to need that fput(), as well as the stale comment in
finish_open() demanding such late failure handling. Trivially
fixed...
Fixes: fe9ec8291fca "do_last(): take fput() on error after opening to out:"
Cc: stable@kernel.org # v4.7+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 10 Mar 2020 13:31:41 +0000 (09:31 -0400)]
gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache
with the way fs/namei.c:do_last() had been done, ->atomic_open()
instances needed to recognize the case when existing file got
found with O_EXCL|O_CREAT, either by falling back to finish_no_open()
or failing themselves. gfs2 one didn't.
Fixes: 6d4ade986f9c (GFS2: Add atomic_open support)
Cc: stable@kernel.org # v3.11
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David S. Miller [Thu, 12 Mar 2020 18:38:27 +0000 (11:38 -0700)]
Merge branch 'hns3-fixes'
Huazhong Tan says:
====================
net: hns3: fixes for -net
This series includes several bugfixes for the HNS3 ethernet driver.
[patch 1] fixes an "tc qdisc del" failure.
[patch 2] fixes SW & HW VLAN table not consistent issue.
[patch 3] fixes a RMW issue related to VLAN filter switch.
[patch 4] clears port based VLAN when uploading PF.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 12 Mar 2020 07:11:06 +0000 (15:11 +0800)]
net: hns3: clear port base VLAN when unload PF
Currently, PF missed to clear the port base VLAN for VF when
unload. In this case, the VLAN id will remain in the VLAN
table. This patch fixes it.
Fixes: 92f11ea177cd ("net: hns3: fix set port based VLAN issue for VF")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 12 Mar 2020 07:11:05 +0000 (15:11 +0800)]
net: hns3: fix RMW issue for VLAN filter switch
According to the user manual, the ingress and egress VLAN filter
are configured at the same time. Currently, hclge_init_vlan_config()
and hclge_set_vlan_spoofchk() will both change the VLAN filter
switch. So it's necessary to read the old configuration before
modifying it.
Fixes: 22044f95faa0 ("net: hns3: add support for spoof check setting")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 12 Mar 2020 07:11:04 +0000 (15:11 +0800)]
net: hns3: fix VF VLAN table entries inconsistent issue
Currently, if VF is loaded on the host side, the host doesn't
clear the VF's VLAN table entries when VF removing. In this
case, when doing reset and disabling sriov at the same time the
VLAN device over VF will be removed, but the VLAN table entries
in hardware are remained.
This patch fixes it by asking PF to clear the VLAN table entries for
VF when VF is removing. It also clears the VLAN table full bit
after VF VLAN table entries being cleared.
Fixes: c6075b193462 ("net: hns3: Record VF vlan tables")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonglong Liu [Thu, 12 Mar 2020 07:11:03 +0000 (15:11 +0800)]
net: hns3: fix "tc qdisc del" failed issue
The HNS3 driver supports to configure TC numbers and TC to priority
map via "tc" tool. But when delete the rule, will fail, because
the HNS3 driver needs at least one TC, but the "tc" tool sets TC
number to zero when delete.
This patch makes sure that the TC number is at least one.
Fixes: 30d240dfa2e8 ("net: hns3: Add mqprio hardware offload support in hns3 driver")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vinicius Costa Gomes [Mon, 9 Mar 2020 17:39:53 +0000 (10:39 -0700)]
taprio: Fix sending packets without dequeueing them
There was a bug that was causing packets to be sent to the driver
without first calling dequeue() on the "child" qdisc. And the KASAN
report below shows that sending a packet without calling dequeue()
leads to bad results.
The problem is that when checking the last qdisc "child" we do not set
the returned skb to NULL, which can cause it to be sent to the driver,
and so after the skb is sent, it may be freed, and in some situations a
reference to it may still be in the child qdisc, because it was never
dequeued.
The crash log looks like this:
[ 19.937538] ==================================================================
[ 19.938300] BUG: KASAN: use-after-free in taprio_dequeue_soft+0x620/0x780
[ 19.938968] Read of size 4 at addr
ffff8881128628cc by task swapper/1/0
[ 19.939612]
[ 19.939772] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc3+ #97
[ 19.940397] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qe4
[ 19.941523] Call Trace:
[ 19.941774] <IRQ>
[ 19.941985] dump_stack+0x97/0xe0
[ 19.942323] print_address_description.constprop.0+0x3b/0x60
[ 19.942884] ? taprio_dequeue_soft+0x620/0x780
[ 19.943325] ? taprio_dequeue_soft+0x620/0x780
[ 19.943767] __kasan_report.cold+0x1a/0x32
[ 19.944173] ? taprio_dequeue_soft+0x620/0x780
[ 19.944612] kasan_report+0xe/0x20
[ 19.944954] taprio_dequeue_soft+0x620/0x780
[ 19.945380] __qdisc_run+0x164/0x18d0
[ 19.945749] net_tx_action+0x2c4/0x730
[ 19.946124] __do_softirq+0x268/0x7bc
[ 19.946491] irq_exit+0x17d/0x1b0
[ 19.946824] smp_apic_timer_interrupt+0xeb/0x380
[ 19.947280] apic_timer_interrupt+0xf/0x20
[ 19.947687] </IRQ>
[ 19.947912] RIP: 0010:default_idle+0x2d/0x2d0
[ 19.948345] Code: 00 00 41 56 41 55 65 44 8b 2d 3f 8d 7c 7c 41 54 55 53 0f 1f 44 00 00 e8 b1 b2 c5 fd e9 07 00 3
[ 19.950166] RSP: 0018:
ffff88811a3efda0 EFLAGS:
00000282 ORIG_RAX:
ffffffffffffff13
[ 19.950909] RAX:
0000000080000000 RBX:
ffff88811a3a9600 RCX:
ffffffff8385327e
[ 19.951608] RDX:
1ffff110234752c0 RSI:
0000000000000000 RDI:
ffffffff8385262f
[ 19.952309] RBP:
ffffed10234752c0 R08:
0000000000000001 R09:
ffffed10234752c1
[ 19.953009] R10:
ffffed10234752c0 R11:
ffff88811a3a9607 R12:
0000000000000001
[ 19.953709] R13:
0000000000000001 R14:
0000000000000000 R15:
0000000000000000
[ 19.954408] ? default_idle_call+0x2e/0x70
[ 19.954816] ? default_idle+0x1f/0x2d0
[ 19.955192] default_idle_call+0x5e/0x70
[ 19.955584] do_idle+0x3d4/0x500
[ 19.955909] ? arch_cpu_idle_exit+0x40/0x40
[ 19.956325] ? _raw_spin_unlock_irqrestore+0x23/0x30
[ 19.956829] ? trace_hardirqs_on+0x30/0x160
[ 19.957242] cpu_startup_entry+0x19/0x20
[ 19.957633] start_secondary+0x2a6/0x380
[ 19.958026] ? set_cpu_sibling_map+0x18b0/0x18b0
[ 19.958486] secondary_startup_64+0xa4/0xb0
[ 19.958921]
[ 19.959078] Allocated by task 33:
[ 19.959412] save_stack+0x1b/0x80
[ 19.959747] __kasan_kmalloc.constprop.0+0xc2/0xd0
[ 19.960222] kmem_cache_alloc+0xe4/0x230
[ 19.960617] __alloc_skb+0x91/0x510
[ 19.960967] ndisc_alloc_skb+0x133/0x330
[ 19.961358] ndisc_send_ns+0x134/0x810
[ 19.961735] addrconf_dad_work+0xad5/0xf80
[ 19.962144] process_one_work+0x78e/0x13a0
[ 19.962551] worker_thread+0x8f/0xfa0
[ 19.962919] kthread+0x2ba/0x3b0
[ 19.963242] ret_from_fork+0x3a/0x50
[ 19.963596]
[ 19.963753] Freed by task 33:
[ 19.964055] save_stack+0x1b/0x80
[ 19.964386] __kasan_slab_free+0x12f/0x180
[ 19.964830] kmem_cache_free+0x80/0x290
[ 19.965231] ip6_mc_input+0x38a/0x4d0
[ 19.965617] ipv6_rcv+0x1a4/0x1d0
[ 19.965948] __netif_receive_skb_one_core+0xf2/0x180
[ 19.966437] netif_receive_skb+0x8c/0x3c0
[ 19.966846] br_handle_frame_finish+0x779/0x1310
[ 19.967302] br_handle_frame+0x42a/0x830
[ 19.967694] __netif_receive_skb_core+0xf0e/0x2a90
[ 19.968167] __netif_receive_skb_one_core+0x96/0x180
[ 19.968658] process_backlog+0x198/0x650
[ 19.969047] net_rx_action+0x2fa/0xaa0
[ 19.969420] __do_softirq+0x268/0x7bc
[ 19.969785]
[ 19.969940] The buggy address belongs to the object at
ffff888112862840
[ 19.969940] which belongs to the cache skbuff_head_cache of size 224
[ 19.971202] The buggy address is located 140 bytes inside of
[ 19.971202] 224-byte region [
ffff888112862840,
ffff888112862920)
[ 19.972344] The buggy address belongs to the page:
[ 19.972820] page:
ffffea00044a1800 refcount:1 mapcount:0 mapping:
ffff88811a2bd1c0 index:0xffff8881128625c0 compo0
[ 19.973930] flags: 0x8000000000010200(slab|head)
[ 19.974388] raw:
8000000000010200 ffff88811a2ed650 ffff88811a2ed650 ffff88811a2bd1c0
[ 19.975151] raw:
ffff8881128625c0 0000000000190013 00000001ffffffff 0000000000000000
[ 19.975915] page dumped because: kasan: bad access detected
[ 19.976461] page_owner tracks the page as allocated
[ 19.976946] page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NO)
[ 19.978332] prep_new_page+0x24b/0x330
[ 19.978707] get_page_from_freelist+0x2057/0x2c90
[ 19.979170] __alloc_pages_nodemask+0x218/0x590
[ 19.979619] new_slab+0x9d/0x300
[ 19.979948] ___slab_alloc.constprop.0+0x2f9/0x6f0
[ 19.980421] __slab_alloc.constprop.0+0x30/0x60
[ 19.980870] kmem_cache_alloc+0x201/0x230
[ 19.981269] __alloc_skb+0x91/0x510
[ 19.981620] alloc_skb_with_frags+0x78/0x4a0
[ 19.982043] sock_alloc_send_pskb+0x5eb/0x750
[ 19.982476] unix_stream_sendmsg+0x399/0x7f0
[ 19.982904] sock_sendmsg+0xe2/0x110
[ 19.983262] ____sys_sendmsg+0x4de/0x6d0
[ 19.983660] ___sys_sendmsg+0xe4/0x160
[ 19.984032] __sys_sendmsg+0xab/0x130
[ 19.984396] do_syscall_64+0xe7/0xae0
[ 19.984761] page last free stack trace:
[ 19.985142] __free_pages_ok+0x432/0xbc0
[ 19.985533] qlist_free_all+0x56/0xc0
[ 19.985907] quarantine_reduce+0x149/0x170
[ 19.986315] __kasan_kmalloc.constprop.0+0x9e/0xd0
[ 19.986791] kmem_cache_alloc+0xe4/0x230
[ 19.987182] prepare_creds+0x24/0x440
[ 19.987548] do_faccessat+0x80/0x590
[ 19.987906] do_syscall_64+0xe7/0xae0
[ 19.988276] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 19.988775]
[ 19.988930] Memory state around the buggy address:
[ 19.989402]
ffff888112862780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 19.990111]
ffff888112862800: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[ 19.990822] >
ffff888112862880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 19.991529] ^
[ 19.992081]
ffff888112862900: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
[ 19.992796]
ffff888112862980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
Reported-by: Michael Schmidt <michael.schmidt@eti.uni-siegen.de>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Acked-by: Andre Guedes <andre.guedes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 12 Mar 2020 16:59:36 +0000 (09:59 -0700)]
Merge tag 'for-linus-5.6-2' of git://github.com/cminyard/linux-ipmi
Pull IPMI fix from Corey Minyard:
"Fix a message spew on some system
The call to platform_get_irq() was changed to print a log if the
interrupt was not available, and that was causing bogus messages to
spew out for the IPMI driver. People have requested that this get in
to 5.6 so I'm sending it along"
* tag 'for-linus-5.6-2' of git://github.com/cminyard/linux-ipmi:
ipmi_si: Avoid spurious errors for optional IRQs
Linus Torvalds [Thu, 12 Mar 2020 16:25:55 +0000 (09:25 -0700)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"Fix a build problem with x86/curve25519"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86/curve25519 - support assemblers with no adx support
Chris Packham [Wed, 11 Mar 2020 20:05:46 +0000 (09:05 +1300)]
net: mvmdio: avoid error message for optional IRQ
Per the dt-binding the interrupt is optional so use
platform_get_irq_optional() instead of platform_get_irq(). Since
commit
7723f4c5ecdb ("driver core: platform: Add an error message to
platform_get_irq*()") platform_get_irq() produces an error message
orion-mdio
f1072004.mdio: IRQ index 0 not found
which is perfectly normal if one hasn't specified the optional property
in the device tree.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Wed, 11 Mar 2020 20:02:31 +0000 (21:02 +0100)]
net: dsa: mv88e6xxx: Add missing mask of ATU occupancy register
Only the bottom 12 bits contain the ATU bin occupancy statistics. The
upper bits need masking off.
Fixes: e0c69ca7dfbb ("net: dsa: mv88e6xxx: Add ATU occupancy via devlink resources")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 11 Mar 2020 18:44:26 +0000 (11:44 -0700)]
net: memcg: fix lockdep splat in inet_csk_accept()
Locking newsk while still holding the listener lock triggered
a lockdep splat [1]
We can simply move the memcg code after we release the listener lock,
as this can also help if multiple threads are sharing a common listener.
Also fix a typo while reading socket sk_rmem_alloc.
[1]
WARNING: possible recursive locking detected
5.6.0-rc3-syzkaller #0 Not tainted
--------------------------------------------
syz-executor598/9524 is trying to acquire lock:
ffff88808b5b8b90 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline]
ffff88808b5b8b90 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x69f/0xd30 net/ipv4/inet_connection_sock.c:492
but task is already holding lock:
ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline]
ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x8d/0xd30 net/ipv4/inet_connection_sock.c:445
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(sk_lock-AF_INET6);
lock(sk_lock-AF_INET6);
*** DEADLOCK ***
May be due to missing lock nesting notation
1 lock held by syz-executor598/9524:
#0:
ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline]
#0:
ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x8d/0xd30 net/ipv4/inet_connection_sock.c:445
stack backtrace:
CPU: 0 PID: 9524 Comm: syz-executor598 Not tainted 5.6.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x188/0x20d lib/dump_stack.c:118
print_deadlock_bug kernel/locking/lockdep.c:2370 [inline]
check_deadlock kernel/locking/lockdep.c:2411 [inline]
validate_chain kernel/locking/lockdep.c:2954 [inline]
__lock_acquire.cold+0x114/0x288 kernel/locking/lockdep.c:3954
lock_acquire+0x197/0x420 kernel/locking/lockdep.c:4484
lock_sock_nested+0xc5/0x110 net/core/sock.c:2947
lock_sock include/net/sock.h:1541 [inline]
inet_csk_accept+0x69f/0xd30 net/ipv4/inet_connection_sock.c:492
inet_accept+0xe9/0x7c0 net/ipv4/af_inet.c:734
__sys_accept4_file+0x3ac/0x5b0 net/socket.c:1758
__sys_accept4+0x53/0x90 net/socket.c:1809
__do_sys_accept4 net/socket.c:1821 [inline]
__se_sys_accept4 net/socket.c:1818 [inline]
__x64_sys_accept4+0x93/0xf0 net/socket.c:1818
do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4445c9
Code: e8 0c 0d 03 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007ffc35b37608 EFLAGS:
00000246 ORIG_RAX:
0000000000000120
RAX:
ffffffffffffffda RBX:
0000000000000003 RCX:
00000000004445c9
RDX:
0000000000000000 RSI:
0000000000000000 RDI:
0000000000000003
RBP:
0000000000000000 R08:
0000000000306777 R09:
0000000000306777
R10:
0000000000000000 R11:
0000000000000246 R12:
0000000000000000
R13:
00000000004053d0 R14:
0000000000000000 R15:
0000000000000000
Fixes: d752a4986532 ("net: memcg: late association of sock to memcg")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Mar 2020 06:52:32 +0000 (23:52 -0700)]
Merge branch 's390-qeth-fixes'
Julian Wiedmann says:
====================
s390/qeth: fixes 2020-03-11
please apply the following patch series for qeth to netdev's net tree.
Just one fix to get the RX buffer pool resizing right, with two
preparatory cleanups.
This is on the larger side given where we are in the -rc cycle, but a
big chunk of the delta is just refactoring to make the fix look nice.
I intentionally split these off from yesterday's series. No objections
if you'd rather punt them to net-next, the series should apply cleanly.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 11 Mar 2020 17:07:11 +0000 (18:07 +0100)]
s390/qeth: implement smarter resizing of the RX buffer pool
The RX buffer pool is allocated in qeth_alloc_qdio_queues().
A subsequent pool resizing is then handled in a very simple way:
first free the current pool, then allocate a new pool of the requested
size.
There's two ways where this can go wrong:
1. if the resize action happens _before_ the initial pool was allocated,
then a subsequent initialization will call qeth_alloc_qdio_queues()
and fill the pool with a second(!) set of pages. We consume twice the
planned amount of memory.
This is easy to fix - just skip the resizing if the queues haven't
been allocated yet.
2. if the initial pool was created by qeth_alloc_qdio_queues() but a
subsequent resizing fails, then the device has no(!) RX buffer pool.
The next initialization will _not_ call qeth_alloc_qdio_queues(), and
attempting to back the RX buffers with pages in
qeth_init_qdio_queues() will fail.
Not very difficult to fix either - instead of re-allocating the whole
pool, just allocate/free as many entries to match the desired size.
Fixes: 4a71df50047f ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 11 Mar 2020 17:07:10 +0000 (18:07 +0100)]
s390/qeth: refactor buffer pool code
In preparation for a subsequent fix, split out helpers to allocate/free
individual pool entries.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 11 Mar 2020 17:07:09 +0000 (18:07 +0100)]
s390/qeth: use page pointers to manage RX buffer pool
The RX buffer elements are always backed with full pages, reflect this
in the pointer type.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Lungaroni [Wed, 11 Mar 2020 16:54:06 +0000 (17:54 +0100)]
seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number
The Internet Assigned Numbers Authority (IANA) has recently assigned
a protocol number value of 143 for Ethernet [1].
Before this assignment, encapsulation mechanisms such as Segment Routing
used the IPv6-NoNxt protocol number (59) to indicate that the encapsulated
payload is an Ethernet frame.
In this patch, we add the definition of the Ethernet protocol number to the
kernel headers and update the SRv6 L2 tunnels to use it.
[1] https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml
Signed-off-by: Paolo Lungaroni <paolo.lungaroni@cnit.it>
Reviewed-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Acked-by: Ahmed Abdelsalam <ahmed.abdelsalam@gssi.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Wed, 11 Mar 2020 15:24:24 +0000 (16:24 +0100)]
net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed
By default, DSA drivers should configure CPU and DSA ports to their
maximum speed. In many configurations this is sufficient to make the
link work.
In some cases it is necessary to configure the link to run slower,
e.g. because of limitations of the SoC it is connected to. Or back to
back PHYs are used and the PHY needs to be driven in order to
establish link. In this case, phylink is used.
Only instantiate phylink if it is required. If there is no PHY, or no
fixed link properties, phylink can upset a link which works in the
default configuration.
Fixes: 0e27921816ad ("net: dsa: Use PHYLINK for the CPU/DSA ports")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Mon, 9 Mar 2020 15:34:35 +0000 (11:34 -0400)]
net/packet: tpacket_rcv: do not increment ring index on drop
In one error case, tpacket_rcv drops packets after incrementing the
ring producer index.
If this happens, it does not update tp_status to TP_STATUS_USER and
thus the reader is stalled for an iteration of the ring, causing out
of order arrival.
The only such error path is when virtio_net_hdr_from_skb fails due
to encountering an unknown GSO type.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dominik Czarnota [Mon, 9 Mar 2020 15:22:50 +0000 (16:22 +0100)]
sxgbe: Fix off by one in samsung driver strncpy size arg
This patch fixes an off-by-one error in strncpy size argument in
drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c. The issue is that in:
strncmp(opt, "eee_timer:", 6)
the passed string literal: "eee_timer:" has 10 bytes (without the NULL
byte) and the passed size argument is 6. As a result, the logic will
also accept other, malformed strings, e.g. "eee_tiXXX:".
This bug doesn't seem to have any security impact since its present in
module's cmdline parsing code.
Signed-off-by: Dominik Czarnota <dominik.b.czarnota@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amol Grover [Thu, 12 Mar 2020 05:34:20 +0000 (11:04 +0530)]
net: caif: Add lockdep expression to RCU traversal primitive
caifdevs->list is traversed using list_for_each_entry_rcu()
outside an RCU read-side critical section but under the
protection of rtnl_mutex. Hence, add the corresponding lockdep
expression to silence the following false-positive warning:
[ 10.868467] =============================
[ 10.869082] WARNING: suspicious RCU usage
[ 10.869817]
5.6.0-rc1-00177-g06ec0a154aae4 #1 Not tainted
[ 10.870804] -----------------------------
[ 10.871557] net/caif/caif_dev.c:115 RCU-list traversed in non-reader section!!
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Amol Grover <frextrite@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 11 Mar 2020 23:37:02 +0000 (16:37 -0700)]
MAINTAINERS: remove Sathya Perla as Emulex NIC maintainer
Remove Sathya Perla, sathya.perla@broadcom.com is bouncing.
The driver has 3 more maintainers.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 11 Mar 2020 03:36:16 +0000 (20:36 -0700)]
net: fec: validate the new settings in fec_enet_set_coalesce()
fec_enet_set_coalesce() validates the previously set params
and if they are within range proceeds to apply the new ones.
The new ones, however, are not validated. This seems backwards,
probably a copy-paste error?
Compile tested only.
Fixes: d851b47b22fc ("net: fec: add interrupt coalescence feature support")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Takashi Iwai [Wed, 5 Feb 2020 09:31:46 +0000 (10:31 +0100)]
ipmi_si: Avoid spurious errors for optional IRQs
Although the IRQ assignment in ipmi_si driver is optional,
platform_get_irq() spews error messages unnecessarily:
ipmi_si dmi-ipmi-si.0: IRQ index 0 not found
Fix this by switching to platform_get_irq_optional().
Cc: stable@vger.kernel.org # 5.4.x
Cc: John Donnelly <john.p.donnelly@oracle.com>
Fixes: 7723f4c5ecdb ("driver core: platform: Add an error message to platform_get_irq*()")
Reported-and-tested-by: Patrick Vo <patrick.vo@hpe.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Message-Id: <
20200205093146.1352-1-tiwai@suse.de>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Nathan Chancellor [Tue, 10 Mar 2020 22:06:54 +0000 (15:06 -0700)]
dpaa_eth: Remove unnecessary boolean expression in dpaa_get_headroom
Clang warns:
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:2860:9: warning:
converting the result of '?:' with integer constants to a boolean always
evaluates to 'true' [-Wtautological-constant-compare]
return DPAA_FD_DATA_ALIGNMENT ? ALIGN(headroom,
^
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:131:34: note: expanded
from macro 'DPAA_FD_DATA_ALIGNMENT'
\#define DPAA_FD_DATA_ALIGNMENT (fman_has_errata_a050385() ? 64 : 16)
^
1 warning generated.
This was exposed by commit
3c68b8fffb48 ("dpaa_eth: FMan erratum
A050385
workaround") even though it appears to have been an issue since the
introductory commit
9ad1a3749333 ("dpaa_eth: add support for DPAA
Ethernet") since DPAA_FD_DATA_ALIGNMENT has never been able to be zero.
Just replace the whole boolean expression with the true branch, as it is
always been true.
Link: https://github.com/ClangBuiltLinux/linux/issues/928
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 11 Mar 2020 20:35:34 +0000 (13:35 -0700)]
Merge tag 'fscrypt-for-linus' of git://git./fs/fscrypt/fscrypt
Pull fscrypt fix from Eric Biggers:
"Fix a bug where if userspace is writing to encrypted files while the
FS_IOC_REMOVE_ENCRYPTION_KEY ioctl (introduced in v5.4) is running,
dirty inodes could be evicted, causing writes could be lost or the
filesystem to hang due to a use-after-free. This was encountered
during real-world use, not just theoretical.
Tested with the existing fscrypt xfstests, and with a new xfstest I
wrote to reproduce this bug. This fix does expose an existing bug with
'-o lazytime' that Ted is working on fixing, but this fix is more
critical and needed anyway regardless of the lazytime fix"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fscrypt: don't evict dirty inodes after removing key
David S. Miller [Wed, 11 Mar 2020 19:29:03 +0000 (12:29 -0700)]
Merge tag 'mac80211-for-net-2020-03-11' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A couple of fixes:
* three netlink validation fixes
* a mesh path selection fix
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 11 Mar 2020 17:00:41 +0000 (10:00 -0700)]
Merge tag 'for-linus-2020-03-10' of git://git./linux/kernel/git/brauner/linux
Pull thread fix from Christian Brauner:
"This contains a single fix for a regression which was introduced when
we introduced the ability to select a specific pid at process creation
time.
When this feature is requested, the error value will be set to -EPERM
after exiting the pid allocation loop. This caused EPERM to be
returned when e.g. the init process/child subreaper of the pid
namespace has already died where we used to return ENOMEM before.
The first patch here simply fixes the regression by unconditionally
setting the return value back to ENOMEM again once we've successfully
allocated the requested pid number. This should be easy to backport to
v5.5.
The second patch adds a comment explaining that we must keep returning
ENOMEM since we've been doing it for a long time and have explicitly
documented this behavior for userspace. This seemed worthwhile because
we now have at least two separate example where people tried to change
the return value to something other than ENOMEM (The first version of
the regression fix did that too and the commit message links to an
earlier patch that tried to do the same.).
I have a simple regression test to make sure we catch this regression
in the future but since that introduces a whole new selftest subdir
and test files I'll keep this for v5.7"
* tag 'for-linus-2020-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
pid: make ENOMEM return value more obvious
pid: Fix error return value in some cases
Linus Torvalds [Wed, 11 Mar 2020 16:54:59 +0000 (09:54 -0700)]
Merge tag 'trace-v5.6-rc4' of git://git./linux/kernel/git/rostedt/linux-trace
Pull ftrace fix from Steven Rostedt:
"Have ftrace lookup_rec() return a consistent record otherwise it can
break live patching"
* tag 'trace-v5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Return the first found result in lookup_rec()
Linus Torvalds [Wed, 11 Mar 2020 16:49:47 +0000 (09:49 -0700)]
Merge tag 'mips_fixes_5.6.1' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
"A few MIPS fixes:
- DT fixes for CI20
- Fix command line handling
- Correct patchwork URL"
* tag 'mips_fixes_5.6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MAINTAINERS: Correct MIPS patchwork URL
MIPS: DTS: CI20: fix interrupt for pcf8563 RTC
MIPS: DTS: CI20: fix PMU definitions for ACT8600
MIPS: Fix CONFIG_MIPS_CMDLINE_DTB_EXTEND handling
Linus Torvalds [Wed, 11 Mar 2020 16:45:38 +0000 (09:45 -0700)]
Merge tag 'pinctrl-v5.6-2' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Some pin control fixes for the v5.6 series.
It comes down to memory leaks in the core and driver fixes. Some
should have been sent earlier but they kept piling up and the world is
just so full of distractions these days.
- Fix some inverted pins in the Meson GLX driver.
- Align the i.MX SC message structs causing warnings from KASan.
- Balance the kref in pinctrl hogs so they are actually free:d when
removing a pin control module. We haven't seen it before as people
don't use modules for pin control that much, I think.
- Add a missing call to pinctrl_unregister_mappings() another memory
leak when using modules.
- Fix the fwspec parsing in the Qualcomm driver.
- Fix a syntax error in the Falcon driver.
- Assign .irq_eoi conditionally in the Qualcomm driver, fixing a bug
affecting elder Qualcomm platforms"
* tag 'pinctrl-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: qcom: Assign irq_eoi conditionally
pinctrl: falcon: fix syntax error
pinctrl: qcom: ssbi-gpio: Fix fwspec parsing bug
pinctrl: madera: Add missing call to pinctrl_unregister_mappings
pinctrl: core: Remove extra kref_get which blocks hogs being freed
pinctrl: imx: scu: Align imx sc msg structs to 4
pinctrl: meson-gxl: fix GPIOX sdio pins
Christoph Hellwig [Wed, 11 Mar 2020 16:07:10 +0000 (17:07 +0100)]
driver code: clarify and fix platform device DMA mask allocation
This does three inter-related things to clarify the usage of the
platform device dma_mask field. In the process, fix the bug introduced
by
cdfee5623290 ("driver core: initialize a default DMA mask for
platform device") that caused Artem Tashkinov's laptop to not boot with
newer Fedora kernels.
This does:
- First off, rename the field to "platform_dma_mask" to make it
greppable.
We have way too many different random fields called "dma_mask" in
various data structures, where some of them are actual masks, and
some of them are just pointers to the mask. And the structures all
have pointers to each other, or embed each other inside themselves,
and "pdev" sometimes means "platform device" and sometimes it means
"PCI device".
So to make it clear in the code when you actually use this new field,
give it a unique name (it really should be something even more unique
like "platform_device_dma_mask", since it's per platform device, not
per platform, but that gets old really fast, and this is unique
enough in context).
To further clarify when the field gets used, initialize it when we
actually start using it with the default value.
- Then, use this field instead of the random one-off allocation in
platform_device_register_full() that is now unnecessary since we now
already have a perfectly fine allocation for it in the platform
device structure.
- The above then allows us to fix the actual bug, where the error path
of platform_device_register_full() would unconditionally free the
platform device DMA allocation with 'kfree()'.
That kfree() was dont regardless of whether the allocation had been
done earlier with the (now removed) kmalloc, or whether
setup_pdev_dma_masks() had already been used and the dma_mask pointer
pointed to the mask that was part of the platform device.
It seems most people never triggered the error path, or only triggered
it from a call chain that set an explicit pdevinfo->dma_mask value (and
thus caused the unnecessary allocation that was "cleaned up" in the
error path) before calling platform_device_register_full().
Robin Murphy points out that in Artem's case the wdat_wdt driver failed
in platform_device_add(), and that was the one that had called
platform_device_register_full() with pdevinfo.dma_mask = 0, and would
have caused that kfree() of pdev.dma_mask corrupting the heap.
A later unrelated kmalloc() then oopsed due to the heap corruption.
Fixes: cdfee5623290 ("driver core: initialize a default DMA mask for platform device")
Reported-bisected-and-tested-by: Artem S. Tashkinov <aros@gmx.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Artem Savkov [Fri, 6 Mar 2020 17:43:17 +0000 (18:43 +0100)]
ftrace: Return the first found result in lookup_rec()
It appears that ip ranges can overlap so. In that case lookup_rec()
returns whatever results it got last even if it found nothing in last
searched page.
This breaks an obscure livepatch late module patching usecase:
- load livepatch
- load the patched module
- unload livepatch
- try to load livepatch again
To fix this return from lookup_rec() as soon as it found the record
containing searched-for ip. This used to be this way prior lookup_rec()
introduction.
Link: http://lkml.kernel.org/r/20200306174317.21699-1-asavkov@redhat.com
Cc: stable@vger.kernel.org
Fixes: 7e16f581a817 ("ftrace: Separate out functionality from ftrace_location_range()")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Nicolas Cavallari [Thu, 5 Mar 2020 14:04:09 +0000 (15:04 +0100)]
mac80211: Do not send mesh HWMP PREQ if HWMP is disabled
When trying to transmit to an unknown destination, the mesh code would
unconditionally transmit a HWMP PREQ even if HWMP is not the current
path selection algorithm.
Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
Link: https://lore.kernel.org/r/20200305140409.12204-1-cavallar@lri.fr
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Jakub Kicinski [Tue, 3 Mar 2020 05:10:58 +0000 (21:10 -0800)]
nl80211: add missing attribute validation for channel switch
Add missing attribute validation for NL80211_ATTR_OPER_CLASS
to the netlink policy.
Fixes: 1057d35ede5d ("cfg80211: introduce TDLS channel switch commands")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20200303051058.4089398-4-kuba@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Jakub Kicinski [Tue, 3 Mar 2020 05:10:57 +0000 (21:10 -0800)]
nl80211: add missing attribute validation for beacon report scanning
Add missing attribute validation for beacon report scanning
to the netlink policy.
Fixes: 1d76250bd34a ("nl80211: support beacon report scanning")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20200303051058.4089398-3-kuba@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Jakub Kicinski [Tue, 3 Mar 2020 05:10:56 +0000 (21:10 -0800)]
nl80211: add missing attribute validation for critical protocol indication
Add missing attribute validation for critical protocol fields
to the netlink policy.
Fixes: 5de17984898c ("cfg80211: introduce critical protocol indication from user-space")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20200303051058.4089398-2-kuba@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
David S. Miller [Tue, 10 Mar 2020 23:07:49 +0000 (16:07 -0700)]
Merge branch 's390-qeth-fixes'
Julian Wiedmann says:
====================
s390/qeth: fixes 2020-03-10
This fixes three minor issues:
1) a setup parameter gets cleared unnecessarily when the HW config
changes,
2) insufficient error handling when initially filling the RX ring, and
3) a rarely used worker that needs to be cancelled during tear down.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 10 Mar 2020 17:38:03 +0000 (18:38 +0100)]
s390/qeth: cancel RX reclaim work earlier
When qeth's napi poll code fails to refill an entirely empty RX ring, it
kicks off buffer_reclaim_work to try again later.
Make sure that this worker is cancelled when setting the qeth device
offline. Otherwise a RX refill action can unexpectedly end up running
concurrently to bigger re-configurations (eg. resizing the buffer pool),
without any locking.
Fixes: b333293058aa ("qeth: add support for af_iucv HiperSockets transport")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 10 Mar 2020 17:38:02 +0000 (18:38 +0100)]
s390/qeth: handle error when backing RX buffer
qeth_init_qdio_queues() fills the RX ring with an initial set of
RX buffers. If qeth_init_input_buffer() fails to back one of the RX
buffers with memory, we need to bail out and report the error.
Fixes: 4a71df50047f ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 10 Mar 2020 17:38:01 +0000 (18:38 +0100)]
s390/qeth: don't reset default_out_queue
When an OSA device in prio-queue setup is reduced to 1 TX queue due to
HW restrictions, we reset its the default_out_queue to 0.
In the old code this was needed so that qeth_get_priority_queue() gets
the queue selection right. But with proper multiqueue support we already
reduced dev->real_num_tx_queues to 1, and so the stack puts all traffic
on txq 0 without even calling .ndo_select_queue.
Thus we can preserve the user's configuration, and apply it if the OSA
device later re-gains support for multiple TX queues.
Fixes: 73dc2daf110f ("s390/qeth: add TX multiqueue support for OSA devices")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 10 Mar 2020 22:59:32 +0000 (15:59 -0700)]
Merge branch 'MACSec-bugfixes-related-to-MAC-address-change'
Igor Russkikh says:
====================
MACSec bugfixes related to MAC address change
We found out that there's an issue in MACSec code when the MAC address
is changed.
Both s/w and offloaded implementations don't update SCI when the MAC
address changes at the moment, but they should do so, because SCI contains
MAC in its first 6 octets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Tue, 10 Mar 2020 15:22:25 +0000 (18:22 +0300)]
net: macsec: invoke mdo_upd_secy callback when mac address changed
Notify the offload engine about MAC address change to reconfigure it
accordingly.
Fixes: 3cf3227a21d1 ("net: macsec: hardware offloading infrastructure")
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Tue, 10 Mar 2020 15:22:24 +0000 (18:22 +0300)]
net: macsec: update SCI upon MAC address change.
SCI should be updated, because it contains MAC in its first 6 octets.
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Juliet Kim [Tue, 10 Mar 2020 14:23:58 +0000 (09:23 -0500)]
ibmvnic: Do not process device remove during device reset
The ibmvnic driver does not check the device state when the device
is removed. If the device is removed while a device reset is being
processed, the remove may free structures needed by the reset,
causing an oops.
Fix this by checking the device state before processing device remove.
Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Tue, 10 Mar 2020 08:33:30 +0000 (09:33 +0100)]
net/smc: cancel event worker during device removal
During IB device removal, cancel the event worker before the device
structure is freed.
Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 10 Mar 2020 07:27:37 +0000 (15:27 +0800)]
ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface
Rafał found an issue that for non-Ethernet interface, if we down and up
frequently, the memory will be consumed slowly.
The reason is we add allnodes/allrouters addressed in multicast list in
ipv6_add_dev(). When link down, we call ipv6_mc_down(), store all multicast
addresses via mld_add_delrec(). But when link up, we don't call ipv6_mc_up()
for non-Ethernet interface to remove the addresses. This makes idev->mc_tomb
getting bigger and bigger. The call stack looks like:
addrconf_notify(NETDEV_REGISTER)
ipv6_add_dev
ipv6_dev_mc_inc(ff01::1)
ipv6_dev_mc_inc(ff02::1)
ipv6_dev_mc_inc(ff02::2)
addrconf_notify(NETDEV_UP)
addrconf_dev_config
/* Alas, we support only Ethernet autoconfiguration. */
return;
addrconf_notify(NETDEV_DOWN)
addrconf_ifdown
ipv6_mc_down
igmp6_group_dropped(ff02::2)
mld_add_delrec(ff02::2)
igmp6_group_dropped(ff02::1)
igmp6_group_dropped(ff01::1)
After investigating, I can't found a rule to disable multicast on
non-Ethernet interface. In RFC2460, the link could be Ethernet, PPP, ATM,
tunnels, etc. In IPv4, it doesn't check the dev type when calls ip_mc_up()
in inetdev_event(). Even for IPv6, we don't check the dev type and call
ipv6_add_dev(), ipv6_dev_mc_inc() after register device.
So I think it's OK to fix this memory consumer by calling ipv6_mc_up() for
non-Ethernet interface.
v2: Also check IFF_MULTICAST flag to make sure the interface supports
multicast
Reported-by: Rafał Miłecki <zajec5@gmail.com>
Tested-by: Rafał Miłecki <zajec5@gmail.com>
Fixes: 74235a25c673 ("[IPV6] addrconf: Fix IPv6 on tuntap tunnels")
Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 10 Mar 2020 22:36:27 +0000 (15:36 -0700)]
Merge tag 'clang-format-for-linus-v5.6-rc6' of git://github.com/ojeda/linux
Pull clang-format update from Miguel Ojeda:
"Another update for the .clang-format macro list
It has been a while since the last time I sent one!"
* tag 'clang-format-for-linus-v5.6-rc6' of git://github.com/ojeda/linux:
clang-format: Update with the latest for_each macro list
Shakeel Butt [Tue, 10 Mar 2020 05:16:06 +0000 (22:16 -0700)]
net: memcg: late association of sock to memcg
If a TCP socket is allocated in IRQ context or cloned from unassociated
(i.e. not associated to a memcg) in IRQ context then it will remain
unassociated for its whole life. Almost half of the TCPs created on the
system are created in IRQ context, so, memory used by such sockets will
not be accounted by the memcg.
This issue is more widespread in cgroup v1 where network memory
accounting is opt-in but it can happen in cgroup v2 if the source socket
for the cloning was created in root memcg.
To fix the issue, just do the association of the sockets at the accept()
time in the process context and then force charge the memory buffer
already used and reserved by the socket.
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shakeel Butt [Tue, 10 Mar 2020 05:16:05 +0000 (22:16 -0700)]
cgroup: memcg: net: do not associate sock with unrelated cgroup
We are testing network memory accounting in our setup and noticed
inconsistent network memory usage and often unrelated cgroups network
usage correlates with testing workload. On further inspection, it
seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in
irq context specially for cgroup v1.
mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context
and kind of assumes that this can only happen from sk_clone_lock()
and the source sock object has already associated cgroup. However in
cgroup v1, where network memory accounting is opt-in, the source sock
can be unassociated with any cgroup and the new cloned sock can get
associated with unrelated interrupted cgroup.
Cgroup v2 can also suffer if the source sock object was created by
process in the root cgroup or if sk_alloc() is called in irq context.
The fix is to just do nothing in interrupt.
WARNING: Please note that about half of the TCP sockets are allocated
from the IRQ context, so, memory used by such sockets will not be
accouted by the memcg.
The stack trace of mem_cgroup_sk_alloc() from IRQ-context:
CPU: 70 PID: 12720 Comm: ssh Tainted: 5.6.0-smp-DEV #1
Hardware name: ...
Call Trace:
<IRQ>
dump_stack+0x57/0x75
mem_cgroup_sk_alloc+0xe9/0xf0
sk_clone_lock+0x2a7/0x420
inet_csk_clone_lock+0x1b/0x110
tcp_create_openreq_child+0x23/0x3b0
tcp_v6_syn_recv_sock+0x88/0x730
tcp_check_req+0x429/0x560
tcp_v6_rcv+0x72d/0xa40
ip6_protocol_deliver_rcu+0xc9/0x400
ip6_input+0x44/0xd0
? ip6_protocol_deliver_rcu+0x400/0x400
ip6_rcv_finish+0x71/0x80
ipv6_rcv+0x5b/0xe0
? ip6_sublist_rcv+0x2e0/0x2e0
process_backlog+0x108/0x1e0
net_rx_action+0x26b/0x460
__do_softirq+0x104/0x2a6
do_softirq_own_stack+0x2a/0x40
</IRQ>
do_softirq.part.19+0x40/0x50
__local_bh_enable_ip+0x51/0x60
ip6_finish_output2+0x23d/0x520
? ip6table_mangle_hook+0x55/0x160
__ip6_finish_output+0xa1/0x100
ip6_finish_output+0x30/0xd0
ip6_output+0x73/0x120
? __ip6_finish_output+0x100/0x100
ip6_xmit+0x2e3/0x600
? ipv6_anycast_cleanup+0x50/0x50
? inet6_csk_route_socket+0x136/0x1e0
? skb_free_head+0x1e/0x30
inet6_csk_xmit+0x95/0xf0
__tcp_transmit_skb+0x5b4/0xb20
__tcp_send_ack.part.60+0xa3/0x110
tcp_send_ack+0x1d/0x20
tcp_rcv_state_process+0xe64/0xe80
? tcp_v6_connect+0x5d1/0x5f0
tcp_v6_do_rcv+0x1b1/0x3f0
? tcp_v6_do_rcv+0x1b1/0x3f0
__release_sock+0x7f/0xd0
release_sock+0x30/0xa0
__inet_stream_connect+0x1c3/0x3b0
? prepare_to_wait+0xb0/0xb0
inet_stream_connect+0x3b/0x60
__sys_connect+0x101/0x120
? __sys_getsockopt+0x11b/0x140
__x64_sys_connect+0x1a/0x20
do_syscall_64+0x51/0x200
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The stack trace of mem_cgroup_sk_alloc() from IRQ-context:
Fixes: 2d7580738345 ("mm: memcontrol: consolidate cgroup socket tracking")
Fixes: d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 10 Mar 2020 22:32:57 +0000 (15:32 -0700)]
Merge tag 'auxdisplay-for-linus-v5.6-rc6' of git://github.com/ojeda/linux
Pull auxdisplay updates from Miguel Ojeda:
"A few minor auxdisplay improvements:
- charlcd: replace zero-length array with flexible-array member
(kernel-wide cleanup by Gustavo A. R. Silva)
- img-ascii-lcd: convert to devm_platform_ioremap_resource (Yangtao
Li)
- Fix Kconfig indentation (Krzysztof Kozlowski)
* tag 'auxdisplay-for-linus-v5.6-rc6' of git://github.com/ojeda/linux:
auxdisplay: charlcd: replace zero-length array with flexible-array member
auxdisplay: img-ascii-lcd: convert to devm_platform_ioremap_resource
auxdisplay: Fix Kconfig indentation
Jakub Kicinski [Tue, 10 Mar 2020 03:11:42 +0000 (20:11 -0700)]
MAINTAINERS: update cxgb4vf maintainer to Vishal
Casey Leedomn <leedom@chelsio.com> is bouncing,
Vishal indicated he's happy to take the role.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 10 Mar 2020 22:05:45 +0000 (15:05 -0700)]
Merge branch 'for-5.6-fixes' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- cgroup.procs listing related fixes.
It didn't interlock properly with exiting tasks leaving a short
window where a cgroup has empty cgroup.procs but still can't be
removed and misbehaved on short reads.
- psi_show() crash fix on 32bit ino archs
- Empty release_agent handling fix
* 'for-5.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup1: don't call release_agent when it is ""
cgroup: fix psi_show() crash on 32bit ino archs
cgroup: Iterate tasks that did not finish do_exit()
cgroup: cgroup_procs_next should increase position index
cgroup-v1: cgroup_pidlist_next should update position index
Linus Torvalds [Tue, 10 Mar 2020 21:48:22 +0000 (14:48 -0700)]
Merge branch 'for-5.6-fixes' of git://git./linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
"Workqueue has been incorrectly round-robining per-cpu work items.
Hillf's patch fixes that.
The other patch documents memory-ordering properties of workqueue
operations"
* 'for-5.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: don't use wq_select_unbound_cpu() for bound works
workqueue: Document (some) memory-ordering properties of {queue,schedule}_work()
Hillf Danton [Sat, 25 Jan 2020 01:14:45 +0000 (20:14 -0500)]
workqueue: don't use wq_select_unbound_cpu() for bound works
wq_select_unbound_cpu() is designed for unbound workqueues only, but
it's wrongly called when using a bound workqueue too.
Fixing this ensures work queued to a bound workqueue with
cpu=WORK_CPU_UNBOUND always runs on the local CPU.
Before, that would happen only if wq_unbound_cpumask happened to include
it (likely almost always the case), or was empty, or we got lucky with
forced round-robin placement. So restricting
/sys/devices/virtual/workqueue/cpumask to a small subset of a machine's
CPUs would cause some bound work items to run unexpectedly there.
Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs")
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Hillf Danton <hdanton@sina.com>
[dj: massage changelog]
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
David S. Miller [Tue, 10 Mar 2020 02:08:43 +0000 (19:08 -0700)]
Merge tag 'batadv-net-for-davem-
20200306' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here is a batman-adv bugfix:
- Don't schedule OGM for disabled interface, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 10 Mar 2020 01:28:18 +0000 (03:28 +0200)]
net: mscc: ocelot: properly account for VLAN header length when setting MRU
What the driver writes into MAC_MAXLEN_CFG does not actually represent
VLAN_ETH_FRAME_LEN but instead ETH_FRAME_LEN + ETH_FCS_LEN. Yes they are
numerically equal, but the difference is important, as the switch treats
VLAN-tagged traffic specially and knows to increase the maximum accepted
frame size automatically. So it is always wrong to account for VLAN in
the MAC_MAXLEN_CFG register.
Unconditionally increase the maximum allowed frame size for
double-tagged traffic. Accounting for the additional length does not
mean that the other VLAN membership checks aren't performed, so there's
no harm done.
Also, stop abusing the MTU name for configuring the MRU. There is no
support for configuring the MRU on an interface at the moment.
Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Fixes: fa914e9c4d94 ("net: mscc: ocelot: create a helper for changing the port MTU")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 10 Mar 2020 01:22:58 +0000 (18:22 -0700)]
ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast()
Commit
e18b353f102e ("ipvlan: add cond_resched_rcu() while
processing muticast backlog") added a cond_resched_rcu() in a loop
using rcu protection to iterate over slaves.
This is breaking rcu rules, so lets instead use cond_resched()
at a point we can reschedule
Fixes: e18b353f102e ("ipvlan: add cond_resched_rcu() while processing muticast backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Yakunin [Thu, 5 Mar 2020 14:45:57 +0000 (17:45 +0300)]
cgroup, netclassid: periodically release file_lock on classid updating
In our production environment we have faced with problem that updating
classid in cgroup with heavy tasks cause long freeze of the file tables
in this tasks. By heavy tasks we understand tasks with many threads and
opened sockets (e.g. balancers). This freeze leads to an increase number
of client timeouts.
This patch implements following logic to fix this issue:
аfter iterating 1000 file descriptors file table lock will be released
thus providing a time gap for socket creation/deletion.
Now update is non atomic and socket may be skipped using calls:
dup2(oldfd, newfd);
close(oldfd);
But this case is not typical. Moreover before this patch skip is possible
too by hiding socket fd in unix socket buffer.
New sockets will be allocated with updated classid because cgroup state
is updated before start of the file descriptors iteration.
So in common cases this patch has no side effects.
Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Mon, 9 Mar 2020 22:57:07 +0000 (15:57 -0700)]
macvlan: add cond_resched() during multicast processing
The Rx bound multicast packets are deferred to a workqueue and
macvlan can also suffer from the same attack that was discovered
by Syzbot for IPvlan. This solution is not as effective as in
IPvlan. IPvlan defers all (Tx and Rx) multicast packet processing
to a workqueue while macvlan does this way only for the Rx. This
fix should address the Rx codition to certain extent.
Tx is still suseptible. Tx multicast processing happens when
.ndo_start_xmit is called, hence we cannot add cond_resched().
However, it's not that severe since the user which is generating
/ flooding will be affected the most.
Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue")
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Mon, 9 Mar 2020 22:57:02 +0000 (15:57 -0700)]
ipvlan: add cond_resched_rcu() while processing muticast backlog
If there are substantial number of slaves created as simulated by
Syzbot, the backlog processing could take much longer and result
into the issue found in the Syzbot report.
INFO: rcu_sched detected stalls on CPUs/tasks:
(detected by 1, t=10502 jiffies, g=5049, c=5048, q=752)
All QSes seen, last rcu_sched kthread activity 10502 (
4294965563-
4294955061), jiffies_till_next_fqs=1, root ->qsmask 0x0
syz-executor.1 R running task on cpu 1 10984 11210 3866 0x30020008
179034491270
Call Trace:
<IRQ>
[<
ffffffff81497163>] _sched_show_task kernel/sched/core.c:8063 [inline]
[<
ffffffff81497163>] _sched_show_task.cold+0x2fd/0x392 kernel/sched/core.c:8030
[<
ffffffff8146a91b>] sched_show_task+0xb/0x10 kernel/sched/core.c:8073
[<
ffffffff815c931b>] print_other_cpu_stall kernel/rcu/tree.c:1577 [inline]
[<
ffffffff815c931b>] check_cpu_stall kernel/rcu/tree.c:1695 [inline]
[<
ffffffff815c931b>] __rcu_pending kernel/rcu/tree.c:3478 [inline]
[<
ffffffff815c931b>] rcu_pending kernel/rcu/tree.c:3540 [inline]
[<
ffffffff815c931b>] rcu_check_callbacks.cold+0xbb4/0xc29 kernel/rcu/tree.c:2876
[<
ffffffff815e3962>] update_process_times+0x32/0x80 kernel/time/timer.c:1635
[<
ffffffff816164f0>] tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:161
[<
ffffffff81616ae4>] tick_sched_timer+0x44/0x130 kernel/time/tick-sched.c:1193
[<
ffffffff815e75f7>] __run_hrtimer kernel/time/hrtimer.c:1393 [inline]
[<
ffffffff815e75f7>] __hrtimer_run_queues+0x307/0xd90 kernel/time/hrtimer.c:1455
[<
ffffffff815e90ea>] hrtimer_interrupt+0x2ea/0x730 kernel/time/hrtimer.c:1513
[<
ffffffff844050f4>] local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1031 [inline]
[<
ffffffff844050f4>] smp_apic_timer_interrupt+0x144/0x5e0 arch/x86/kernel/apic/apic.c:1056
[<
ffffffff84401cbe>] apic_timer_interrupt+0x8e/0xa0 arch/x86/entry/entry_64.S:778
RIP: 0010:do_raw_read_lock+0x22/0x80 kernel/locking/spinlock_debug.c:153
RSP: 0018:
ffff8801dad07ab8 EFLAGS:
00000a02 ORIG_RAX:
ffffffffffffff12
RAX:
0000000000000000 RBX:
ffff8801c4135680 RCX:
0000000000000000
RDX:
1ffff10038826afe RSI:
ffff88019d816bb8 RDI:
ffff8801c41357f0
RBP:
ffff8801dad07ac0 R08:
0000000000004b15 R09:
0000000000310273
R10:
ffff88019d816bb8 R11:
0000000000000001 R12:
ffff8801c41357e8
R13:
0000000000000000 R14:
ffff8801cfb19850 R15:
ffff8801cfb198b0
[<
ffffffff8101460e>] __raw_read_lock_bh include/linux/rwlock_api_smp.h:177 [inline]
[<
ffffffff8101460e>] _raw_read_lock_bh+0x3e/0x50 kernel/locking/spinlock.c:240
[<
ffffffff840d78ca>] ipv6_chk_mcast_addr+0x11a/0x6f0 net/ipv6/mcast.c:1006
[<
ffffffff84023439>] ip6_mc_input+0x319/0x8e0 net/ipv6/ip6_input.c:482
[<
ffffffff840211c8>] dst_input include/net/dst.h:449 [inline]
[<
ffffffff840211c8>] ip6_rcv_finish+0x408/0x610 net/ipv6/ip6_input.c:78
[<
ffffffff840214de>] NF_HOOK include/linux/netfilter.h:292 [inline]
[<
ffffffff840214de>] NF_HOOK include/linux/netfilter.h:286 [inline]
[<
ffffffff840214de>] ipv6_rcv+0x10e/0x420 net/ipv6/ip6_input.c:278
[<
ffffffff83a29efa>] __netif_receive_skb_one_core+0x12a/0x1f0 net/core/dev.c:5303
[<
ffffffff83a2a15c>] __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:5417
[<
ffffffff83a2f536>] process_backlog+0x216/0x6c0 net/core/dev.c:6243
[<
ffffffff83a30d1b>] napi_poll net/core/dev.c:6680 [inline]
[<
ffffffff83a30d1b>] net_rx_action+0x47b/0xfb0 net/core/dev.c:6748
[<
ffffffff846002c8>] __do_softirq+0x2c8/0x99a kernel/softirq.c:317
[<
ffffffff813e656a>] invoke_softirq kernel/softirq.c:399 [inline]
[<
ffffffff813e656a>] irq_exit+0x16a/0x1a0 kernel/softirq.c:439
[<
ffffffff84405115>] exiting_irq arch/x86/include/asm/apic.h:561 [inline]
[<
ffffffff84405115>] smp_apic_timer_interrupt+0x165/0x5e0 arch/x86/kernel/apic/apic.c:1058
[<
ffffffff84401cbe>] apic_timer_interrupt+0x8e/0xa0 arch/x86/entry/entry_64.S:778
</IRQ>
RIP: 0010:__sanitizer_cov_trace_pc+0x26/0x50 kernel/kcov.c:102
RSP: 0018:
ffff880196033bd8 EFLAGS:
00000246 ORIG_RAX:
ffffffffffffff12
RAX:
ffff88019d8161c0 RBX:
00000000ffffffff RCX:
ffffc90003501000
RDX:
0000000000000002 RSI:
ffffffff816236d1 RDI:
0000000000000005
RBP:
ffff880196033bd8 R08:
ffff88019d8161c0 R09:
0000000000000000
R10:
1ffff10032c067f0 R11:
0000000000000000 R12:
0000000000000000
R13:
0000000000000080 R14:
0000000000000000 R15:
0000000000000000
[<
ffffffff816236d1>] do_futex+0x151/0x1d50 kernel/futex.c:3548
[<
ffffffff816260f0>] C_SYSC_futex kernel/futex_compat.c:201 [inline]
[<
ffffffff816260f0>] compat_SyS_futex+0x270/0x3b0 kernel/futex_compat.c:175
[<
ffffffff8101da17>] do_syscall_32_irqs_on arch/x86/entry/common.c:353 [inline]
[<
ffffffff8101da17>] do_fast_syscall_32+0x357/0xe1c arch/x86/entry/common.c:415
[<
ffffffff84401a9b>] entry_SYSENTER_compat+0x8b/0x9d arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7f23c69
RSP: 002b:
00000000f5d1f12c EFLAGS:
00000282 ORIG_RAX:
00000000000000f0
RAX:
ffffffffffffffda RBX:
000000000816af88 RCX:
0000000000000080
RDX:
0000000000000000 RSI:
0000000000000000 RDI:
000000000816af8c
RBP:
00000000f5d1f228 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000000 R12:
0000000000000000
R13:
0000000000000000 R14:
0000000000000000 R15:
0000000000000000
rcu_sched kthread starved for 10502 jiffies! g5049 c5048 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=1
rcu_sched R running task on cpu 1 13048 8 2 0x90000000
179099587640
Call Trace:
[<
ffffffff8147321f>] context_switch+0x60f/0xa60 kernel/sched/core.c:3209
[<
ffffffff8100095a>] __schedule+0x5aa/0x1da0 kernel/sched/core.c:3934
[<
ffffffff810021df>] schedule+0x8f/0x1b0 kernel/sched/core.c:4011
[<
ffffffff8101116d>] schedule_timeout+0x50d/0xee0 kernel/time/timer.c:1803
[<
ffffffff815c13f1>] rcu_gp_kthread+0xda1/0x3b50 kernel/rcu/tree.c:2327
[<
ffffffff8144b318>] kthread+0x348/0x420 kernel/kthread.c:246
[<
ffffffff84400266>] ret_from_fork+0x56/0x70 arch/x86/entry/entry_64.S:393
Fixes: ba35f8588f47 (“ipvlan: Defer multicast / broadcast processing to a work-queue”)
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Mon, 9 Mar 2020 22:56:56 +0000 (15:56 -0700)]
ipvlan: don't deref eth hdr before checking it's set
IPvlan in L3 mode discards outbound multicast packets but performs
the check before ensuring the ether-header is set or not. This is
an error that Eric found through code browsing.
Fixes: 2ad7bf363841 (“ipvlan: Initial check-in of the IPVLAN driver.”)
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 9 Mar 2020 18:16:24 +0000 (18:16 +0000)]
sfc: detach from cb_page in efx_copy_channel()
It's a resource, not a parameter, so we can't copy it into the new
channel's TX queues, otherwise aliasing will lead to resource-
management bugs if the channel is subsequently torn down without
being initialised.
Before the Fixes:-tagged commit there was a similar bug with
tsoh_page, but I'm not sure it's worth doing another fix for such
old kernels.
Fixes: e9117e5099ea ("sfc: Firmware-Assisted TSO version 2")
Suggested-by: Derek Shute <Derek.Shute@stratus.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 9 Mar 2020 23:16:42 +0000 (16:16 -0700)]
Merge tag 'ktest-v5.6' of git://git./linux/kernel/git/rostedt/linux-ktest
Pull Ktest fixes and clean ups from Steven Rostedt:
- Make the default option oldconfig instead of randconfig (one too many
times I lost my config because I left the build type out)
- Add timeout to ssh sync to sync before reboot (prevents test hangs)
- A couple of spelling fix patches
* tag 'ktest-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Fix typos in ktest.pl
ktest: Add timeout for ssh sync testing
ktest: Make default build option oldconfig not randconfig
ktest: Fix some typos in sample.conf
Linus Torvalds [Mon, 9 Mar 2020 23:12:20 +0000 (16:12 -0700)]
Merge tag 'mmc-v5.6-rc1' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- sdhci-msm: Silence warning about turning function into static
- sdhci-pci-gli: Fix support for GL975x by enabling MSI interrupt
* tag 'mmc-v5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-pci-gli: Enable MSI interrupt for GL975x
mmc: sdhci-msm: Mark sdhci_msm_cqe_disable static
Linus Torvalds [Mon, 9 Mar 2020 23:02:32 +0000 (16:02 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"Some bug fixes all over the place"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_balloon: Adjust label in virtballoon_probe
virtio-blk: improve virtqueue error to BLK_STS
virtio-blk: fix hw_queue stopped on arbitrary error
virtio_ring: Fix mem leak with vring_new_virtqueue()
Christian Brauner [Sun, 8 Mar 2020 13:29:17 +0000 (14:29 +0100)]
pid: make ENOMEM return value more obvious
The alloc_pid() codepath used to be simpler. With the introducation of the
ability to choose specific pids in
49cb2fc42ce4 ("fork: extend clone3() to
support setting a PID") it got more complex. It hasn't been super obvious
that ENOMEM is returned when the pid namespace init process/child subreaper
of the pid namespace has died. As can be seen from multiple attempts to
improve this see e.g. [1] and most recently [2].
We regressed returning ENOMEM in [3] and [2] restored it. Let's add a
comment on top explaining that this is historic and documented behavior and
cannot easily be changed.
[1]:
35f71bc0a09a ("fork: report pid reservation failure properly")
[2]:
b26ebfe12f34 ("pid: Fix error return value in some cases")
[3]:
49cb2fc42ce4 ("fork: extend clone3() to support setting a PID")
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Masanari Iida [Mon, 9 Mar 2020 11:54:30 +0000 (20:54 +0900)]
ktest: Fix typos in ktest.pl
This patch fixes multipe spelling typo found in ktest.pl.
Link: http://lkml.kernel.org/r/20200309115430.57540-1-standby24x7@gmail.com
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Steven Rostedt (VMware) [Mon, 9 Mar 2020 20:00:11 +0000 (16:00 -0400)]
ktest: Add timeout for ssh sync testing
Before rebooting the box, a "ssh sync" is called to the test machine to see
if it is alive or not. But if the test machine is in a partial state, that
ssh may never actually finish, and the ktest test hangs.
Add a 10 second timeout to the sync test, which will fail after 10 seconds
and then cause the test to reboot the test machine.
Cc: stable@vger.kernel.org
Fixes: 6474ace999edd ("ktest.pl: Powercycle the box on reboot if no connection can be made")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Steven Rostedt (VMware) [Wed, 13 Nov 2019 18:36:24 +0000 (13:36 -0500)]
ktest: Make default build option oldconfig not randconfig
For the last time, I screwed up my ktest config file, and the build went
into the default "randconfig", blowing away the .config that I had set up.
The reason for the default randconfig was because when this was first
written, I wanted to do a bunch of randconfigs. But as time progressed,
ktest isn't about randconfig anymore, and because randconfig destroys the
config in the build directory, it's a dangerous default to have. Use
oldconfig as the default.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masanari Iida [Mon, 30 Sep 2019 12:49:25 +0000 (21:49 +0900)]
ktest: Fix some typos in sample.conf
This patch fixes some spelling typo in sample.conf
Link: http://lkml.kernel.org/r/20190930124925.20250-1-standby24x7@gmail.com
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masanari Iida [Mon, 9 Mar 2020 10:43:56 +0000 (19:43 +0900)]
linux-next: DOC: RDS: Fix a typo in rds.txt
This patch fix a spelling typo in rds.txt
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Mon, 9 Mar 2020 15:26:04 +0000 (16:26 +0100)]
pinctrl: qcom: Assign irq_eoi conditionally
The hierarchical parts of MSM pinctrl/GPIO is only
used when the device tree has a "wakeup-parent" as
a phandle, but the .irq_eoi is anyway assigned leading
to semantic problems on elder Qualcomm chipsets.
When the drivers/mfd/qcom-pm8xxx.c driver calls
chained_irq_exit() that call will in turn call chip->irq_eoi()
which is set to irq_chip_eoi_parent() by default on a
hierachical IRQ chip, and the parent is pinctrl-msm.c
so that will in turn unconditionally call
irq_chip_eoi_parent() again, but its parent is invalid
so we get the following crash:
Unnable to handle kernel NULL pointer dereference at
virtual address
00000010
pgd = (ptrval)
[
00000010] *pgd=
00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
(...)
PC is at irq_chip_eoi_parent+0x4/0x10
LR is at pm8xxx_irq_handler+0x1b4/0x2d8
If we solve this crash by avoiding to call up to
irq_chip_eoi_parent(), the machine will hang and get
reset by the watchdog, because of semantic issues,
probably inside irq_chip.
As a solution, just assign the .irq_eoi conditionally if
we are actually using a wakeup parent.
Cc: David Heidelberg <david@ixit.cz>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: stable@vger.kernel.org
Fixes: e35a6ae0eb3a ("pinctrl/msm: Setup GPIO chip in hierarchy")
Link: https://lore.kernel.org/r/20200306121221.1231296-1-linus.walleij@linaro.org
Link: https://lore.kernel.org/r/20200309125207.571840-1-linus.walleij@linaro.org
Link: https://lore.kernel.org/r/20200309152604.585112-1-linus.walleij@linaro.org
Tested-by: David Heidelberg <david@ixit.cz>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Mathias Kresin [Thu, 5 Mar 2020 18:22:45 +0000 (19:22 +0100)]
pinctrl: falcon: fix syntax error
Add the missing semicolon after of_node_put to get the file compiled.
Fixes: f17d2f54d36d ("pinctrl: falcon: Add of_node_put() before return")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Mathias Kresin <dev@kresin.me>
Link: https://lore.kernel.org/r/20200305182245.9636-1-dev@kresin.me
Acked-by: Thomas Langer <thomas.langer@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Walleij [Fri, 6 Mar 2020 14:34:15 +0000 (15:34 +0100)]
pinctrl: qcom: ssbi-gpio: Fix fwspec parsing bug
We are parsing SSBI gpios as fourcell fwspecs but they are
twocell. Probably a simple copy-and-paste bug.
Tested on the APQ8060 DragonBoard and after this ethernet
and MMC card detection works again.
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: stable@vger.kernel.org
Reviewed-by: Brian Masney <masneyb@onstation.org>
Fixes: ae436fe81053 ("pinctrl: ssbi-gpio: convert to hierarchical IRQ helpers in gpio core")
Link: https://lore.kernel.org/r/20200306143416.1476250-1-linus.walleij@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Thomas Bogendoerfer [Sat, 7 Mar 2020 09:00:22 +0000 (10:00 +0100)]
MAINTAINERS: Correct MIPS patchwork URL
MIPS patchwork lives on patchwork.kernel.org for quite some time.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Dmitry Yakunin [Thu, 5 Mar 2020 12:33:12 +0000 (15:33 +0300)]
inet_diag: return classid for all socket types
In commit
1ec17dbd90f8 ("inet_diag: fix reporting cgroup classid and
fallback to priority") croup classid reporting was fixed. But this works
only for TCP sockets because for other socket types icsk parameter can
be NULL and classid code path is skipped. This change moves classid
handling to inet_diag_msg_attrs_fill() function.
Also inet_diag_msg_attrs_size() helper was added and addends in
nlmsg_new() were reordered to save order from inet_sk_diag_fill().
Fixes: 1ec17dbd90f8 ("inet_diag: fix reporting cgroup classid and fallback to priority")
Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remi Pommarel [Sun, 8 Mar 2020 09:25:56 +0000 (10:25 +0100)]
net: stmmac: dwmac1000: Disable ACS if enhanced descs are not used
ACS (auto PAD/FCS stripping) removes FCS off 802.3 packets (LLC) so that
there is no need to manually strip it for such packets. The enhanced DMA
descriptors allow to flag LLC packets so that the receiving callback can
use that to strip FCS manually or not. On the other hand, normal
descriptors do not support that.
Thus in order to not truncate LLC packet ACS should be disabled when
using normal DMA descriptors.
Fixes: 47dd7a540b8a0 ("net: add support for STMicroelectronics Ethernet controllers.")
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 8 Mar 2020 06:05:14 +0000 (22:05 -0800)]
gre: fix uninit-value in __iptunnel_pull_header
syzbot found an interesting case of the kernel reading
an uninit-value [1]
Problem is in the handling of ETH_P_WCCP in gre_parse_header()
We look at the byte following GRE options to eventually decide
if the options are four bytes longer.
Use skb_header_pointer() to not pull bytes if we found
that no more bytes were needed.
All callers of gre_parse_header() are properly using pskb_may_pull()
anyway before proceeding to next header.
[1]
BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2303 [inline]
BUG: KMSAN: uninit-value in __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94
CPU: 1 PID: 11784 Comm: syz-executor940 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c9/0x220 lib/dump_stack.c:118
kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
__msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
pskb_may_pull include/linux/skbuff.h:2303 [inline]
__iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94
iptunnel_pull_header include/net/ip_tunnels.h:411 [inline]
gre_rcv+0x15e/0x19c0 net/ipv6/ip6_gre.c:606
ip6_protocol_deliver_rcu+0x181b/0x22c0 net/ipv6/ip6_input.c:432
ip6_input_finish net/ipv6/ip6_input.c:473 [inline]
NF_HOOK include/linux/netfilter.h:307 [inline]
ip6_input net/ipv6/ip6_input.c:482 [inline]
ip6_mc_input+0xdf2/0x1460 net/ipv6/ip6_input.c:576
dst_input include/net/dst.h:442 [inline]
ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
NF_HOOK include/linux/netfilter.h:307 [inline]
ipv6_rcv+0x683/0x710 net/ipv6/ip6_input.c:306
__netif_receive_skb_one_core net/core/dev.c:5198 [inline]
__netif_receive_skb net/core/dev.c:5312 [inline]
netif_receive_skb_internal net/core/dev.c:5402 [inline]
netif_receive_skb+0x66b/0xf20 net/core/dev.c:5461
tun_rx_batched include/linux/skbuff.h:4321 [inline]
tun_get_user+0x6aef/0x6f60 drivers/net/tun.c:1997
tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026
call_write_iter include/linux/fs.h:1901 [inline]
new_sync_write fs/read_write.c:483 [inline]
__vfs_write+0xa5a/0xca0 fs/read_write.c:496
vfs_write+0x44a/0x8f0 fs/read_write.c:558
ksys_write+0x267/0x450 fs/read_write.c:611
__do_sys_write fs/read_write.c:623 [inline]
__se_sys_write fs/read_write.c:620 [inline]
__ia32_sys_write+0xdb/0x120 fs/read_write.c:620
do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline]
do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410
entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7f62d99
Code: 90 e8 0b 00 00 00 f3 90 0f ae e8 eb f9 8d 74 26 00 89 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 002b:
00000000fffedb2c EFLAGS:
00000217 ORIG_RAX:
0000000000000004
RAX:
ffffffffffffffda RBX:
0000000000000003 RCX:
0000000020002580
RDX:
0000000000000fca RSI:
0000000000000036 RDI:
0000000000000004
RBP:
0000000000008914 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000000 R12:
0000000000000000
R13:
0000000000000000 R14:
0000000000000000 R15:
0000000000000000
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82
slab_alloc_node mm/slub.c:2793 [inline]
__kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401
__kmalloc_reserve net/core/skbuff.c:142 [inline]
__alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210
alloc_skb include/linux/skbuff.h:1051 [inline]
alloc_skb_with_frags+0x18c/0xa70 net/core/skbuff.c:5766
sock_alloc_send_pskb+0xada/0xc60 net/core/sock.c:2242
tun_alloc_skb drivers/net/tun.c:1529 [inline]
tun_get_user+0x10ae/0x6f60 drivers/net/tun.c:1843
tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026
call_write_iter include/linux/fs.h:1901 [inline]
new_sync_write fs/read_write.c:483 [inline]
__vfs_write+0xa5a/0xca0 fs/read_write.c:496
vfs_write+0x44a/0x8f0 fs/read_write.c:558
ksys_write+0x267/0x450 fs/read_write.c:611
__do_sys_write fs/read_write.c:623 [inline]
__se_sys_write fs/read_write.c:620 [inline]
__ia32_sys_write+0xdb/0x120 fs/read_write.c:620
do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline]
do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410
entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139
Fixes: 95f5c64c3c13 ("gre: Move utility functions to common headers")
Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Wiesner [Sat, 7 Mar 2020 12:31:57 +0000 (13:31 +0100)]
ipvlan: do not add hardware address of master to its unicast filter list
There is a problem when ipvlan slaves are created on a master device that
is a vmxnet3 device (ipvlan in VMware guests). The vmxnet3 driver does not
support unicast address filtering. When an ipvlan device is brought up in
ipvlan_open(), the ipvlan driver calls dev_uc_add() to add the hardware
address of the vmxnet3 master device to the unicast address list of the
master device, phy_dev->uc. This inevitably leads to the vmxnet3 master
device being forced into promiscuous mode by __dev_set_rx_mode().
Promiscuous mode is switched on the master despite the fact that there is
still only one hardware address that the master device should use for
filtering in order for the ipvlan device to be able to receive packets.
The comment above struct net_device describes the uc_promisc member as a
"counter, that indicates, that promiscuous mode has been enabled due to
the need to listen to additional unicast addresses in a device that does
not implement ndo_set_rx_mode()". Moreover, the design of ipvlan
guarantees that only the hardware address of a master device,
phy_dev->dev_addr, will be used to transmit and receive all packets from
its ipvlan slaves. Thus, the unicast address list of the master device
should not be modified by ipvlan_open() and ipvlan_stop() in order to make
ipvlan a workable option on masters that do not support unicast address
filtering.
Fixes: 2ad7bf3638411 ("ipvlan: Initial check-in of the IPVLAN driver")
Reported-by: Per Sundstrom <per.sundstrom@redqube.se>
Signed-off-by: Jiri Wiesner <jwiesner@suse.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 9 Mar 2020 00:44:44 +0000 (17:44 -0700)]
Linux 5.6-rc5
Linus Torvalds [Mon, 9 Mar 2020 00:36:22 +0000 (17:36 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Olof Johansson:
"We've been accruing these for a couple of weeks, so the batch is a bit
bigger than usual.
Largest delta is due to a led-bl driver that is added -- there was a
miscommunication before the merge window and the driver didn't make it
in. Due to this, the platforms needing it regressed. At this point, it
seemed easier to add the new driver than unwind the changes.
Besides that, there are a handful of various fixes:
- AMD tee memory leak fix
- A handful of fixlets for i.MX SCU communication
- A few maintainers woke up and realized DEBUG_FS had been missing
for a while, so a few updates of that.
... and the usual collection of smaller fixes to various platforms"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (37 commits)
ARM: socfpga_defconfig: Add back DEBUG_FS
arm64: dts: socfpga: agilex: Fix gmac compatible
ARM: bcm2835_defconfig: Explicitly restore CONFIG_DEBUG_FS
arm64: dts: meson: fix gxm-khadas-vim2 wifi
arm64: dts: meson-sm1-sei610: add missing interrupt-names
ARM: meson: Drop unneeded select of COMMON_CLK
ARM: dts: bcm2711: Add pcie0 alias
ARM: dts: bcm283x: Add missing properties to the PWR LED
tee: amdtee: fix memory leak in amdtee_open_session()
ARM: OMAP2+: Fix compile if CONFIG_HAVE_ARM_SMCCC is not set
arm: dts: dra76x: Fix mmc3 max-frequency
ARM: dts: dra7: Add "dma-ranges" property to PCIe RC DT nodes
bus: ti-sysc: Fix 1-wire reset quirk
ARM: dts: r8a7779: Remove deprecated "renesas, rcar-sata" compatible value
soc: imx-scu: Align imx sc msg structs to 4
firmware: imx: Align imx_sc_msg_req_cpu_start to 4
firmware: imx: scu-pd: Align imx sc msg structs to 4
firmware: imx: misc: Align imx sc msg structs to 4
firmware: imx: scu: Ensure sequential TX
ARM: dts: imx7-colibri: Fix frequency for sd/mmc
...
Linus Torvalds [Mon, 9 Mar 2020 00:33:52 +0000 (17:33 -0700)]
Merge tag 'edac_urgent-2020-03-08' of git://git./linux/kernel/git/ras/ras
Pull EDAC fix from Borislav Petkov:
"Error reporting fix for synopsys_edac: do not overwrite partial
decoded error message (Sherry Sun)"
* tag 'edac_urgent-2020-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/synopsys: Do not print an error with back-to-back snprintf() calls
Linus Torvalds [Sun, 8 Mar 2020 15:49:44 +0000 (10:49 -0500)]
Merge tag 'char-misc-5.6-rc5' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are four small char/misc driver fixes for reported issues for
5.6-rc5.
These fixes are:
- binder fix for a potential use-after-free problem found (took two
tries to get it right)
- interconnect core fix
- altera-stapl driver fix
All four of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
binder: prevent UAF for binderfs devices II
interconnect: Handle memory allocation errors
altera-stapl: altera_get_note: prevent write beyond end of 'key'
binder: prevent UAF for binderfs devices
Linus Torvalds [Sun, 8 Mar 2020 15:39:40 +0000 (10:39 -0500)]
Merge tag 'driver-core-5.6-rc5' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core and debugfs fixes from Greg KH:
"Here are four small driver core / debugfs patches for 5.6-rc3:
- debugfs api cleanup now that all debugfs_create_regset32() callers
have been fixed up. This was waiting until after the -rc1 merge as
these fixes came in through different trees
- driver core sync state fixes based on reports of minor issues found
in the feature
All of these have been in linux-next with no reported issues"
* tag 'driver-core-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Skip unnecessary work when device doesn't have sync_state()
driver core: Add dev_has_sync_state()
driver core: Call sync_state() even if supplier has no consumers
debugfs: remove return value of debugfs_create_regset32()
Linus Torvalds [Sun, 8 Mar 2020 15:35:04 +0000 (10:35 -0500)]
Merge tag 'tty-5.6-rc5' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some small tty/serial fixes for 5.6-rc5
Just some small serial driver fixes, and a vt core fixup, full details
are:
- vt fixes for issues found by syzbot
- serdev fix for Apple boxes
- fsl_lpuart serial driver fixes
- MAINTAINER update for incorrect serial files
- new device ids for 8250_exar driver
- mvebu-uart fix
All of these have been in linux-next with no reported issues"
* tag 'tty-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: serial: fsl_lpuart: free IDs allocated by IDA
Revert "tty: serial: fsl_lpuart: drop EARLYCON_DECLARE"
serdev: Fix detection of UART devices on Apple machines.
MAINTAINERS: Add missed files related to Synopsys DesignWare UART
serial: 8250_exar: add support for ACCES cards
tty:serial:mvebu-uart:fix a wrong return
vt: selection, push sel_lock up
vt: selection, push console lock down
Linus Torvalds [Sun, 8 Mar 2020 15:32:23 +0000 (10:32 -0500)]
Merge tag 'usb-5.6-rc5' of git://git./linux/kernel/git/gregkh/usb
Pull USB/PHY fixes from Greg KH:
"Here are some small USB and PHY driver fixes for reported issues for
5.6-rc5.
Included in here are:
- phy driver fixes
- new USB quirks
- USB cdns3 gadget driver fixes
- USB hub core fixes
All of these have been in linux-next with no reported issues"
* tag 'usb-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: dwc3: gadget: Update chain bit correctly when using sg list
usb: core: port: do error out if usb_autopm_get_interface() fails
usb: core: hub: do error out if usb_autopm_get_interface() fails
usb: core: hub: fix unhandled return by employing a void function
usb: storage: Add quirk for Samsung Fit flash
usb: quirks: add NO_LPM quirk for Logitech Screen Share
usb: usb251xb: fix regulator probe and error handling
phy: allwinner: Fix GENMASK misuse
usb: cdns3: gadget: toggle cycle bit before reset endpoint
usb: cdns3: gadget: link trb should point to next request
phy: mapphone-mdm6600: Fix timeouts by adding wake-up handling
phy: brcm-sata: Correct MDIO operations for 40nm platforms
phy: ti: gmii-sel: do not fail in case of gmii
phy: ti: gmii-sel: fix set of copy-paste errors
phy: core: Fix phy_get() to not return error on link creation failure
phy: mapphone-mdm6600: Fix write timeouts with shorter GPIO toggle interval
Corey Minyard [Fri, 6 Mar 2020 17:23:14 +0000 (11:23 -0600)]
pid: Fix error return value in some cases
Recent changes to alloc_pid() allow the pid number to be specified on
the command line. If set_tid_size is set, then the code scanning the
levels will hard-set retval to -EPERM, overriding it's previous -ENOMEM
value.
After the code scanning the levels, there are error returns that do not
set retval, assuming it is still set to -ENOMEM.
So set retval back to -ENOMEM after scanning the levels.
Fixes: 49cb2fc42ce4 ("fork: extend clone3() to support setting a PID")
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Adrian Reber <areber@redhat.com>
Cc: <stable@vger.kernel.org> # 5.5
Link: https://lore.kernel.org/r/20200306172314.12232-1-minyard@acm.org
[christian.brauner@ubuntu.com: fixup commit message]
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Nathan Chancellor [Sun, 16 Feb 2020 00:40:39 +0000 (17:40 -0700)]
virtio_balloon: Adjust label in virtballoon_probe
Clang warns when CONFIG_BALLOON_COMPACTION is unset:
../drivers/virtio/virtio_balloon.c:963:1: warning: unused label
'out_del_vqs' [-Wunused-label]
out_del_vqs:
^~~~~~~~~~~~
1 warning generated.
Move the label within the preprocessor block since it is only used when
CONFIG_BALLOON_COMPACTION is set.
Fixes: 1ad6f58ea936 ("virtio_balloon: Fix memory leaks on errors in virtballoon_probe()")
Link: https://github.com/ClangBuiltLinux/linux/issues/886
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://lore.kernel.org/r/20200216004039.23464-1-natechancellor@gmail.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Halil Pasic [Thu, 13 Feb 2020 12:37:28 +0000 (13:37 +0100)]
virtio-blk: improve virtqueue error to BLK_STS
Let's change the mapping between virtqueue_add errors to BLK_STS
statuses, so that -ENOSPC, which indicates virtqueue full is still
mapped to BLK_STS_DEV_RESOURCE, but -ENOMEM which indicates non-device
specific resource outage is mapped to BLK_STS_RESOURCE.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Link: https://lore.kernel.org/r/20200213123728.61216-3-pasic@linux.ibm.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Halil Pasic [Thu, 13 Feb 2020 12:37:27 +0000 (13:37 +0100)]
virtio-blk: fix hw_queue stopped on arbitrary error
Since nobody else is going to restart our hw_queue for us, the
blk_mq_start_stopped_hw_queues() is in virtblk_done() is not sufficient
necessarily sufficient to ensure that the queue will get started again.
In case of global resource outage (-ENOMEM because mapping failure,
because of swiotlb full) our virtqueue may be empty and we can get
stuck with a stopped hw_queue.
Let us not stop the queue on arbitrary errors, but only on -EONSPC which
indicates a full virtqueue, where the hw_queue is guaranteed to get
started by virtblk_done() before when it makes sense to carry on
submitting requests. Let us also remove a stale comment.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Fixes: f7728002c1c7 ("virtio_ring: fix return code on DMA mapping fails")
Link: https://lore.kernel.org/r/20200213123728.61216-2-pasic@linux.ibm.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Suman Anna [Mon, 24 Feb 2020 21:26:43 +0000 (15:26 -0600)]
virtio_ring: Fix mem leak with vring_new_virtqueue()
The functions vring_new_virtqueue() and __vring_new_virtqueue() are used
with split rings, and any allocations within these functions are managed
outside of the .we_own_ring flag. The commit
cbeedb72b97a ("virtio_ring:
allocate desc state for split ring separately") allocates the desc state
within the __vring_new_virtqueue() but frees it only when the .we_own_ring
flag is set. This leads to a memory leak when freeing such allocated
virtqueues with the vring_del_virtqueue() function.
Fix this by moving the desc_state free code outside the flag and only
for split rings. Issue was discovered during testing with remoteproc
and virtio_rpmsg.
Fixes: cbeedb72b97a ("virtio_ring: allocate desc state for split ring separately")
Signed-off-by: Suman Anna <s-anna@ti.com>
Link: https://lore.kernel.org/r/20200224212643.30672-1-s-anna@ti.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Eric Biggers [Thu, 5 Mar 2020 08:41:38 +0000 (00:41 -0800)]
fscrypt: don't evict dirty inodes after removing key
After FS_IOC_REMOVE_ENCRYPTION_KEY removes a key, it syncs the
filesystem and tries to get and put all inodes that were unlocked by the
key so that unused inodes get evicted via fscrypt_drop_inode().
Normally, the inodes are all clean due to the sync.
However, after the filesystem is sync'ed, userspace can modify and close
one of the files. (Userspace is *supposed* to close the files before
removing the key. But it doesn't always happen, and the kernel can't
assume it.) This causes the inode to be dirtied and have i_count == 0.
Then, fscrypt_drop_inode() failed to consider this case and indicated
that the inode can be dropped, causing the write to be lost.
On f2fs, other problems such as a filesystem freeze could occur due to
the inode being freed while still on f2fs's dirty inode list.
Fix this bug by making fscrypt_drop_inode() only drop clean inodes.
I've written an xfstest which detects this bug on ext4, f2fs, and ubifs.
Fixes: b1c0ec3599f4 ("fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl")
Cc: <stable@vger.kernel.org> # v5.4+
Link: https://lore.kernel.org/r/20200305084138.653498-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Linus Torvalds [Sun, 8 Mar 2020 01:52:55 +0000 (19:52 -0600)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Nothing particularly exciting, some small ODP regressions from the mmu
notifier rework, another bunch of syzkaller fixes, and a bug fix for a
botched syzkaller fix in the first rc pull request.
- Fix busted syzkaller fix in 'get_new_pps' - this turned out to
crash on certain HW configurations
- Bug fixes for various missed things in error unwinds
- Add a missing rcu_read_lock annotation in hfi/qib
- Fix two ODP related regressions from the recent mmu notifier
changes
- Several more syzkaller bugs in siw, RDMA netlink, verbs and iwcm
- Revert an old patch in CMA as it is now shown to not be allocating
port numbers properly"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/iwcm: Fix iwcm work deallocation
RDMA/siw: Fix failure handling during device creation
RDMA/nldev: Fix crash when set a QP to a new counter but QPN is missing
RDMA/odp: Ensure the mm is still alive before creating an implicit child
RDMA/core: Fix protection fault in ib_mr_pool_destroy
IB/mlx5: Fix implicit ODP race
IB/hfi1, qib: Ensure RCU is locked when accessing list
RDMA/core: Fix pkey and port assignment in get_new_pps
RMDA/cm: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen()
RDMA/rw: Fix error flow during RDMA context initialization
RDMA/core: Fix use of logical OR in get_new_pps
Revert "RDMA/cma: Simplify rdma_resolve_addr() error flow"