Taehee Yoo [Fri, 25 Sep 2020 18:13:12 +0000 (18:13 +0000)]
net: core: introduce struct netdev_nested_priv for nested interface infrastructure
Functions related to nested interface infrastructure such as
netdev_walk_all_{ upper | lower }_dev() pass both private functions
and "data" pointer to handle their own things.
At this point, the data pointer type is void *.
In order to make it easier to expand common variables and functions,
this new netdev_nested_priv structure is added.
In the following patch, a new member variable will be added into this
struct to fix the lockdep issue.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Fri, 25 Sep 2020 18:13:02 +0000 (18:13 +0000)]
net: core: add __netdev_upper_dev_unlink()
The netdev_upper_dev_unlink() has to work differently according to flags.
This idea is the same with __netdev_upper_dev_link().
In the following patches, new flags will be added.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Mon, 28 Sep 2020 18:31:03 +0000 (11:31 -0700)]
net_sched: remove a redundant goto chain check
All TC actions call tcf_action_check_ctrlact() to validate
goto chain, so this check in tcf_action_init_1() is actually
redundant. Remove it to save troubles of leaking memory.
Fixes:
e49d8c22f126 ("net_sched: defer tcf_idr_insert() in tcf_action_init_1()")
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Suggested-by: Davide Caratti <dcaratti@redhat.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Mon, 28 Sep 2020 15:30:02 +0000 (18:30 +0300)]
net: bridge: fdb: don't flush ext_learn entries
When a user-space software manages fdb entries externally it should
set the ext_learn flag which marks the fdb entry as externally managed
and avoids expiring it (they're treated as static fdbs). Unfortunately
on events where fdb entries are flushed (STP down, netlink fdb flush
etc) these fdbs are also deleted automatically by the bridge. That in turn
causes trouble for the managing user-space software (e.g. in MLAG setups
we lose remote fdb entries on port flaps).
These entries are completely externally managed so we should avoid
automatically deleting them, the only exception are offloaded entries
(i.e. BR_FDB_ADDED_BY_EXT_LEARN + BR_FDB_OFFLOADED). They are flushed as
before.
Fixes:
eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 28 Sep 2020 19:25:42 +0000 (12:25 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2020-09-28
1) Fix a build warning in ip_vti if CONFIG_IPV6 is not set.
From YueHaibing.
2) Restore IPCB on espintcp before handing the packet to xfrm
as the information there is still needed.
From Sabrina Dubroca.
3) Fix pmtu updating for xfrm interfaces.
From Sabrina Dubroca.
4) Some xfrm state information was not cloned with xfrm_do_migrate.
Fixes to clone the full xfrm state, from Antony Antony.
5) Use the correct address family in xfrm_state_find. The struct
flowi must always be interpreted along with the original
address family. This got lost over the years.
Fix from Herbert Xu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 27 Sep 2020 17:44:29 +0000 (19:44 +0200)]
r8169: fix RTL8168f/RTL8411 EPHY config
Mistakenly bit 2 was set instead of bit 3 as in the vendor driver.
Fixes:
a7a92cf81589 ("r8169: sync PCIe PHY init with vendor driver 8.047.01")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marian-Cristian Rotariu [Sun, 27 Sep 2020 12:34:39 +0000 (13:34 +0100)]
dt-bindings: net: renesas,ravb: Add support for r8a774e1 SoC
Document RZ/G2H (R8A774E1) SoC bindings.
Signed-off-by: Marian-Cristian Rotariu <marian-cristian.rotariu.rb@bp.renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 27 Sep 2020 06:42:11 +0000 (09:42 +0300)]
mlxsw: spectrum_acl: Fix mlxsw_sp_acl_tcam_group_add()'s error path
If mlxsw_sp_acl_tcam_group_id_get() fails, the mutex initialized earlier
is not destroyed.
Fix this by initializing the mutex after calling the function. This is
symmetric to mlxsw_sp_acl_tcam_group_del().
Fixes:
5ec2ee28d27b ("mlxsw: spectrum_acl: Introduce a mutex to guard region list updates")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Sun, 27 Sep 2020 04:33:43 +0000 (21:33 -0700)]
mdio: fix mdio-thunder.c dependency & build error
Fix build error by selecting MDIO_DEVRES for MDIO_THUNDER.
Fixes this build error:
ld: drivers/net/phy/mdio-thunder.o: in function `thunder_mdiobus_pci_probe':
drivers/net/phy/mdio-thunder.c:78: undefined reference to `devm_mdiobus_alloc_size'
Fixes:
379d7ac7ca31 ("phy: mdio-thunder: Add driver for Cavium Thunder SoC MDIO buses.")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: netdev@vger.kernel.org
Cc: David Daney <david.daney@cavium.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Fri, 25 Sep 2020 20:27:35 +0000 (23:27 +0300)]
net: atlantic: fix build when object tree is separate
Driver subfolder files refer parent folder includes in an
absolute manner.
Makefile contains a -I for this, but apparently that does not
work if object tree is separated.
Adding srctree to fix that.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 25 Sep 2020 15:26:16 +0000 (08:26 -0700)]
MAINTAINERS: Add Vladimir as a maintainer for DSA
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Fri, 25 Sep 2020 14:35:30 +0000 (17:35 +0300)]
dpaa2-eth: fix command version for Tx shaping
When adding the support for TBF offload, the improper command version
was added even though the command format is for the V2 of
dpni_set_tx_shaping(). This does not affect the functionality of TBF
since the only change between these two versions is the addition of the
exceeded parameters which are not used in TBF. Still, fix the bug so
that we keep things in sync.
Fixes:
39344a89623d ("dpaa2-eth: add API for Tx shaping")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wilken Gottwalt [Fri, 25 Sep 2020 13:58:57 +0000 (15:58 +0200)]
net: usb: ax88179_178a: add Toshiba usb 3.0 adapter
Adds the driver_info and usb ids of the AX88179 based Toshiba USB 3.0
ethernet adapter.
Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 26 Sep 2020 00:04:56 +0000 (17:04 -0700)]
Merge branch 'bonding-team-basic-dev-needed_headroom-support'
Eric Dumazet says:
====================
bonding/team: basic dev->needed_headroom support
Both bonding and team drivers support non-ethernet devices,
but missed proper dev->needed_headroom initializations.
syzbot found a crash caused by bonding, I mirrored the fix in team as well.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 25 Sep 2020 13:38:08 +0000 (06:38 -0700)]
team: set dev->needed_headroom in team_setup_by_port()
Some devices set needed_headroom. If we ignore it, we might
end up crashing in various skb_push() for example in ipgre_header()
since some layers assume enough headroom has been reserved.
Fixes:
1d76efe1577b ("team: add support for non-ethernet devices")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 25 Sep 2020 13:38:07 +0000 (06:38 -0700)]
bonding: set dev->needed_headroom in bond_setup_by_slave()
syzbot managed to crash a host by creating a bond
with a GRE device.
For non Ethernet device, bonding calls bond_setup_by_slave()
instead of ether_setup(), and unfortunately dev->needed_headroom
was not copied from the new added member.
[ 171.243095] skbuff: skb_under_panic: text:
ffffffffa184b9ea len:116 put:20 head:
ffff883f84012dc0 data:
ffff883f84012dbc tail:0x70 end:0xd00 dev:bond0
[ 171.243111] ------------[ cut here ]------------
[ 171.243112] kernel BUG at net/core/skbuff.c:112!
[ 171.243117] invalid opcode: 0000 [#1] SMP KASAN PTI
[ 171.243469] gsmi: Log Shutdown Reason 0x03
[ 171.243505] Call Trace:
[ 171.243506] <IRQ>
[ 171.243512] [<
ffffffffa171be59>] skb_push+0x49/0x50
[ 171.243516] [<
ffffffffa184b9ea>] ipgre_header+0x2a/0xf0
[ 171.243520] [<
ffffffffa17452d7>] neigh_connected_output+0xb7/0x100
[ 171.243524] [<
ffffffffa186f1d3>] ip6_finish_output2+0x383/0x490
[ 171.243528] [<
ffffffffa186ede2>] __ip6_finish_output+0xa2/0x110
[ 171.243531] [<
ffffffffa186acbc>] ip6_finish_output+0x2c/0xa0
[ 171.243534] [<
ffffffffa186abe9>] ip6_output+0x69/0x110
[ 171.243537] [<
ffffffffa186ac90>] ? ip6_output+0x110/0x110
[ 171.243541] [<
ffffffffa189d952>] mld_sendpack+0x1b2/0x2d0
[ 171.243544] [<
ffffffffa189d290>] ? mld_send_report+0xf0/0xf0
[ 171.243548] [<
ffffffffa189c797>] mld_ifc_timer_expire+0x2d7/0x3b0
[ 171.243551] [<
ffffffffa189c4c0>] ? mld_gq_timer_expire+0x50/0x50
[ 171.243556] [<
ffffffffa0fea270>] call_timer_fn+0x30/0x130
[ 171.243559] [<
ffffffffa0fea17c>] expire_timers+0x4c/0x110
[ 171.243563] [<
ffffffffa0fea0e3>] __run_timers+0x213/0x260
[ 171.243566] [<
ffffffffa0fecb7d>] ? ktime_get+0x3d/0xa0
[ 171.243570] [<
ffffffffa0ff9c4e>] ? clockevents_program_event+0x7e/0xe0
[ 171.243574] [<
ffffffffa0f7e5d5>] ? sched_clock_cpu+0x15/0x190
[ 171.243577] [<
ffffffffa0fe973d>] run_timer_softirq+0x1d/0x40
[ 171.243581] [<
ffffffffa1c00152>] __do_softirq+0x152/0x2f0
[ 171.243585] [<
ffffffffa0f44e1f>] irq_exit+0x9f/0xb0
[ 171.243588] [<
ffffffffa1a02e1d>] smp_apic_timer_interrupt+0xfd/0x1a0
[ 171.243591] [<
ffffffffa1a01ea6>] apic_timer_interrupt+0x86/0x90
Fixes:
f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Fri, 25 Sep 2020 12:44:39 +0000 (15:44 +0300)]
net: ethernet: cavium: octeon_mgmt: use phy_start and phy_stop
To start also "phy state machine", with UP state as it should be,
the phy_start() has to be used, in another case machine even is not
triggered. After this change negotiation is supposed to be triggered
by SM workqueue.
It's not correct usage, but it appears after the following patch,
so add it as a fix.
Fixes:
74a992b3598a ("net: phy: add phy_check_link_status")
Signed-off-by: Ivan Khoronzhuk <ikhoronz@cisco.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wong Vee Khee [Fri, 25 Sep 2020 09:54:06 +0000 (17:54 +0800)]
net: stmmac: Fix clock handling on remove path
While unloading the dwmac-intel driver, clk_disable_unprepare() is
being called twice in stmmac_dvr_remove() and
intel_eth_pci_remove(). This causes kernel panic on the second call.
Removing the second call of clk_disable_unprepare() in
intel_eth_pci_remove().
Fixes:
09f012e64e4b ("stmmac: intel: Fix clock handling on error and remove paths")
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ronak Doshi [Fri, 25 Sep 2020 06:11:29 +0000 (23:11 -0700)]
vmxnet3: fix cksum offload issues for non-udp tunnels
Commit
dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload
support") added support for encapsulation offload. However, the inner
offload capability is to be restrictued to UDP tunnels.
This patch fixes the issue for non-udp tunnels by adding features
check capability and filtering appropriate features for non-udp tunnels.
Fixes:
dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support")
Signed-off-by: Ronak Doshi <doshir@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 25 Sep 2020 23:24:20 +0000 (16:24 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2020-09-25
This series contains updates to the iavf and ice driver.
Sylwester fixes a crash with iavf resume due to getting the wrong pointers.
Ani fixes a call trace in ice resume by calling pci_save_state().
Jakes fixes memory leaks in case of register_netdev() failure or
ice_cfg_vsi_lan() failure for the ice driver.
v2: Rebased; no other changes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 25 Sep 2020 20:14:15 +0000 (13:14 -0700)]
Merge tag 'wireless-drivers-2020-09-25' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for v5.9
Second, and last, set of fixes for v5.9. Only one important regression
fix for mt76.
mt76
* fix a regression in aggregation which appeared after mac80211 changes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Keller [Wed, 2 Sep 2020 15:53:47 +0000 (08:53 -0700)]
ice: fix memory leak in ice_vsi_setup
During ice_vsi_setup, if ice_cfg_vsi_lan fails, it does not properly
release memory associated with the VSI rings. If we had used devres
allocations for the rings, this would be ok. However, we use kzalloc and
kfree_rcu for these ring structures.
Using the correct label to cleanup the rings during ice_vsi_setup
highlights an issue in the ice_vsi_clear_rings function: it can leave
behind stale ring pointers in the q_vectors structure.
When releasing rings, we must also ensure that no q_vector associated
with the VSI will point to this ring again. To resolve this, loop over
all q_vectors and release their ring mapping. Because we are about to
free all rings, no q_vector should remain pointing to any of the rings
in this VSI.
Fixes:
5513b920a4f7 ("ice: Update Tx scheduler tree for VSI multi-Tx queue support")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Wed, 2 Sep 2020 15:53:46 +0000 (08:53 -0700)]
ice: fix memory leak if register_netdev_fails
The ice_setup_pf_sw function can cause a memory leak if register_netdev
fails, due to accidentally failing to free the VSI rings. Fix the memory
leak by using ice_vsi_release, ensuring we actually go through the full
teardown process.
This should be safe even if the netdevice is not registered because we
will have set the netdev pointer to NULL, ensuring ice_vsi_release won't
call unregister_netdev.
An alternative fix would be moving management of the PF VSI netdev into
the main VSI setup code. This is complicated and likely requires
significant refactor in how we manage VSIs
Fixes:
3a858ba392c3 ("ice: Add support for VSI allocation and deallocation")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Anirudh Venkataramanan [Wed, 2 Sep 2020 15:53:45 +0000 (08:53 -0700)]
ice: Fix call trace on suspend
It appears that the ice_suspend flow is missing a call to pci_save_state
and this is triggering the message "State of device not saved by
ice_suspend" and a call trace. Fix it.
Fixes:
769c500dcc1e ("ice: Add advanced power mgmt for WoL")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Sylwester Dziedziuch [Wed, 2 Sep 2020 12:54:59 +0000 (12:54 +0000)]
iavf: Fix incorrect adapter get in iavf_resume
When calling iavf_resume there was a crash because wrong
function was used to get iavf_adapter and net_device pointers.
Changed how iavf_resume is getting iavf_adapter and net_device
pointers from pci_dev.
Fixes:
5eae00c57f5e ("i40evf: main driver core")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Herbert Xu [Fri, 25 Sep 2020 04:42:56 +0000 (14:42 +1000)]
xfrm: Use correct address family in xfrm_state_find
The struct flowi must never be interpreted by itself as its size
depends on the address family. Therefore it must always be grouped
with its original family value.
In this particular instance, the original family value is lost in
the function xfrm_state_find. Therefore we get a bogus read when
it's coupled with the wrong family which would occur with inter-
family xfrm states.
This patch fixes it by keeping the original family value.
Note that the same bug could potentially occur in LSM through
the xfrm_state_pol_flow_match hook. I checked the current code
there and it seems to be safe for now as only secid is used which
is part of struct flowi_common. But that API should be changed
so that so that we don't get new bugs in the future. We could
do that by replacing fl with just secid or adding a family field.
Reported-by: syzbot+577fbac3145a6eb2e7a5@syzkaller.appspotmail.com
Fixes:
48b8d78315bf ("[XFRM]: State selection update to use inner...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Priyaranjan Jha [Thu, 24 Sep 2020 22:23:14 +0000 (15:23 -0700)]
tcp: skip DSACKs with dubious sequence ranges
Currently, we use length of DSACKed range to compute number of
delivered packets. And if sequence range in DSACK is corrupted,
we can get bogus dsacked/acked count, and bogus cwnd.
This patch put bounds on DSACKed range to skip update of data
delivery and spurious retransmission information, if the DSACK
is unlikely caused by sender's action:
- DSACKed range shouldn't be greater than maximum advertised rwnd.
- Total no. of DSACKed segments shouldn't be greater than total
no. of retransmitted segs. Unlike spurious retransmits, network
duplicates or corrupted DSACKs shouldn't be counted as delivery.
Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamie Iles [Thu, 24 Sep 2020 14:56:45 +0000 (15:56 +0100)]
net/fsl: quieten expected MDIO access failures
MDIO reads can happen during PHY probing, and printing an error with
dev_err can result in a large number of error messages during device
probe. On a platform with a serial console this can result in
excessively long boot times in a way that looks like an infinite loop
when multiple busses are present. Since
0f183fd151c (net/fsl: enable
extended scanning in xgmac_mdio) we perform more scanning so there are
potentially more failures.
Reduce the logging level to dev_dbg which is consistent with the
Freescale enetc driver.
Cc: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Helmut Grohne [Thu, 24 Sep 2020 08:37:47 +0000 (10:37 +0200)]
net: dsa: microchip: really look for phy-mode in port nodes
The previous implementation failed to account for the "ports" node. The
actual port nodes are not child nodes of the switch node, but a "ports"
node sits in between.
Fixes:
edecfa98f602 ("net: dsa: microchip: look for phy-mode in port nodes")
Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rohit Maheshwari [Thu, 24 Sep 2020 06:58:45 +0000 (12:28 +0530)]
net/tls: race causes kernel panic
BUG: kernel NULL pointer dereference, address:
00000000000000b8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD
80000008b6fef067 P4D
80000008b6fef067 PUD
8b6fe6067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 12 PID: 23871 Comm: kworker/12:80 Kdump: loaded Tainted: G S
5.9.0-rc3+ #1
Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.1 03/29/2018
Workqueue: events tx_work_handler [tls]
RIP: 0010:tx_work_handler+0x1b/0x70 [tls]
Code: dc fe ff ff e8 16 d4 a3 f6 66 0f 1f 44 00 00 0f 1f 44 00 00 55 53 48 8b
6f 58 48 8b bd a0 04 00 00 48 85 ff 74 1c 48 8b 47 28 <48> 8b 90 b8 00 00 00 83
e2 02 75 0c f0 48 0f ba b0 b8 00 00 00 00
RSP: 0018:
ffffa44ace61fe88 EFLAGS:
00010286
RAX:
0000000000000000 RBX:
ffff91da9e45cc30 RCX:
dead000000000122
RDX:
0000000000000001 RSI:
ffff91da9e45cc38 RDI:
ffff91d95efac200
RBP:
ffff91da133fd780 R08:
0000000000000000 R09:
000073746e657665
R10:
8080808080808080 R11:
0000000000000000 R12:
ffff91dad7d30700
R13:
ffff91dab6561080 R14:
0ffff91dad7d3070 R15:
ffff91da9e45cc38
FS:
0000000000000000(0000) GS:
ffff91dad7d00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000000000b8 CR3:
0000000906478003 CR4:
00000000003706e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
process_one_work+0x1a7/0x370
worker_thread+0x30/0x370
? process_one_work+0x370/0x370
kthread+0x114/0x130
? kthread_park+0x80/0x80
ret_from_fork+0x22/0x30
tls_sw_release_resources_tx() waits for encrypt_pending, which
can have race, so we need similar changes as in commit
0cada33241d9de205522e3858b18e506ca5cce2c here as well.
Fixes:
a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Qing [Thu, 24 Sep 2020 06:50:24 +0000 (14:50 +0800)]
net/ethernet/broadcom: fix spelling typo
Modify the comment typo: "compliment" -> "complement".
Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiaoliang Yang [Thu, 24 Sep 2020 02:11:13 +0000 (10:11 +0800)]
net: mscc: ocelot: fix fields offset in SG_CONFIG_REG_3
INIT_IPS and GATE_ENABLE fields have a wrong offset in SG_CONFIG_REG_3.
This register is used by stream gate control of PSFP, and it has not
been used before, because PSFP is not implemented in ocelot driver.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiaoliang Yang [Thu, 24 Sep 2020 01:57:46 +0000 (09:57 +0800)]
net: dsa: felix: convert TAS link speed based on phylink speed
state->speed holds a value of 10, 100, 1000 or 2500, but
QSYS_TAG_CONFIG_LINK_SPEED expects a value of 0, 1, 2, 3. So convert the
speed to a proper value.
Fixes:
de143c0e274b ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload")
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Luo bin [Thu, 24 Sep 2020 01:31:51 +0000 (09:31 +0800)]
hinic: fix wrong return value of mac-set cmd
It should also be regarded as an error when hw return status=4 for PF's
setting mac cmd. Only if PF return status=4 to VF should this cmd be
taken special treatment.
Fixes:
7dd29ee12865 ("hinic: add sriov feature support")
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xie He [Wed, 23 Sep 2020 18:18:18 +0000 (11:18 -0700)]
drivers/net/wan/x25_asy: Correct the ndo_open and ndo_stop functions
1.
Move the lapb_register/lapb_unregister calls into the ndo_open/ndo_stop
functions.
This makes the LAPB protocol start/stop when the network interface
starts/stops. When the network interface is down, the LAPB protocol
shouldn't be running and the LAPB module shoudn't be generating control
frames.
2.
Move netif_start_queue/netif_stop_queue into the ndo_open/ndo_stop
functions.
This makes the TX queue start/stop when the network interface
starts/stops.
(netif_stop_queue was originally in the ndo_stop function. But to make
the code look better, I created a new function to use as ndo_stop, and
made it call the original ndo_stop function. I moved netif_stop_queue
from the original ndo_stop function to the new ndo_stop function.)
Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maciej Żenczykowski [Wed, 23 Sep 2020 20:18:15 +0000 (13:18 -0700)]
net/ipv4: always honour route mtu during forwarding
Documentation/networking/ip-sysctl.txt:46 says:
ip_forward_use_pmtu - BOOLEAN
By default we don't trust protocol path MTUs while forwarding
because they could be easily forged and can lead to unwanted
fragmentation by the router.
You only need to enable this if you have user-space software
which tries to discover path mtus by itself and depends on the
kernel honoring this information. This is normally not the case.
Default: 0 (disabled)
Possible values:
0 - disabled
1 - enabled
Which makes it pretty clear that setting it to 1 is a potential
security/safety/DoS issue, and yet it is entirely reasonable to want
forwarded traffic to honour explicitly administrator configured
route mtus (instead of defaulting to device mtu).
Indeed, I can't think of a single reason why you wouldn't want to.
Since you configured a route mtu you probably know better...
It is pretty common to have a higher device mtu to allow receiving
large (jumbo) frames, while having some routes via that interface
(potentially including the default route to the internet) specify
a lower mtu.
Note that ipv6 forwarding uses device mtu unless the route is locked
(in which case it will use the route mtu).
This approach is not usable for IPv4 where an 'mtu lock' on a route
also has the side effect of disabling TCP path mtu discovery via
disabling the IPv4 DF (don't frag) bit on all outgoing frames.
I'm not aware of a way to lock a route from an IPv6 RA, so that also
potentially seems wrong.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Sunmeet Gill (Sunny) <sgill@quicinc.com>
Cc: Vinay Paradkar <vparadka@qti.qualcomm.com>
Cc: Tyler Wear <twear@quicinc.com>
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 25 Sep 2020 02:46:21 +0000 (19:46 -0700)]
Merge branch 'net_sched-fix-a-UAF-in-tcf_action_init'
Cong Wang says:
====================
net_sched: fix a UAF in tcf_action_init()
This patchset fixes a use-after-free triggered by syzbot. Please
find more details in each patch description.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Wed, 23 Sep 2020 03:56:24 +0000 (20:56 -0700)]
net_sched: commit action insertions together
syzbot is able to trigger a failure case inside the loop in
tcf_action_init(), and when this happens we clean up with
tcf_action_destroy(). But, as these actions are already inserted
into the global IDR, other parallel process could free them
before tcf_action_destroy(), then we will trigger a use-after-free.
Fix this by deferring the insertions even later, after the loop,
and committing all the insertions in a separate loop, so we will
never fail in the middle of the insertions any more.
One side effect is that the window between alloction and final
insertion becomes larger, now it is more likely that the loop in
tcf_del_walker() sees the placeholder -EBUSY pointer. So we have
to check for error pointer in tcf_del_walker().
Reported-and-tested-by: syzbot+2287853d392e4b42374a@syzkaller.appspotmail.com
Fixes:
0190c1d452a9 ("net: sched: atomically check-allocate action")
Cc: Vlad Buslov <vladbu@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Wed, 23 Sep 2020 03:56:23 +0000 (20:56 -0700)]
net_sched: defer tcf_idr_insert() in tcf_action_init_1()
All TC actions call tcf_idr_insert() for new action at the end
of their ->init(), so we can actually move it to a central place
in tcf_action_init_1().
And once the action is inserted into the global IDR, other parallel
process could free it immediately as its refcnt is still 1, so we can
not fail after this, we need to move it after the goto action
validation to avoid handling the failure case after insertion.
This is found during code review, is not directly triggered by syzbot.
And this prepares for the next patch.
Cc: Vlad Buslov <vladbu@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Wed, 23 Sep 2020 05:24:42 +0000 (07:24 +0200)]
mt76: mt7615: reduce maximum VHT MPDU length to 7991
After fixing mac80211 to allow larger A-MSDUs in some cases, there have been
reports of performance regressions and packet loss with some clients.
It appears that the issue occurs when the hardware is transmitting A-MSDUs
bigger than 8k. Limit the local VHT MPDU size capability to 7991, matching
the value used for MT7915 as well.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200923052442.24141-1-nbd@nbd.name
Voon Weifeng [Wed, 23 Sep 2020 08:56:14 +0000 (16:56 +0800)]
net: stmmac: removed enabling eee in EEE set callback
EEE should be only be enabled during stmmac_mac_link_up() when the
link are up and being set up properly. set_eee should only do settings
configuration and disabling the eee.
Without this fix, turning on EEE using ethtool will return
"Operation not supported". This is due to the driver is in a dead loop
waiting for eee to be advertised in the for eee to be activated but the
driver will only configure the EEE advertisement after the eee is
activated.
Ethtool should only return "Operation not supported" if there is no EEE
capbility in the MAC controller.
Fixes:
8a7493e58ad6 ("net: stmmac: Fix a race in EEE enable callback")
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Acked-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hauke Mehrtens [Tue, 22 Sep 2020 21:41:12 +0000 (23:41 +0200)]
net: lantiq: Add locking for TX DMA channel
The TX DMA channel data is accessed by the xrx200_start_xmit() and the
xrx200_tx_housekeeping() function from different threads. Make sure the
accesses are synchronized by acquiring the netif_tx_lock() in the
xrx200_tx_housekeeping() function too. This lock is acquired by the
kernel before calling xrx200_start_xmit().
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tian Tao [Tue, 22 Sep 2020 13:32:19 +0000 (21:32 +0800)]
net: switchdev: Fixed kerneldoc warning
Update kernel-doc line comments to fix warnings reported by make W=1.
net/switchdev/switchdev.c:413: warning: Function parameter or
member 'extack' not described in 'call_switchdev_notifiers'
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Acked-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Tue, 22 Sep 2020 07:29:31 +0000 (09:29 +0200)]
Revert "ravb: Fixed to be able to unload modules"
This reverts commit
1838d6c62f57836639bd3d83e7855e0ee4f6defc.
This commit moved the ravb_mdio_init() call (and thus the
of_mdiobus_register() call) from the ravb_probe() to the ravb_open()
call. This causes a regression during system resume (s2idle/s2ram), as
new PHY devices cannot be bound while suspended.
During boot, the Micrel PHY is detected like this:
Micrel KSZ9031 Gigabit PHY
e6800000.ethernet-
ffffffff:00: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=
e6800000.ethernet-
ffffffff:00, irq=228)
ravb
e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
During system suspend, (A) defer_all_probes is set to true, and (B)
usermodehelper_disabled is set to UMH_DISABLED, to avoid drivers being
probed while suspended.
A. If CONFIG_MODULES=n, phy_device_register() calling device_add()
merely adds the device, but does not probe it yet, as
really_probe() returns early due to defer_all_probes being set:
dpm_resume+0x128/0x4f8
device_resume+0xcc/0x1b0
dpm_run_callback+0x74/0x340
ravb_resume+0x190/0x1b8
ravb_open+0x84/0x770
of_mdiobus_register+0x1e0/0x468
of_mdiobus_register_phy+0x1b8/0x250
of_mdiobus_phy_device_register+0x178/0x1e8
phy_device_register+0x114/0x1b8
device_add+0x3d4/0x798
bus_probe_device+0x98/0xa0
device_initial_probe+0x10/0x18
__device_attach+0xe4/0x140
bus_for_each_drv+0x64/0xc8
__device_attach_driver+0xb8/0xe0
driver_probe_device.part.11+0xc4/0xd8
really_probe+0x32c/0x3b8
Later, phy_attach_direct() notices no PHY driver has been bound,
and falls back to the Generic PHY, leading to degraded operation:
Generic PHY
e6800000.ethernet-
ffffffff:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=
e6800000.ethernet-
ffffffff:00, irq=POLL)
ravb
e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
B. If CONFIG_MODULES=y, request_module() returns early with -EBUSY due
to UMH_DISABLED, and MDIO initialization fails completely:
mdio_bus
e6800000.ethernet-
ffffffff:00: error -16 loading PHY driver module for ID 0x00221622
ravb
e6800000.ethernet eth0: failed to initialize MDIO
PM: dpm_run_callback(): ravb_resume+0x0/0x1b8 returns -16
PM: Device
e6800000.ethernet failed to resume: error -16
Ignoring -EBUSY in phy_request_driver_module(), like was done for
-ENOENT in commit
21e194425abd65b5 ("net: phy: fix issue with loading
PHY driver w/o initramfs"), would makes it fall back to the Generic
PHY, like in the CONFIG_MODULES=n case.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mat Martineau [Mon, 21 Sep 2020 14:57:58 +0000 (16:57 +0200)]
mptcp: Wake up MPTCP worker when DATA_FIN found on a TCP FIN packet
When receiving a DATA_FIN MPTCP option on a TCP FIN packet, the DATA_FIN
information would be stored but the MPTCP worker did not get
scheduled. In turn, the MPTCP socket state would remain in
TCP_ESTABLISHED and no blocked operations would be awakened.
TCP FIN packets are seen by the MPTCP socket when moving skbs out of the
subflow receive queues, so schedule the MPTCP worker when a skb with
DATA_FIN but no data payload is moved from a subflow queue. Other cases
(DATA_FIN on a bare TCP ACK or on a packet with data payload) are
already handled.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/84
Fixes:
43b54c6ee382 ("mptcp: Use full MPTCP-level disconnect state machine")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 22 Sep 2020 22:08:41 +0000 (15:08 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"No common topic, just assorted fixes"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fuse: fix the ->direct_IO() treatment of iov_iter
fs: fix cast in fsparam_u32hex() macro
vboxsf: Fix the check for the old binary mount-arguments struct
Linus Torvalds [Tue, 22 Sep 2020 21:43:50 +0000 (14:43 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
- fix failure to add bond interfaces to a bridge, the offload-handling
code was too defensive there and recent refactoring unearthed that.
Users complained (Ido)
- fix unnecessarily reflecting ECN bits within TOS values / QoS marking
in TCP ACK and reset packets (Wei)
- fix a deadlock with bpf iterator. Hopefully we're in the clear on
this front now... (Yonghong)
- BPF fix for clobbering r2 in bpf_gen_ld_abs (Daniel)
- fix AQL on mt76 devices with FW rate control and add a couple of AQL
issues in mac80211 code (Felix)
- fix authentication issue with mwifiex (Maximilian)
- WiFi connectivity fix: revert IGTK support in ti/wlcore (Mauro)
- fix exception handling for multipath routes via same device (David
Ahern)
- revert back to a BH spin lock flavor for nsid_lock: there are paths
which do require the BH context protection (Taehee)
- fix interrupt / queue / NAPI handling in the lantiq driver (Hauke)
- fix ife module load deadlock (Cong)
- make an adjustment to netlink reply message type for code added in
this release (the sole change touching uAPI here) (Michal)
- a number of fixes for small NXP and Microchip switches (Vladimir)
[ Pull request acked by David: "you can expect more of this in the
future as I try to delegate more things to Jakub" ]
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (167 commits)
net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute
net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
net: Update MAINTAINERS for MediaTek switch driver
net/mlx5e: mlx5e_fec_in_caps() returns a boolean
net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock
net/mlx5e: kTLS, Fix leak on resync error flow
net/mlx5e: kTLS, Add missing dma_unmap in RX resync
net/mlx5e: kTLS, Fix napi sync and possible use-after-free
net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()
net/mlx5e: Fix multicast counter not up-to-date in "ip -s"
net/mlx5e: Fix endianness when calculating pedit mask first bit
net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
net/mlx5e: CT: Fix freeing ct_label mapping
net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
net/mlx5e: Use synchronize_rcu to sync with NAPI
net/mlx5e: Use RCU to protect rq->xdp_prog
...
Linus Torvalds [Tue, 22 Sep 2020 21:36:50 +0000 (14:36 -0700)]
Merge tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"A few fixes - most of them regression fixes from this cycle, but also
a few stable heading fixes, and a build fix for the included demo tool
since some systems now actually have gettid() available"
* tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block:
io_uring: fix openat/openat2 unified prep handling
io_uring: mark statx/files_update/epoll_ctl as non-SQPOLL
tools/io_uring: fix compile breakage
io_uring: don't use retry based buffered reads for non-async bdev
io_uring: don't re-setup vecs/iter in io_resumit_prep() is already there
io_uring: don't run task work on an exiting task
io_uring: drop 'ctx' ref on task work cancelation
io_uring: grab any needed state during defer prep
Linus Torvalds [Tue, 22 Sep 2020 21:31:38 +0000 (14:31 -0700)]
Merge tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few NVMe fixes, and a dasd write zero fix"
* tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block:
nvmet: get transport reference for passthru ctrl
nvme-core: get/put ctrl and transport module in nvme_dev_open/release()
nvme-tcp: fix kconfig dependency warning when !CRYPTO
nvme-pci: disable the write zeros command for Intel 600P/P3100
s390/dasd: Fix zero write for FBA devices
Linus Torvalds [Tue, 22 Sep 2020 16:08:33 +0000 (09:08 -0700)]
Merge tag 'trace-v5.9-rc5' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Check kprobe is enabled before unregistering from ftrace as it isn't
registered when disabled.
- Remove kprobes enabled via command-line that is on init text when
freed.
- Add missing RCU synchronization for ftrace trampoline symbols removed
from kallsyms.
- Free trampoline on error path if ftrace_startup() fails.
- Give more space for the longer PID numbers in trace output.
- Fix a possible double free in the histogram code.
- A couple of fixes that were discovered by sparse.
* tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
bootconfig: init: make xbc_namebuf static
kprobes: tracing/kprobes: Fix to kill kprobes on initmem after boot
tracing: fix double free
ftrace: Let ftrace_enable_sysctl take a kernel pointer buffer
tracing: Make the space reserved for the pid wider
ftrace: Fix missing synchronize_rcu() removing trampoline from kallsyms
ftrace: Free the trampoline when ftrace_startup() fails
kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()
David S. Miller [Tue, 22 Sep 2020 00:40:53 +0000 (17:40 -0700)]
Merge branch 'Fix-broken-tc-flower-rules-for-mscc_ocelot-switches'
Vladimir Oltean says:
====================
Fix broken tc-flower rules for mscc_ocelot switches
All 3 switch drivers from the Ocelot family have the same bug in the
VCAP IS2 key offsets, which is that some keys are in the incorrect
order.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 21 Sep 2020 22:56:38 +0000 (01:56 +0300)]
net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
The IS2 IP4_TCP_UDP key offsets do not correspond to the VSC7514
datasheet. Whether they work or not is unknown to me. On VSC9959 and
VSC9953, with the same mistake and same discrepancy from the
documentation, tc-flower src_port and dst_port rules did not work, so I
am assuming the same is true here.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 21 Sep 2020 22:56:37 +0000 (01:56 +0300)]
net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Since these were copied from the Felix VCAP IS2 code, and only the
offsets were adjusted, the order of the bit fields is still wrong.
Fix it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiaoliang Yang [Mon, 21 Sep 2020 22:56:36 +0000 (01:56 +0300)]
net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Some of the IS2 IP4_TCP_UDP keys are not correct, like L4_DPORT,
L4_SPORT and other L4 keys. This prevents offloaded tc-flower rules from
matching on src_port and dst_port for TCP and UDP packets.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 21 Sep 2020 14:27:20 +0000 (07:27 -0700)]
inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute
User space could send an invalid INET_DIAG_REQ_PROTOCOL attribute
as caught by syzbot.
BUG: KMSAN: uninit-value in inet_diag_lock_handler net/ipv4/inet_diag.c:55 [inline]
BUG: KMSAN: uninit-value in __inet_diag_dump+0x58c/0x720 net/ipv4/inet_diag.c:1147
CPU: 0 PID: 8505 Comm: syz-executor174 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x21c/0x280 lib/dump_stack.c:118
kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:122
__msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:219
inet_diag_lock_handler net/ipv4/inet_diag.c:55 [inline]
__inet_diag_dump+0x58c/0x720 net/ipv4/inet_diag.c:1147
inet_diag_dump_compat+0x2a5/0x380 net/ipv4/inet_diag.c:1254
netlink_dump+0xb73/0x1cb0 net/netlink/af_netlink.c:2246
__netlink_dump_start+0xcf2/0xea0 net/netlink/af_netlink.c:2354
netlink_dump_start include/linux/netlink.h:246 [inline]
inet_diag_rcv_msg_compat+0x5da/0x6c0 net/ipv4/inet_diag.c:1288
sock_diag_rcv_msg+0x24f/0x620 net/core/sock_diag.c:256
netlink_rcv_skb+0x6d7/0x7e0 net/netlink/af_netlink.c:2470
sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:275
netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
netlink_unicast+0x11c8/0x1490 net/netlink/af_netlink.c:1330
netlink_sendmsg+0x173a/0x1840 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg net/socket.c:671 [inline]
____sys_sendmsg+0xc82/0x1240 net/socket.c:2353
___sys_sendmsg net/socket.c:2407 [inline]
__sys_sendmsg+0x6d1/0x820 net/socket.c:2440
__do_sys_sendmsg net/socket.c:2449 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x441389
Code: e8 fc ab 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 1b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007fff3b02ce98 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
0000000000441389
RDX:
0000000000000000 RSI:
0000000020001500 RDI:
0000000000000003
RBP:
00000000006cb018 R08:
00000000004002c8 R09:
00000000004002c8
R10:
0000000000000004 R11:
0000000000000246 R12:
0000000000402130
R13:
00000000004021c0 R14:
0000000000000000 R15:
0000000000000000
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:143 [inline]
kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:126
kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:80
slab_alloc_node mm/slub.c:2907 [inline]
__kmalloc_node_track_caller+0x9aa/0x12f0 mm/slub.c:4511
__kmalloc_reserve net/core/skbuff.c:142 [inline]
__alloc_skb+0x35f/0xb30 net/core/skbuff.c:210
alloc_skb include/linux/skbuff.h:1094 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline]
netlink_sendmsg+0xdb9/0x1840 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg net/socket.c:671 [inline]
____sys_sendmsg+0xc82/0x1240 net/socket.c:2353
___sys_sendmsg net/socket.c:2407 [inline]
__sys_sendmsg+0x6d1/0x820 net/socket.c:2440
__do_sys_sendmsg net/socket.c:2449 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes:
3f935c75eb52 ("inet_diag: support for wider protocol numbers")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Cc: Mat Martineau <mathew.j.martineau@linux.intel.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 21 Sep 2020 22:07:09 +0000 (01:07 +0300)]
net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
When calling the RCU brother of br_vlan_get_pvid(), lockdep warns:
=============================
WARNING: suspicious RCU usage
5.9.0-rc3-01631-g13c17acb8e38-dirty #814 Not tainted
-----------------------------
net/bridge/br_private.h:1054 suspicious rcu_dereference_protected() usage!
Call trace:
lockdep_rcu_suspicious+0xd4/0xf8
__br_vlan_get_pvid+0xc0/0x100
br_vlan_get_pvid_rcu+0x78/0x108
The warning is because br_vlan_get_pvid_rcu() calls nbp_vlan_group()
which calls rtnl_dereference() instead of rcu_dereference(). In turn,
rtnl_dereference() calls rcu_dereference_protected() which assumes
operation under an RCU write-side critical section, which obviously is
not the case here. So, when the incorrect primitive is used to access
the RCU-protected VLAN group pointer, READ_ONCE() is not used, which may
cause various unexpected problems.
I'm sad to say that br_vlan_get_pvid() and br_vlan_get_pvid_rcu() cannot
share the same implementation. So fix the bug by splitting the 2
functions, and making br_vlan_get_pvid_rcu() retrieve the VLAN groups
under proper locking annotations.
Fixes:
7582f5b70f9a ("bridge: add br_vlan_get_pvid_rcu()")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 22 Sep 2020 00:32:42 +0000 (17:32 -0700)]
Merge tag 'mlx5-fixes-2020-09-18' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes-2020-09-18
This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
v1->v2:
Remove missing patch from -stable list.
For -stable v5.1
('net/mlx5: Fix FTE cleanup')
For -stable v5.3
('net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported')
('net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported')
For -stable v5.7
('net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready')
For -stable v5.8
('net/mlx5e: Use RCU to protect rq->xdp_prog')
('net/mlx5e: Fix endianness when calculating pedit mask first bit')
('net/mlx5e: Use synchronize_rcu to sync with NAPI')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sean Wang [Mon, 21 Sep 2020 23:09:23 +0000 (07:09 +0800)]
net: Update MAINTAINERS for MediaTek switch driver
Update maintainers for MediaTek switch driver with Landen Chao who is
familiar with MediaTek MT753x switch devices and will help maintenance
from the vendor side.
Cc: Steven Liu <steven.liu@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Landen Chao <Landen.Chao@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed [Fri, 11 Sep 2020 18:00:06 +0000 (11:00 -0700)]
net/mlx5e: mlx5e_fec_in_caps() returns a boolean
Returning errno is a bug, fix that.
Also fixes smatch warnings:
drivers/net/ethernet/mellanox/mlx5/core/en/port.c:453
mlx5e_fec_in_caps() warn: signedness bug returning '(-95)'
Fixes:
2132b71f78d2 ("net/mlx5e: Advertise globaly supported FEC modes")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Saeed Mahameed [Tue, 8 Sep 2020 05:54:43 +0000 (22:54 -0700)]
net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock
The spinlock only needed when accessing the channel's icosq, grab the lock
after the buf allocation in resync_post_get_progress_params() to avoid
kzalloc(GFP_KERNEL) in atomic context.
Fixes:
0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Reported-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Saeed Mahameed [Fri, 11 Sep 2020 04:16:04 +0000 (21:16 -0700)]
net/mlx5e: kTLS, Fix leak on resync error flow
Resync progress params buffer and dma weren't released on error,
Add missing error unwinding for resync_post_get_progress_params().
Fixes:
0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Saeed Mahameed [Tue, 8 Sep 2020 05:58:50 +0000 (22:58 -0700)]
net/mlx5e: kTLS, Add missing dma_unmap in RX resync
Progress params dma address is never unmapped, unmap it when completion
handling is over.
Fixes:
0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Tariq Toukan [Mon, 10 Aug 2020 12:59:41 +0000 (15:59 +0300)]
net/mlx5e: kTLS, Fix napi sync and possible use-after-free
Using synchronize_rcu() is sufficient to wait until running NAPI quits.
See similar upstream fix with detailed explanation:
("net/mlx5e: Use synchronize_rcu to sync with NAPI")
This change also fixes a possible use-after-free as the NAPI
might be already released at this stage.
Fixes:
0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Tariq Toukan [Sun, 28 Jun 2020 10:06:06 +0000 (13:06 +0300)]
net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
The set of TLS TX global SW counters in mlx5e_tls_sw_stats_desc
is updated from all rings by using atomic ops.
This set of stats is used only in the FPGA TLS use case, not in
the Connect-X TLS one, where regular per-ring counters are used.
Do not expose them in the Connect-X use case, as this would cause
counter duplication. For example, tx_tls_drop_no_sync_data would
appear twice in the ethtool stats.
Fixes:
d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Alaa Hleihel [Tue, 25 Aug 2020 07:41:50 +0000 (10:41 +0300)]
net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()
The cited commit started to reuse function mlx5e_update_ndo_stats() for
the representors as well.
However, the function is hard-coded to work on mlx5e_nic_stats_grps only.
Due to this issue, the representors statistics were not updated in the
output of "ip -s".
Fix it to work with the correct group by extracting it from the caller's
profile.
Also, while at it and since this function became generic, move it to
en_stats.c and rename it accordingly.
Fixes:
8a236b15144b ("net/mlx5e: Convert rep stats to mlx5e_stats_grp-based infra")
Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Ron Diskin [Sun, 10 May 2020 11:39:51 +0000 (14:39 +0300)]
net/mlx5e: Fix multicast counter not up-to-date in "ip -s"
Currently the FW does not generate events for counters other than error
counters. Unlike ".get_ethtool_stats", ".ndo_get_stats64" (which ip -s
uses) might run in atomic context, while the FW interface is non atomic.
Thus, 'ip' is not allowed to issue FW commands, so it will only display
cached counters in the driver.
Add a SW counter (mcast_packets) in the driver to count rx multicast
packets. The counter also counts broadcast packets, as we consider it a
special case of multicast.
Use the counter value when calling "ip -s"/"ifconfig".
Fixes:
f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality")
Signed-off-by: Ron Diskin <rondi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maor Dickman [Wed, 2 Sep 2020 13:49:52 +0000 (16:49 +0300)]
net/mlx5e: Fix endianness when calculating pedit mask first bit
The field mask value is provided in network byte order and has to
be converted to host byte order before calculating pedit mask
first bit.
Fixes:
88f30bbcbaaa ("net/mlx5e: Bit sized fields rewrite support")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maor Dickman [Wed, 5 Aug 2020 14:56:04 +0000 (17:56 +0300)]
net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
The cited commit creates peer miss group during switchdev mode
initialization in order to handle miss packets correctly while in VF
LAG mode. This is done regardless of FW support of such groups which
could cause rules setups failure later on.
Fix by adding FW capability check before creating peer groups/rule.
Fixes:
ac004b832128 ("net/mlx5e: E-Switch, Add peer miss rules")
Signed-off-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Roi Dayan [Sun, 26 Jul 2020 13:37:47 +0000 (16:37 +0300)]
net/mlx5e: CT: Fix freeing ct_label mapping
Add missing mapping remove call when removing ct rule,
as the mapping was allocated when ct rule was adding with ct_label.
Also there is a missing mapping remove call in error flow.
Fixes:
54b154ecfb8c ("net/mlx5e: CT: Map 128 bits labels to 32 bit map ID")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Jianbo Liu [Tue, 7 Jul 2020 06:16:24 +0000 (06:16 +0000)]
net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
When deleting vxlan flow rule under multipath, tun_info in parse_attr is
not freed when the rule is not ready.
Fixes:
ef06c9ee8933 ("net/mlx5e: Allow one failure when offloading tc encap rules under multipath")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maxim Mikityanskiy [Thu, 11 Jun 2020 11:25:19 +0000 (14:25 +0300)]
net/mlx5e: Use synchronize_rcu to sync with NAPI
As described in the previous commit, napi_synchronize doesn't quite fit
the purpose when we just need to wait until the currently running NAPI
quits. Its implementation waits until NAPI is not running by polling and
waiting for 1ms in between. In cases where we need to deactivate one
queue (e.g., recovery flows) or where we deactivate them one-by-one
(deactivate channel flow), we may get stuck in napi_synchronize forever
if other queues keep NAPI active, causing a soft lockup. Depending on
kernel configuration (CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC), it may result
in a kernel panic.
To fix the issue, use synchronize_rcu to wait for NAPI to quit, and wrap
the whole NAPI in rcu_read_lock.
Fixes:
acc6c5953af1 ("net/mlx5e: Split open/close channels to stages")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maxim Mikityanskiy [Thu, 11 Jun 2020 10:55:19 +0000 (13:55 +0300)]
net/mlx5e: Use RCU to protect rq->xdp_prog
Currently, the RQs are temporarily deactivated while hot-replacing the
XDP program, and napi_synchronize is used to make sure rq->xdp_prog is
not in use. However, napi_synchronize is not ideal: instead of waiting
till the end of a NAPI cycle, it polls and waits until NAPI is not
running, sleeping for 1ms between the periodic checks. Under heavy
workloads, this loop will never end, which may even lead to a kernel
panic if the kernel detects the hangup. Such workloads include XSK TX
and possibly also heavy RX (XSK or normal).
The fix is inspired by commit
326fe02d1ed6 ("net/mlx4_en: protect
ring->xdp_prog with rcu_read_lock"). As mlx5e_xdp_handle is already
protected by rcu_read_lock, and bpf_prog_put uses call_rcu to free the
program, there is no need for additional synchronization if proper RCU
functions are used to access the pointer. This patch converts all
accesses to rq->xdp_prog to use RCU functions.
Fixes:
86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Fixes:
db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maor Gottlieb [Mon, 31 Aug 2020 17:50:42 +0000 (20:50 +0300)]
net/mlx5: Fix FTE cleanup
Currently, when an FTE is allocated, its refcount is decreased to 0
with the purpose it will not be a stand alone steering object and every
rule (destination) of the FTE would increase the refcount.
When mlx5_cleanup_fs is called while not all rules were deleted by the
steering users, it hit refcount underflow on the FTE once clean_tree
calls to tree_remove_node after the deleted rules already decreased
the refcount to 0.
FTE is no longer destroyed implicitly when the last rule (destination)
is deleted. mlx5_del_flow_rules avoids it by increasing the refcount on
the FTE and destroy it explicitly after all rules were deleted. So we
can avoid the refcount underflow by making FTE as stand alone object.
In addition need to set del_hw_func to FTE so the HW object will be
destroyed when the FTE is deleted from the cleanup_tree flow.
refcount_t: underflow; use-after-free.
WARNING: CPU: 2 PID: 15715 at lib/refcount.c:28 refcount_warn_saturate+0xd9/0xe0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
tree_put_node+0xf2/0x140 [mlx5_core]
clean_tree+0x4e/0xf0 [mlx5_core]
clean_tree+0x4e/0xf0 [mlx5_core]
clean_tree+0x4e/0xf0 [mlx5_core]
clean_tree+0x5f/0xf0 [mlx5_core]
clean_tree+0x4e/0xf0 [mlx5_core]
clean_tree+0x5f/0xf0 [mlx5_core]
mlx5_cleanup_fs+0x26/0x270 [mlx5_core]
mlx5_unload+0x2e/0xa0 [mlx5_core]
mlx5_unload_one+0x51/0x120 [mlx5_core]
mlx5_devlink_reload_down+0x51/0x90 [mlx5_core]
devlink_reload+0x39/0x120
? devlink_nl_cmd_reload+0x43/0x220
genl_rcv_msg+0x1e4/0x420
? genl_family_rcv_msg_attrs_parse+0x100/0x100
netlink_rcv_skb+0x47/0x110
genl_rcv+0x24/0x40
netlink_unicast+0x217/0x2f0
netlink_sendmsg+0x30f/0x430
sock_sendmsg+0x30/0x40
__sys_sendto+0x10e/0x140
? handle_mm_fault+0xc4/0x1f0
? do_page_fault+0x33f/0x630
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x48/0x130
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes:
718ce4d601db ("net/mlx5: Consolidate update FTE for all removal changes")
Fixes:
bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
David S. Miller [Mon, 21 Sep 2020 21:54:35 +0000 (14:54 -0700)]
Merge tag 'mac80211-for-net-2020-09-21' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Just a few fixes:
* fix using HE on 2.4 GHz
* AQL (airtime queue limit) estimation & VHT160 fix
* do not oversize A-MPDUs if local capability is smaller than peer's
* fix radiotap on 6 GHz to not put 2.4 GHz flag
* fix Kconfig for lib80211
* little fixlet for 6 GHz channel number / frequency conversion
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Xu Wang [Mon, 21 Sep 2020 06:38:56 +0000 (06:38 +0000)]
ipv6: route: convert comma to semicolon
Replace a comma between expression statements by a semicolon.
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 18 Sep 2020 14:33:11 +0000 (17:33 +0300)]
sfc: Fix error code in probe
This failure path should return a negative error code but it currently
returns success.
Fixes:
51b35a454efd ("sfc: skeleton EF100 PF driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 21 Sep 2020 19:42:31 +0000 (12:42 -0700)]
Merge branch 'rcu/urgent' of git://git./linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney:
"This contains a single commit that fixes a bug that was introduced in
the last merge window. This bug causes a compiler warning complaining
about show_rcu_tasks_classic_gp_kthread() being an unused static
function in !SMP kernels.
The fix is straightforward, just adding an 'inline' to make this a
static inline function, thus avoiding the warning.
This bug was reported by Laurent Pinchart, who would like it fixed
sooner rather than later"
* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
rcu-tasks: Prevent complaints of unused show_rcu_tasks_classic_gp_kthread()
Linus Torvalds [Mon, 21 Sep 2020 15:53:48 +0000 (08:53 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- fix fault on page table writes during instruction fetch
s390:
- doc improvement
x86:
- The obvious patches are always the ones that turn out to be
completely broken. /me hangs his head in shame"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
Revert "KVM: Check the allocation of pv cpu mask"
KVM: arm64: Remove S1PTW check from kvm_vcpu_dabt_iswrite()
KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch
docs: kvm: add documentation for KVM_CAP_S390_DIAG318
Linus Torvalds [Mon, 21 Sep 2020 15:46:20 +0000 (08:46 -0700)]
Merge tag 'libnvdimm-fixes-5.9-rc7' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fix from Dan Williams:
"Fix compilation for the new dax_supported() exported helper"
* tag 'libnvdimm-fixes-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAX
Jan Kara [Mon, 21 Sep 2020 09:33:23 +0000 (11:33 +0200)]
dax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAX
dax_supported() is defined whenever CONFIG_DAX is enabled. So dummy
implementation should be defined only in !CONFIG_DAX case, not in
!CONFIG_FS_DAX case.
Fixes:
e2ec51282545 ("dm: Call proper helper to determine dax support")
Cc: <stable@vger.kernel.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jens Axboe [Sat, 19 Sep 2020 01:36:24 +0000 (19:36 -0600)]
io_uring: fix openat/openat2 unified prep handling
A previous commit unified how we handle prep for these two functions,
but this means that we check the allowed context (SQPOLL, specifically)
later than we should. Move the ring type checking into the two parent
functions, instead of doing it after we've done some setup work.
Fixes:
ec65fea5a8d7 ("io_uring: deduplicate io_openat{,2}_prep()")
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 18 Sep 2020 22:51:19 +0000 (16:51 -0600)]
io_uring: mark statx/files_update/epoll_ctl as non-SQPOLL
These will naturally fail when attempted through SQPOLL, but either
with -EFAULT or -EBADF. Make it explicit that these are not workable
through SQPOLL and return -EINVAL, just like other ops that need to
use ->files.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Douglas Gilbert [Mon, 14 Sep 2020 21:36:09 +0000 (17:36 -0400)]
tools/io_uring: fix compile breakage
It would seem none of the kernel continuous integration does this:
$ cd tools/io_uring
$ make
Otherwise it may have noticed:
cc -Wall -Wextra -g -D_GNU_SOURCE -c -o io_uring-bench.o
io_uring-bench.c
io_uring-bench.c:133:12: error: static declaration of ‘gettid’
follows non-static declaration
133 | static int gettid(void)
| ^~~~~~
In file included from /usr/include/unistd.h:1170,
from io_uring-bench.c:27:
/usr/include/x86_64-linux-gnu/bits/unistd_ext.h:34:16: note:
previous declaration of ‘gettid’ was here
34 | extern __pid_t gettid (void) __THROW;
| ^~~~~~
make: *** [<builtin>: io_uring-bench.o] Error 1
The problem on Ubuntu 20.04 (with lk 5.9.0-rc5) is that unistd.h
already defines gettid(). So prefix the local definition with
"lk_".
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 14 Sep 2020 15:30:38 +0000 (09:30 -0600)]
io_uring: don't use retry based buffered reads for non-async bdev
Some block devices, like dm, bubble back -EAGAIN through the completion
handler. We check for this in io_read(), but don't honor it for when
we have copied the iov. Return -EAGAIN for this case before retrying,
to force punt to io-wq.
Fixes:
bcf5a06304d6 ("io_uring: support true async buffered reads, if file provides it")
Reported-by: Zorro Lang <zlang@redhat.com>
Tested-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 14 Sep 2020 15:28:14 +0000 (09:28 -0600)]
io_uring: don't re-setup vecs/iter in io_resumit_prep() is already there
If we already have mapped the necessary data for retry, then don't set
it up again. It's a pointless operation, and we leak the iovec if it's
a large (non-stack) vec.
Fixes:
b63534c41e20 ("io_uring: re-issue block requests that failed because of resources")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
David S. Miller [Mon, 21 Sep 2020 02:04:45 +0000 (19:04 -0700)]
Merge branch 'bnxt_en-Bug-fixes'
Michael Chan says:
====================
bnxt_en: Bug fixes.
A series of small driver fixes covering VPD length logic,
ethtool_get_regs on VF, hwmon temperature error handling,
mutex locking for EEE and pause ethtool settings, and
parameters for statistics related firmware calls.
Please queue patches 1, 2, and 3 for -stable. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 21 Sep 2020 01:08:59 +0000 (21:08 -0400)]
bnxt_en: Fix wrong flag value passed to HWRM_PORT_QSTATS_EXT fw call.
The wrong flag value caused the firmware call to return actual port
counters instead of the counter masks. This messed up the counter
overflow logic and caused erratic extended port counters to be
displayed under ethtool -S.
Fixes:
531d1d269c1d ("bnxt_en: Retrieve hardware masks for port counters.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 21 Sep 2020 01:08:58 +0000 (21:08 -0400)]
bnxt_en: Fix HWRM_FUNC_QSTATS_EXT firmware call.
Fix it to set the required fid input parameter. The firmware call
fails without this patch.
Fixes:
d752d0536c97 ("bnxt_en: Retrieve hardware counter masks from firmware if available.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Mon, 21 Sep 2020 01:08:57 +0000 (21:08 -0400)]
bnxt_en: Return -EOPNOTSUPP for ETHTOOL_GREGS on VFs.
Debug firmware commands are not supported on VFs to read registers.
This patch avoids logging unnecessary access_denied error on VFs
when user calls ETHTOOL_GREGS.
By returning error in get_regs_len() method on the VF, the get_regs()
method will not be called.
Fixes:
b5d600b027eb ("bnxt_en: Add support for 'ethtool -d'")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 21 Sep 2020 01:08:56 +0000 (21:08 -0400)]
bnxt_en: Protect bnxt_set_eee() and bnxt_set_pauseparam() with mutex.
All changes related to bp->link_info require the protection of the
link_lock mutex. It's not sufficient to rely just on RTNL.
Fixes:
163e9ef63641 ("bnxt_en: Fix race when modifying pause settings.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edwin Peer [Mon, 21 Sep 2020 01:08:55 +0000 (21:08 -0400)]
bnxt_en: return proper error codes in bnxt_show_temp
Returning "unknown" as a temperature value violates the hwmon interface
rules. Appropriate error codes should be returned via device_attribute
show instead. These will ultimately be propagated to the user via the
file system interface.
In addition to the corrected error handling, it is an even better idea to
not present the sensor in sysfs at all if it is known that the read will
definitely fail. Given that temp1_input is currently the only sensor
reported, ensure no hwmon registration if TEMP_MONITOR_QUERY is not
supported or if it will fail due to access permissions. Something smarter
may be needed if and when other sensors are added.
Fixes:
12cce90b934b ("bnxt_en: fix HWRM error when querying VF temperature")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Mon, 21 Sep 2020 01:08:54 +0000 (21:08 -0400)]
bnxt_en: Use memcpy to copy VPD field info.
Using strlcpy() to copy from VPD is not correct because VPD strings
are not necessarily NULL terminated. Use memcpy() to copy the VPD
length up to the destination buffer size - 1. The destination is
zeroed memory so it will always be NULL terminated.
Fixes:
a0d0fd70fed5 ("bnxt_en: Read partno and serialno of the board from VPD")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 20 Sep 2020 23:33:55 +0000 (16:33 -0700)]
Linux 5.9-rc6
Linus Torvalds [Sun, 20 Sep 2020 22:37:15 +0000 (15:37 -0700)]
Merge tag 'core_urgent_for_v5.9_rc6' of git://git./linux/kernel/git/tip/tip
Pull syscall tracing fix from Borislav Petkov:
"Fix the seccomp syscall rewriting so that trace and audit see the
rewritten syscall number, from Kees Cook"
* tag 'core_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
core/entry: Report syscall correctly for trace and audit
Linus Torvalds [Sun, 20 Sep 2020 22:31:04 +0000 (15:31 -0700)]
Merge tag 'objtool_urgent_for_v5.9_rc6' of git://git./linux/kernel/git/tip/tip
Pull objtool fix from Borislav Petkov:
"Fix noreturn detection for ignored sibling functions (Josh Poimboeuf)"
* tag 'objtool_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix noreturn detection for ignored functions
Linus Torvalds [Sun, 20 Sep 2020 22:25:33 +0000 (15:25 -0700)]
Merge tag 'locking_urgent_for_v5.9_rc6' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Borislav Petkov:
"Two fixes from the locking/urgent pile:
- Fix lockdep's detection of "USED" <- "IN-NMI" inversions (Peter
Zijlstra)
- Make percpu-rwsem operations on the semaphore's ->read_count
IRQ-safe because it can be used in an IRQ context (Hou Tao)"
* tag 'locking_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/percpu-rwsem: Use this_cpu_{inc,dec}() for read_count
locking/lockdep: Fix "USED" <- "IN-NMI" inversions
Linus Torvalds [Sun, 20 Sep 2020 22:18:11 +0000 (15:18 -0700)]
Merge tag 'efi-urgent-for-v5.9-rc5' of git://git./linux/kernel/git/tip/tip
Pull EFI fix from Borislav Petkov:
"Ensure that the EFI bootloader control module only probes successfully
on systems that support the EFI SetVariable runtime service"
[ Tag and commit from Ard Biesheuvel, forwarded by Borislav ]
* tag 'efi-urgent-for-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: efibc: check for efivars write capability
Linus Torvalds [Sun, 20 Sep 2020 22:06:43 +0000 (15:06 -0700)]
Merge tag 'x86_urgent_for_v5.9_rc6' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- A defconfig fix (Daniel Díaz)
- Disable relocation relaxation for the compressed kernel when not
built as -pie as in that case kernels built with clang and linked
with LLD fail to boot due to the linker optimizing some instructions
in non-PIE form; the gory details in the commit message (Arvind
Sankar)
- A fix for the "bad bp value" warning issued by the frame-pointer
unwinder (Josh Poimboeuf)
* tag 'x86_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/unwind/fp: Fix FP unwinding in ret_from_fork
x86/boot/compressed: Disable relocation relaxation
x86/defconfigs: Explicitly unset CONFIG_64BIT in i386_defconfig
Linus Torvalds [Sun, 20 Sep 2020 22:01:57 +0000 (15:01 -0700)]
Merge tag 'libnvdimm-fixes-5.9-rc6' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"A handful of fixes to address a string of mistakes in the mechanism
for device-mapper to determine if its component devices are dax
capable.
- Fix an original bug in device-mapper table reference counting when
interrogating dax capability in the component device. This bug was
hidden by the following bug.
- Fix device-mapper to use the proper helper (dax_supported() instead
of the leaf helper generic_fsdax_supported()) to determine dax
operation of a stacked block device configuration. The original
implementation is only valid for one level of dax-capable block
device stacking. This bug was discovered while fixing the below
regression.
- Fix an infinite recursion regression introduced by broken attempts
to quiet the generic_fsdax_supported() path and make it bail out
before logging "dax capability not found" errors"
* tag 'libnvdimm-fixes-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: Fix stack overflow when mounting fsdax pmem device
dm: Call proper helper to determine dax support
dm/dax: Fix table reference counts
Paolo Bonzini [Sun, 20 Sep 2020 21:31:15 +0000 (17:31 -0400)]
Merge tag 'kvm-s390-master-5.9-1' of git://git./linux/kernel/git/kvms390/linux into kvm-master
KVM: s390: add documentation for KVM_CAP_S390_DIAG318
diag318 code was merged in 5.9-rc1, let us add some
missing documentation