David S. Miller [Sun, 12 Jul 2020 22:22:14 +0000 (15:22 -0700)]
Merge branch 'Fix-MTU-warnings-for-fec-mv886xxx-combo'
Andrew Lunn says:
====================
Fix MTU warnings for fec/mv88e6xxx combo
Since changing the MTU of dsa slave interfaces was implemented, the
fec/mv88e6xxx combo has been giving warnings:
[ 2.275925] mv88e6085 0.2:00: nonfatal error -95 setting MTU on port 9
[ 2.284306] eth1: mtu greater than device maximum
[ 2.287759] fec 400d1000.ethernet eth1: error -22 setting MTU to include DSA overhead
This patchset adds support for changing the MTU on mv88e6xxx switches,
which do support jumbo frames. And it modifies the FEC driver to
support its true MTU range, which is larger than the default Ethernet
MTU.
====================
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 11 Jul 2020 20:32:06 +0000 (22:32 +0200)]
net: fec: Set max MTU size to allow the MTU to be changed
The FEC allocates 2K buffers, but loses some of it due to
alignment. It can however support an MTU bigger than the default. This
is particularly interesting when used in combination with Ethernet
switches supporting DSA, which have extra headers. The DSA core will
try to increase the MTU to support these extra headers. If the max
size defaults to that of standard Ethernet we get a warning. By
setting the max to what the driver actually supports, we avoid this
warning.
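The arithmetic behind this can be sketched as follows. Only the 2K buffer
size comes from the text above; the 64-byte alignment reserve, the rounding
rule, the 8-byte tag size and the helper name max_mtu() are assumptions made
for illustration and are not taken from the driver source.

```python
ETH_HLEN = 14        # Ethernet header: dst MAC + src MAC + EtherType
ETH_FCS_LEN = 4      # frame check sequence (CRC)
ETH_DATA_LEN = 1500  # default Ethernet MTU


def max_mtu(buf_size, align):
    """Largest MTU that fits in one receive buffer.

    Assumption: up to `align` bytes are lost to alignment and the usable
    area is rounded down to a multiple of `align`.
    """
    usable = ((buf_size - align) // align) * align
    return usable - ETH_HLEN - ETH_FCS_LEN


# Hypothetical figures: 2K buffer, 64-byte alignment, 8-byte DSA tag.
fec_max = max_mtu(2048, 64)
dsa_overhead = 8
fits = ETH_DATA_LEN + dsa_overhead <= fec_max
```

With these assumed numbers the maximum comes out well above 1500, so a
request to grow the MTU by a small tag header would no longer be rejected.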
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 11 Jul 2020 20:32:05 +0000 (22:32 +0200)]
net: dsa: mv88e6xxx: Implement MTU change
The Marvell switches support jumbo frames, so implement the
callbacks needed for changing the MTU.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Sat, 11 Jul 2020 15:05:04 +0000 (18:05 +0300)]
net: bridge: notify on vlan tunnel changes done via the old api
If someone uses the old vlan API to configure tunnel mappings we'll only
generate the old-style full port notification. That would be a problem
if we are monitoring the new vlan notifications for changes. The patch
resolves the issue by adding vlan notifications to the old tunnel netlink
code. As usual we try to compress the notifications for as many vlans
in a range as possible, thus a vlan tunnel change is considered able
to enter the "current" vlan notification range if:
1. vlan exists
2. it has actually changed (curr_change == true)
3. it passes all standard vlan notification range checks done by
br_vlan_can_enter_range() such as option equality, id continuity etc
Note that vlan tunnel changes (add/del) are considered a part of vlan
options so only RTM_NEWVLAN notification is generated with the relevant
information inside.
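The three range conditions above can be sketched as a small grouping
routine. This is a simplified model, not the bridge code: the Vlan tuple,
its opaque opts field and notification_ranges() are hypothetical names, and
can_enter_range() only stands in for the checks that
br_vlan_can_enter_range() performs (id continuity and option equality).

```python
from collections import namedtuple

# opts is an opaque stand-in for the vlan's options (state, tunnel info, ...).
Vlan = namedtuple("Vlan", ["vid", "opts", "changed"])


def can_enter_range(cur, range_end):
    """Simplified stand-in for br_vlan_can_enter_range()."""
    return cur.vid == range_end.vid + 1 and cur.opts == range_end.opts


def notification_ranges(vlans):
    """Return (first_vid, last_vid) ranges; None marks a missing vlan."""
    ranges = []
    start = end = None
    for v in vlans:
        eligible = v is not None and v.changed   # conditions 1 and 2
        if eligible and end is not None and can_enter_range(v, end):
            end = v                               # condition 3: extend range
            continue
        if start is not None:
            ranges.append((start.vid, end.vid))   # flush the current range
        start = end = v if eligible else None
    if start is not None:
        ranges.append((start.vid, end.vid))
    return ranges
```

Each returned pair would correspond to one compressed notification; an
unchanged or missing vlan, a gap in ids, or differing options closes the
current range.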
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 11 Jul 2020 07:46:00 +0000 (00:46 -0700)]
Merge git://git./linux/kernel/git/netdev/net
All conflicts seemed rather trivial, with some guidance from
Saeed Mahameed on the tc_ct.c one.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 11 Jul 2020 04:23:10 +0000 (21:23 -0700)]
Merge tag 'libnvdimm-fix-v5.8-rc5' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fix from Dan Williams:
"A one-line fix for key ring search permissions to address a regression
from -rc1"
* tag 'libnvdimm-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm/security: Fix key lookup permissions
Linus Torvalds [Sat, 11 Jul 2020 04:16:48 +0000 (21:16 -0700)]
Merge tag '5.8-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Four cifs/smb3 fixes: the three for stable fix problems found recently
with change notification including a reference count leak"
* tag '5.8-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module version number
cifs: fix reference leak for tlink
smb3: fix unneeded error message on change notify
cifs: remove the retry in cifs_poxis_lock_set
smb3: fix access denied on change notify request to some servers
Linus Torvalds [Sat, 11 Jul 2020 04:15:25 +0000 (21:15 -0700)]
Merge tag 'inclusive-terminology' of git://git./linux/kernel/git/djbw/linux
Pull coding style terminology documentation from Dan Williams:
"The discussion has tapered off as well as the incoming ack, review,
and sign-off tags. I did not see a reason to wait for the next merge
window"
* tag 'inclusive-terminology' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux:
CodingStyle: Inclusive Terminology
Linus Torvalds [Sat, 11 Jul 2020 01:16:22 +0000 (18:16 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
1) Restore previous behavior of CAP_SYS_ADMIN wrt loading networking
BPF programs, from Maciej Żenczykowski.
2) Fix dropped broadcasts in mac80211 code, from Seevalamuthu
Mariappan.
3) Slay memory leak in nl80211 bss color attribute parsing code, from
Luca Coelho.
4) Get route from skb properly in ip_route_use_hint(), from Miaohe Lin.
5) Don't allow anything other than ARPHRD_ETHER in llc code, from Eric
Dumazet.
6) xsk code dips too deeply into DMA mapping implementation internals.
Add dma_need_sync and use it. From Christoph Hellwig.
7) Enforce power-of-2 for BPF ringbuf sizes. From Andrii Nakryiko.
8) Check for disallowed attributes when loading flow dissector BPF
programs. From Lorenz Bauer.
9) Correct packet injection to L3 tunnel devices via AF_PACKET, from
Jason A. Donenfeld.
10) Don't advertise checksum offload on ipa devices that don't support
it. From Alex Elder.
11) Resolve several issues in TCP MD5 signature support. Missing memory
barriers, bogus options emitted when using syncookies, and failure
to allow md5 key changes in established states. All from Eric
Dumazet.
12) Fix interface leak in hsr code, from Taehee Yoo.
13) VF reset fixes in hns3 driver, from Huazhong Tan.
14) Make loopback work again with ipv6 anycast, from David Ahern.
15) Fix TX starvation under high load in fec driver, from Tobias
Waldekranz.
16) MLD2 payload lengths not checked properly in bridge multicast code,
from Linus Lüssing.
17) Packet scheduler code that wants to find the inner protocol
currently only works for one level of VLAN encapsulation. Allow
Q-in-Q situations to work properly here, from Toke
Høiland-Jørgensen.
18) Fix route leak in l2tp, from Xin Long.
19) Resolve conflict between the sk->sk_user_data usage of bpf reuseport
support and various protocols. From Martin KaFai Lau.
20) Fix socket cgroup v2 reference counting in some situations, from
Cong Wang.
21) Cure memory leak in mlx5 connection tracking offload support, from
Eli Britstein.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (146 commits)
mlxsw: pci: Fix use-after-free in case of failed devlink reload
mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON()
net: macb: fix call to pm_runtime in the suspend/resume functions
net: macb: fix macb_suspend() by removing call to netif_carrier_off()
net: macb: fix macb_get/set_wol() when moving to phylink
net: macb: mark device wake capable when "magic-packet" property present
net: macb: fix wakeup test in runtime suspend/resume routines
bnxt_en: fix NULL dereference in case SR-IOV configuration fails
libbpf: Fix libbpf hashmap on (I)LP32 architectures
net/mlx5e: CT: Fix memory leak in cleanup
net/mlx5e: Fix port buffers cell size value
net/mlx5e: Fix 50G per lane indication
net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash
net/mlx5e: Fix VXLAN configuration restore after function reload
net/mlx5e: Fix usage of rcu-protected pointer
net/mxl5e: Verify that rpriv is not NULL
net/mlx5: E-Switch, Fix vlan or qos setting in legacy mode
net/mlx5: Fix eeprom support for SFP module
cgroup: Fix sock_cgroup_data on big-endian.
selftests: bpf: Fix detach from sockmap tests
...
Nathan Chancellor [Fri, 10 Jul 2020 22:34:41 +0000 (15:34 -0700)]
mips: Remove compiler check in unroll macro
CONFIG_CC_IS_GCC is undefined when Clang is used, which breaks the build
(see our Travis link below).
Clang 8 was chosen as a minimum version for this check because there
were some improvements around __builtin_constant_p in that release. In
reality, MIPS was not even buildable until clang 9 so that check was not
technically necessary. Just remove all compiler checks and just assume
that we have a working compiler.
Fixes: d4e60453266b ("Restore gcc check in mips asm/unroll.h")
Link: https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/jobs/359642821
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kuniyuki Iwashima [Fri, 10 Jul 2020 15:57:59 +0000 (00:57 +0900)]
inet: Remove an unnecessary argument of syn_ack_recalc().
Commit 0c3d79bce48034018e840468ac5a642894a521a3 ("tcp: reduce SYN-ACK
retrans for TCP_DEFER_ACCEPT") introduces syn_ack_recalc() which decides
if a minisock is held and a SYN+ACK is retransmitted or not.
If rskq_defer_accept is not zero in syn_ack_recalc(), max_retries always
has the same value because max_retries is overwritten by rskq_defer_accept
in reqsk_timer_handler().
This commit adds three changes:
- remove redundant non-zero check for rskq_defer_accept in
reqsk_timer_handler().
- remove max_retries from the arguments of syn_ack_recalc() and use
rskq_defer_accept instead.
- rename thresh to max_syn_ack_retries for readability.
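A rough model of the resulting decision logic, written in Python for
brevity. The parameter names follow the text above, but the exact
conditions are a reading of the described behaviour and should be treated
as an assumption, not as the kernel source. num_timeout and acked are
hypothetical stand-ins for the per-request state.

```python
def syn_ack_recalc(num_timeout, acked, max_syn_ack_retries, rskq_defer_accept):
    """Return (expire, resend) for a pending request.

    num_timeout        -- how many times the request timer has fired
    acked              -- whether the final ACK of the handshake was seen
    rskq_defer_accept  -- 0 when TCP_DEFER_ACCEPT is not in use
    """
    if not rskq_defer_accept:
        # No deferral: expire after the retry limit, always retransmit.
        return num_timeout >= max_syn_ack_retries, True

    # Deferral in use: keep an ACKed request until the defer period ends.
    expire = (num_timeout >= max_syn_ack_retries and
              (not acked or num_timeout >= rskq_defer_accept))
    # Stay quiet while waiting for data after the ACK, and retransmit
    # again near the end of the defer period.
    resend = (not acked) or num_timeout >= rskq_defer_accept - 1
    return expire, resend
```

In this model rskq_defer_accept is the only deferral input, which is why a
separate max_retries argument is no longer needed.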
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
CC: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Jul 2020 21:33:34 +0000 (14:33 -0700)]
Merge branch 'mlxsw-Various-fixes'
Ido Schimmel says:
====================
mlxsw: Various fixes
Fix two issues found by syzkaller.
Patch #1 removes inappropriate usage of WARN_ON() following memory
allocation failure. Constantly triggered when syzkaller injects faults.
Patch #2 fixes a use-after-free that can be triggered by 'devlink dev
info' following a failed devlink reload.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 10 Jul 2020 13:41:39 +0000 (16:41 +0300)]
mlxsw: pci: Fix use-after-free in case of failed devlink reload
In case devlink reload failed, it is possible to trigger a
use-after-free when querying the kernel for device info via 'devlink dev
info' [1].
This happens because as part of the reload error path the PCI command
interface is de-initialized and its mailboxes are freed. When the
devlink '->info_get()' callback is invoked the device is queried via the
command interface and the freed mailboxes are accessed.
Fix this by initializing the command interface once during probe and not
during every reload.
This is consistent with the other bus used by mlxsw (i.e., 'mlxsw_i2c')
and also allows user space to query the running firmware version (for
example) from the device after a failed reload.
[1]
BUG: KASAN: use-after-free in memcpy include/linux/string.h:406 [inline]
BUG: KASAN: use-after-free in mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
Write of size 4096 at addr ffff88810ae32000 by task syz-executor.1/2355
CPU: 1 PID: 2355 Comm: syz-executor.1 Not tainted 5.8.0-rc2+ #29
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xf6/0x16e lib/dump_stack.c:118
print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
check_memory_region_inline mm/kasan/generic.c:186 [inline]
check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
memcpy+0x39/0x60 mm/kasan/common.c:106
memcpy include/linux/string.h:406 [inline]
mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
mlxsw_cmd_exec+0x249/0x550 drivers/net/ethernet/mellanox/mlxsw/core.c:2335
mlxsw_cmd_access_reg drivers/net/ethernet/mellanox/mlxsw/cmd.h:859 [inline]
mlxsw_core_reg_access_cmd drivers/net/ethernet/mellanox/mlxsw/core.c:1938 [inline]
mlxsw_core_reg_access+0x2f6/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1985
mlxsw_reg_query drivers/net/ethernet/mellanox/mlxsw/core.c:2000 [inline]
mlxsw_devlink_info_get+0x17f/0x6e0 drivers/net/ethernet/mellanox/mlxsw/core.c:1090
devlink_nl_info_fill.constprop.0+0x13c/0x2d0 net/core/devlink.c:4588
devlink_nl_cmd_info_get_dumpit+0x246/0x460 net/core/devlink.c:4648
genl_lock_dumpit+0x85/0xc0 net/netlink/genetlink.c:575
netlink_dump+0x515/0xe50 net/netlink/af_netlink.c:2245
__netlink_dump_start+0x53d/0x830 net/netlink/af_netlink.c:2353
genl_family_rcv_msg_dumpit.isra.0+0x296/0x300 net/netlink/genetlink.c:638
genl_family_rcv_msg net/netlink/genetlink.c:733 [inline]
genl_rcv_msg+0x78d/0x9d0 net/netlink/genetlink.c:753
netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469
genl_rcv+0x24/0x40 net/netlink/genetlink.c:764
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0x150/0x190 net/socket.c:672
____sys_sendmsg+0x6d8/0x840 net/socket.c:2363
___sys_sendmsg+0xff/0x170 net/socket.c:2417
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: a9c8336f6544 ("mlxsw: core: Add support for devlink info command")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 10 Jul 2020 13:41:38 +0000 (16:41 +0300)]
mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON()
We should not trigger a warning when a memory allocation fails. Remove
the WARN_ON().
The warning is constantly triggered by syzkaller when it is injecting
faults:
[ 2230.758664] FAULT_INJECTION: forcing a failure.
[ 2230.758664] name failslab, interval 1, probability 0, space 0, times 0
[ 2230.762329] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
...
[ 2230.898175] WARNING: CPU: 3 PID: 1407 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6265 mlxsw_sp_router_fib_event+0xfad/0x13e0
[ 2230.898179] Kernel panic - not syncing: panic_on_warn set ...
[ 2230.898183] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
[ 2230.898190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Fixes: 3057224e014c ("mlxsw: spectrum_router: Implement FIB offload in deferred work")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Jul 2020 21:32:02 +0000 (14:32 -0700)]
Merge branch 'devlink-health'
Moshe Shemesh says:
====================
Add devlink-health support for devlink ports
Implement support for devlink health reporters on per-port basis.
This patchset fixes a design issue: some health reporters report errors
and run recovery at device level while the actual functionality is at
port level. Among the currently implemented devlink health reporters this
is relevant only to the Tx and Rx reporters of mlx5, which has only one
port, so there is no real effect on functionality, but it should be fixed
before more drivers use devlink health reporters.
The first part of the series prepares common functions for the health
reporter implementation. The second introduces the required API to
devlink-health, and the mlx5e patches demonstrate its usage and implement
the feature for the mlx5 driver.
The per-port reporter functionality is achieved by adding a list of
devlink_health_reporters to the devlink_port struct in a manner similar to
the existing device infrastructure. This is the only major difference and
it makes it possible to fully reuse the device reporter operations.
The effect will be seen in conjunction with iproute2 additions and
will affect all devlink health commands. Users can distinguish between
device and port reporters by looking at the devlink handle. Port reporters
have a port index at the end of the address, and such addresses can be
provided as a parameter wherever devlink-health accepts a handle.
They can be obtained from the devlink port show command.
For example:
$ devlink health show
pci/0000:00:0a.0:
reporter fw
state healthy error 0 recover 0 auto_dump true
pci/0000:00:0a.0/1:
reporter tx
state healthy error 0 recover 0 grace_period 500 auto_recover true auto_dump true
$ devlink health set pci/0000:00:0a.0/1 reporter tx grace_period 1000 \
auto_recover false auto_dump false
$ devlink health show pci/0000:00:0a.0/1 reporter tx
pci/0000:00:0a.0/1:
reporter tx
state healthy error 0 recover 0 grace_period 1000 auto_recover false auto_dump false
Note: The same devlink health uAPI commands can now address either a
port health reporter or a device health reporter.
For example, the recover command:
Before this patchset: devlink health recover DEV reporter REPORTER_NAME
After this patchset: devlink health recover { DEV | DEV/PORT_INDEX } reporter REPORTER_NAME
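To illustrate the DEV versus DEV/PORT_INDEX distinction, here is a minimal
parsing sketch. It is not the iproute2 parser; parse_health_handle() is a
hypothetical helper that only assumes the handle shapes shown in the
examples above (BUS/DEVICE for a device, BUS/DEVICE/PORT_INDEX for a port).

```python
def parse_health_handle(handle):
    """Split 'BUS/DEVICE[/PORT_INDEX]' into (bus, device, port_index).

    port_index is None for a device-level handle.
    """
    parts = handle.split("/")
    if len(parts) == 2 and all(parts):
        return parts[0], parts[1], None
    if len(parts) == 3 and all(parts) and parts[2].isdigit():
        return parts[0], parts[1], int(parts[2])
    raise ValueError("unrecognized devlink handle: %r" % handle)


def is_port_reporter_handle(handle):
    """True when the handle carries a trailing port index."""
    return parse_health_handle(handle)[2] is not None
```

Under this reading, pci/0000:00:0a.0 selects the device-level fw reporter
and pci/0000:00:0a.0/1 selects the tx reporter of port 1.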
Changes v1 -> v2:
Fixed functions comment to match parameters list.
Changes v2 -> v3:
Added motivation to cover letter and note on uAPI.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladyslav Tarasiuk [Fri, 10 Jul 2020 12:25:13 +0000 (15:25 +0300)]
net/mlx5e: Move devlink-health rx and tx reporters to devlink port
Utilize new devlink-health port reporters API to move rx and tx
reporters from device to port.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladyslav Tarasiuk [Fri, 10 Jul 2020 12:25:12 +0000 (15:25 +0300)]
net/mlx5e: Move devlink port register and unregister calls
Register devlink ports upon NIC init. TX and RX health reporters handle
errors which may occur early on at driver initialization. And because
these reporters are to be moved to port context, they require devlink
ports to be already registered.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladyslav Tarasiuk [Fri, 10 Jul 2020 12:25:11 +0000 (15:25 +0300)]
devlink: Add devlink health port reporters API
In order to use new devlink port health reporters infrastructure, add
corresponding constructor and destructor functions.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladyslav Tarasiuk [Fri, 10 Jul 2020 12:25:10 +0000 (15:25 +0300)]
devlink: Implement devlink health reporters on per-port basis
Add devlink-health reporter support on per-port basis.
The main difference from existing devlink-health is that port reporters are
stored in per-devlink_port lists. Upon creation of such a health reporter, a
reference to the port it belongs to is stored in the reporter struct.
Fill the port index attribute in devlink-health response to
allow devlink userspace utility to distinguish between device and port
reporters.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladyslav Tarasiuk [Fri, 10 Jul 2020 12:25:09 +0000 (15:25 +0300)]
devlink: Create generic devlink health reporter search function
Add a generic __devlink_health_reporter_find_by_name() that can be used
with arbitrary devlink health reporter list.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladyslav Tarasiuk [Fri, 10 Jul 2020 12:25:08 +0000 (15:25 +0300)]
devlink: Rework devlink health reporter destructor
Devlink keeps its own reference to every reporter in a list and inits
refcount to 1 upon reporter's creation. Existing destructor waits to
free the memory indefinitely using msleep() until all references except
devlink's own are put.
Rework this mechanism by moving memory free routine to a separate
function, which is called when the last reporter reference is put.
In addition, this allows calling __devlink_health_reporter_destroy() while
holding the reporters list mutex, in symmetry with
__devlink_health_reporter_create(), which is required by a follow-up
patch.
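The scheme described above, freeing from whichever put drops the last
reference instead of sleeping until the others go away, can be modelled as
below. This is a generic reference-counting sketch in Python, not the
devlink code; the Reporter class and its on_free callback are hypothetical.

```python
import threading


class Reporter:
    def __init__(self, name, on_free):
        self.name = name
        self._refcount = 1          # reference owned by the reporters list
        self._lock = threading.Lock()
        self._on_free = on_free     # stands in for the memory free routine

    def get(self):
        """Take an additional reference."""
        with self._lock:
            self._refcount += 1

    def put(self):
        """Drop a reference; the last put runs the free routine."""
        with self._lock:
            self._refcount -= 1
            last = self._refcount == 0
        if last:
            self._on_free(self)
```

Destroying a reporter then amounts to removing it from the list and
dropping the list's own reference; no caller has to wait or poll.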
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladyslav Tarasiuk [Fri, 10 Jul 2020 12:25:07 +0000 (15:25 +0300)]
devlink: Refactor devlink health reporter constructor
Prepare a common routine in devlink_health_reporter_create() for usage
in similar functions for devlink port health reporters.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Jul 2020 21:29:38 +0000 (14:29 -0700)]
Merge branch 'macb-WOL-fixes'
Nicolas Ferre says:
====================
net: macb: Wake-on-Lan magic packet fixes and GEM handling
Here is a split series to fix WoL magic-packet on the current macb driver.
This one contains only fixes, based on current net/master.
Changes in v5:
- Addressed the error code returned by phylink_ethtool_set_wol() as suggested
by Russell.
If the PHY handles WoL, the MAC doesn't get in the way.
- Removed Florian's tag on 3/5 because of the above changes.
- Correct the "Fixes" tag on 1/5.
Changes in v4:
- Pure bug fix series for 'net'. GEM addition and MACB update removed: will be
sent later.
Changes in v3:
- Revert some of the v2 changes done in macb_resume(). Now the resume function
supports in-depth re-configuration of the controller in order to deal with
deeper sleep states. Basically as it was before changes introduced by this
series
- Tested for non-regression with our deeper Power Management mode which cuts
power to the controller completely
Changes in v2:
- Add patch 4/7 ("net: macb: fix macb_suspend() by removing call to netif_carrier_off()")
needed for keeping phy state consistent
- Add patch 5/7 ("net: macb: fix call to pm_runtime in the suspend/resume functions") that prevents
putting the macb in runtime pm suspend mode when WoL is used
- Collect review tags on 3 first patches from Florian: Thanks!
- Review of macb_resume() function
- Addition of pm_wakeup_event() in both MACB and GEM WoL IRQ handlers
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Ferre [Fri, 10 Jul 2020 12:46:45 +0000 (14:46 +0200)]
net: macb: fix call to pm_runtime in the suspend/resume functions
The calls to pm_runtime_force_suspend/resume() functions are only
relevant if the device is not configured to act as a WoL wakeup source.
Add the device_may_wakeup() test before calling them.
Fixes: 3e2a5e153906 ("net: macb: add wake-on-lan support via magic packet")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Sergio Prado <sergio.prado@e-labworks.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Ferre [Fri, 10 Jul 2020 12:46:44 +0000 (14:46 +0200)]
net: macb: fix macb_suspend() by removing call to netif_carrier_off()
As we now use the phylink call to phylink_stop() in the non-WoL path,
there is no need for this call to netif_carrier_off() anymore. It can
disturb the underlying phylink FSM.
Fixes: 7897b071ac3b ("net: macb: convert to phylink")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Ferre [Fri, 10 Jul 2020 12:46:43 +0000 (14:46 +0200)]
net: macb: fix macb_get/set_wol() when moving to phylink
Keep previous function goals and integrate phylink actions into them.
phylink_ethtool_get_wol() is not enough to figure out if Ethernet driver
supports Wake-on-Lan.
Initialization of "supported" and "wolopts" members is done in phylink
function, no need to keep them in calling function.
phylink_ethtool_set_wol() return value is considered and determines
if the MAC has to handle WoL or not. The case where the PHY doesn't
implement WoL leads to the MAC configuring it to provide this feature.
Fixes: 7897b071ac3b ("net: macb: convert to phylink")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Ferre [Fri, 10 Jul 2020 12:46:42 +0000 (14:46 +0200)]
net: macb: mark device wake capable when "magic-packet" property present
Change the way the "magic-packet" DT property is handled in the
macb_probe() function, matching DT binding documentation.
Now we mark the device as "wakeup capable" instead of calling the
device_init_wakeup() function that would enable the wakeup source.
For Ethernet WoL, enabling the wakeup_source is done by
using ethtool and associated macb_set_wol() function that
already calls device_set_wakeup_enable() for this purpose.
That would reduce power consumption by cutting more clocks if
"magic-packet" property is set but WoL is not configured by ethtool.
Fixes: 3e2a5e153906 ("net: macb: add wake-on-lan support via magic packet")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Sergio Prado <sergio.prado@e-labworks.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Ferre [Fri, 10 Jul 2020 12:46:41 +0000 (14:46 +0200)]
net: macb: fix wakeup test in runtime suspend/resume routines
Use the proper struct device pointer to check if the wakeup flag
and wakeup source are set.
Use the one passed by the function call, which is equivalent to
&bp->dev->dev.parent.
This prevents a spurious interrupt from being triggered when the
Wake-on-Lan feature is used.
Fixes: d54f89af6cc4 ("net: macb: Add pm runtime support")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Fri, 10 Jul 2020 10:55:08 +0000 (12:55 +0200)]
bnxt_en: fix NULL dereference in case SR-IOV configuration fails
We need to set 'active_vfs' back to 0, if something goes wrong during the
allocation of SR-IOV resources: otherwise, further VF configurations will
wrongly assume that bp->pf.vf[x] are valid memory locations, and commands
like the ones in the following sequence:
# echo 2 >/sys/bus/pci/devices/${ADDR}/sriov_numvfs
# ip link set dev ens1f0np0 up
# ip link set dev ens1f0np0 vf 0 trust on
will cause a kernel crash similar to this:
bnxt_en 0000:3b:00.0: not enough MMIO resources for SR-IOV
BUG: kernel NULL pointer dereference, address: 0000000000000014
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 43 PID: 2059 Comm: ip Tainted: G I 5.8.0-rc2.upstream+ #871
Hardware name: Dell Inc. PowerEdge R740/08D89F, BIOS 2.2.11 06/13/2019
RIP: 0010:bnxt_set_vf_trust+0x5b/0x110 [bnxt_en]
Code: 44 24 58 31 c0 e8 f5 fb ff ff 85 c0 0f 85 b6 00 00 00 48 8d 1c 5b 41 89 c6 b9 0b 00 00 00 48 c1 e3 04 49 03 9c 24 f0 0e 00 00 <8b> 43 14 89 c2 83 c8 10 83 e2 ef 45 84 ed 49 89 e5 0f 44 c2 4c 89
RSP: 0018:ffffac6246a1f570 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000000b
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff98b28f538900
RBP: ffff98b28f538900 R08: 0000000000000000 R09: 0000000000000008
R10: ffffffffb9515be0 R11: ffffac6246a1f678 R12: ffff98b28f538000
R13: 0000000000000001 R14: 0000000000000000 R15: ffffffffc05451e0
FS:  00007fde0f688800(0000) GS:ffff98baffd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000014 CR3: 000000104bb0a003 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
do_setlink+0x994/0xfe0
__rtnl_newlink+0x544/0x8d0
rtnl_newlink+0x47/0x70
rtnetlink_rcv_msg+0x29f/0x350
netlink_rcv_skb+0x4a/0x110
netlink_unicast+0x21d/0x300
netlink_sendmsg+0x329/0x450
sock_sendmsg+0x5b/0x60
____sys_sendmsg+0x204/0x280
___sys_sendmsg+0x88/0xd0
__sys_sendmsg+0x5e/0xa0
do_syscall_64+0x47/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
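The failure mode and the fix can be modelled in a few lines. This is a
simplified sketch, not the bnxt_en code: the Pf class, the helper names,
the fail flag, the choice of ENOMEM as the propagated error, and the
ordering (VF count published before the resource setup) are all
assumptions made for illustration.

```python
import errno


class Pf:
    def __init__(self):
        self.active_vfs = 0
        self.vf = None      # per-VF state, valid only while active_vfs > 0


def alloc_vf_resources(num_vfs, fail):
    """Hypothetical allocator; `fail` simulates a resource shortage."""
    if fail:
        raise MemoryError("not enough resources for SR-IOV")
    return [{"trusted": False} for _ in range(num_vfs)]


def sriov_enable(pf, num_vfs, fail=False):
    pf.active_vfs = num_vfs
    try:
        pf.vf = alloc_vf_resources(num_vfs, fail)
    except MemoryError:
        pf.vf = None
        pf.active_vfs = 0   # the fix: do not leave a stale VF count behind
        return -errno.ENOMEM
    return 0


def set_vf_trust(pf, vf_id, trusted):
    if vf_id >= pf.active_vfs:
        return -errno.EINVAL    # rejected instead of touching pf.vf
    pf.vf[vf_id]["trusted"] = trusted
    return 0
```

Without the reset, the bounds check in set_vf_trust() would pass after a
failed enable and the function would index into state that was never set
up, which is the shape of the crash shown above.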
Fixes: c0c050c58d840 ("bnxt_en: New Broadcom ethernet driver.")
Reported-by: Fei Liu <feliu@redhat.com>
CC: Jonathan Toppins <jtoppins@redhat.com>
CC: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Jul 2020 21:10:45 +0000 (14:10 -0700)]
Merge tag 'mlx5-updates-2020-07-09' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2020-07-09
This series provides updates to mlx5 CT (connection tracking) offloads
For more information please see tag log below.
Please pull and let me know if there is any problem.
The following conflict is expected when net is merged into net-next:
to resolve just use the hunks from net-next.
<<<<<<< HEAD (net-next)
mlx5_tc_ct_del_ft_entry(ct_priv, entry);
kfree(entry);
======= (net)
mlx5_tc_ct_entry_del_rules(ct_priv, entry);
kfree(entry);
>>>>>>> b1a7d5bdfe54c98eca46e2c997d4e3b1484a49af
mlx5 connection tracking offloads updates:
1) Restore CT state from lookup in zone instead of tupleid
On a miss, use the zone + 5-tuple taken from the skb to look up the CT
entry and restore it, instead of the driver allocated tuple id.
This improves flow insertion rate by avoiding the allocation of a header
rewrite context to maintain the tupleid.
2) Re-use modify header HW objects for identical modify actions.
3) Expand tunnel register mappings
Reg_c1 is 32 bits wide. Before this patchset, 24 bits were allocated
for the tuple_id, 6 bits for tunnel mapping and 2 bits for tunnel
options mappings.
Restoring the ct state from zone lookup instead of tuple id requires
reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel
mappings.
Expand tunnel and tunnel options register mappings to 12 bits each.
4) Trivial cleanup and fixes.
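The new split of reg_c1 described in item 3 (8 bits of zone mapping, 12
bits of tunnel mapping, 12 bits of tunnel options mapping) can be sketched
as plain bit packing. The field widths come from the text; the position of
each field within the register and the helper names are assumptions made
for illustration.

```python
ZONE_BITS = 8
TUNNEL_BITS = 12
TUNNEL_OPTS_BITS = 12
REG_C1_BITS = ZONE_BITS + TUNNEL_BITS + TUNNEL_OPTS_BITS   # 32


def pack_reg_c1(zone, tunnel, tunnel_opts):
    """Pack the three mappings into one 32-bit value (assumed ordering)."""
    for value, bits in ((zone, ZONE_BITS), (tunnel, TUNNEL_BITS),
                        (tunnel_opts, TUNNEL_OPTS_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("value %d does not fit in %d bits" % (value, bits))
    return ((zone << (TUNNEL_BITS + TUNNEL_OPTS_BITS)) |
            (tunnel << TUNNEL_OPTS_BITS) |
            tunnel_opts)


def unpack_reg_c1(reg):
    """Inverse of pack_reg_c1(): return (zone, tunnel, tunnel_opts)."""
    tunnel_opts = reg & ((1 << TUNNEL_OPTS_BITS) - 1)
    tunnel = (reg >> TUNNEL_OPTS_BITS) & ((1 << TUNNEL_BITS) - 1)
    zone = reg >> (TUNNEL_BITS + TUNNEL_OPTS_BITS)
    return zone, tunnel, tunnel_opts
```

Compared with the old 24/6/2 split, the tunnel field grows from 6 to 12
bits and the tunnel options field from 2 to 12 bits.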
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Jul 2020 21:07:43 +0000 (14:07 -0700)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:
====================
pull-request: bpf 2020-07-09
The following pull-request contains BPF updates for your *net* tree.
We've added 4 non-merge commits during the last 1 day(s) which contain
a total of 4 files changed, 26 insertions(+), 15 deletions(-).
The main changes are:
1) fix crash in libbpf on 32-bit archs, from Jakub and Andrii.
2) fix crash when l2tp and bpf_sk_reuseport conflict, from Martin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Jul 2020 21:02:01 +0000 (14:02 -0700)]
Merge tag 'mlx5-fixes-2020-07-02' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2020-07-02
This series introduces some fixes to mlx5 driver.
V1->v2:
- Drop "ip -s" patch and mirred device hold reference patch.
- Will revise them in a later submission.
Please pull and let me know if there is any problem.
For -stable v5.2
('net/mlx5: Fix eeprom support for SFP module')
For -stable v5.4
('net/mlx5e: Fix 50G per lane indication')
For -stable v5.5
('net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash')
('net/mlx5e: Fix VXLAN configuration restore after function reload')
For -stable v5.7
('net/mlx5e: CT: Fix memory leak in cleanup')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Jul 2020 20:54:00 +0000 (13:54 -0700)]
Merge branch 'udp_tunnel-add-NIC-RX-port-offload-infrastructure'
Jakub Kicinski says:
====================
udp_tunnel: add NIC RX port offload infrastructure
Kernel has a facility to notify drivers about the UDP tunnel ports
so that devices can recognize tunneled packets. This is important
mostly for RX - devices which don't support CHECKSUM_COMPLETE can
report checksums of inner packets, and compute RSS over inner headers.
Some drivers also match the UDP tunnel ports for TX, although
doing so may lead to false positives and negatives.
Unfortunately the user experience when trying to take advantage
of these facilities is suboptimal. First of all there is no way
for users to check which ports are offloaded. Many drivers resort
to printing messages to aid debugging, others use debugfs. Even worse
the availability of the RX features (NETIF_F_RX_UDP_TUNNEL_PORT)
is established purely on the basis of the driver having the ndos
installed. For most drivers, however, the ability to perform offloads
is contingent on device capabilities (drivers support multiple device
and firmware versions). Unless a driver resorts to hackish clearing
of features set incorrectly by the core - users are left guessing
whether their device really supports UDP tunnel port offload or not.
There is currently no way to indicate or configure whether RX
features include just the checksum offload or checksum and using
inner headers for RSS. Many drivers default to not using inner
headers for RSS because most implementations populate the source
port with entropy from the inner headers. This, however, is not
always the case, for example certain switches are only able to
use a fixed source port during encapsulation.
We have also seen many driver authors get the intricacies of UDP
tunnel port offloads wrong. Most commonly the drivers forget to
perform reference counting, or take sleeping locks in the callbacks.
This work tries to improve the situation by pulling the UDP tunnel
port table maintenance out of the drivers. It turns out that almost
all drivers maintain a fixed size table of ports (in most cases one
per tunnel type), so we can take care of all the refcounting in the
core, and let the driver specify if they need to sleep in the
callbacks or not. The new common implementation will also support
replacing ports - when a port is removed from a full table it will
try to find a previously missing port to take its place.
This patch only implements the core functionality along with a few
drivers I was hoping to test manually [1] along with a test based
on a netdevsim implementation. Following patches will convert all
the drivers. Once that's complete we can remove the ndos, and rely
directly on the new infrastructure.
Then after RSS (RXFH) is converted to netlink we can add the ability
to configure the use of inner RSS headers for UDP tunnels.
[1] Unfortunately I wasn't able to, turns out 2 of the devices
I had access to were older generation or had old FW, and they
did not actually support UDP tunnel port notifications (see
the second paragraph). The third device appears to program
the UDP ports correctly but it generates bad UDP checksums with
or without these patches. Long story short - I'd appreciate
reviews and testing here..
v4:
- better build fix (hopefully this one does it..)
v3:
- fix build issue;
- improve bnxt changes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 10 Jul 2020 00:42:53 +0000 (17:42 -0700)]
mlx4: convert to new udp_tunnel_nic infra
Convert to new infra, make use of the ability to sleep in the callback.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 10 Jul 2020 00:42:52 +0000 (17:42 -0700)]
bnxt: convert to new udp_tunnel_nic infra
Convert to new infra, taking advantage of sleeping in callbacks.
v2:
- use bp->*_fw_dst_port_id != INVALID_HW_RING_ID as indication
that the offload is active.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 10 Jul 2020 00:42:51 +0000 (17:42 -0700)]
ixgbe: convert to new udp_tunnel_nic infra
Make use of new common udp_tunnel_nic infra. ixgbe supports
IPv4 only, and only single VxLAN and Geneve ports (one each).
v2:
- split out the RXCSUM feature handling to separate change;
- declare structs separately;
- use ti.type instead of assuming table 0 is VxLAN;
- move setting netdev->udp_tunnel_nic_info to its own switch.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 10 Jul 2020 00:42:50 +0000 (17:42 -0700)]
ixgbe: don't clear UDP tunnel ports when RXCSUM is disabled
It appears the clearing of UDP tunnel ports when RXCSUM
is disabled is unnecessary. Driver will not pay attention
to checksum bits if RXCSUM is not set, so we can let
the hardware parse the packets.
Note that the UDP tunnel port NDO handlers don't pay attention
to the state of RXCSUM, so the ports could have been re-programmed,
anyway.
This cleanup simplifies later conversion patch.
v2:
- break this out of the following patch.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 10 Jul 2020 00:42:49 +0000 (17:42 -0700)]
selftests: net: add a test for UDP tunnel info infra
Add a test validating that the UDP tunnel infra works.
$ ./udp_tunnel_nic.sh
PASSED all 383 checks
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 10 Jul 2020 00:42:48 +0000 (17:42 -0700)]
netdevsim: add UDP tunnel port offload support
Add UDP tunnel port handlers to our fake driver so we can test
the core infra.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 10 Jul 2020 00:42:47 +0000 (17:42 -0700)]
ethtool: add tunnel info interface
Add an interface to report offloaded UDP ports via ethtool netlink.
Now that core takes care of tracking which UDP tunnel ports the NICs
are aware of we can quite easily export this information out to
user space.
The responsibility of writing the netlink dumps is split between
ethtool code and udp_tunnel_nic.c - since udp_tunnel module may
not always be loaded, yet we should always report the capabilities
of the NIC.
$ ethtool --show-tunnels eth0
Tunnel information for eth0:
UDP port table 0:
Size: 4
Types: vxlan
No entries
UDP port table 1:
Size: 4
Types: geneve, vxlan-gpe
Entries (1):
port 1230, vxlan-gpe
v4:
- back to v2, build fix is now directly in udp_tunnel.h
v3:
- don't compile ETHTOOL_MSG_TUNNEL_INFO_GET in if CONFIG_INET
not set.
v2:
- fix string set count,
- reorder enums in the uAPI,
- fix type of ETHTOOL_A_TUNNEL_UDP_TABLE_TYPES to bitset
in docs and comments.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 10 Jul 2020 00:42:46 +0000 (17:42 -0700)]
udp_tunnel: add central NIC RX port offload infrastructure
Cater to devices which:
(a) may want to sleep in the callbacks;
(b) only have IPv4 support;
(c) need all the programming to happen while the netdev is up.
Drivers attach UDP tunnel offload info struct to their netdevs,
where they declare how many UDP ports of various tunnel types
they support. Core takes care of tracking which ports to offload.
Use a fixed-size array since this matches what almost all drivers
do, and avoids the complexity and uncertainty around memory allocations
in an atomic context.
Make sure that tunnel drivers don't try to replay the ports when
new NIC netdev is registered. Automatic replays would mess up
reference counting, and will be removed completely once all drivers
are converted.
v4:
- use a #define NULL to avoid build issues with CONFIG_INET=n.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
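The port tracking described in the udp_tunnel commit above (a fixed-size table, with reference counting handled centrally) can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions: the names port_table, port_table_add() and port_table_del() are hypothetical and are not the kernel's udp_tunnel_nic API, the table size of 4 is arbitrary, and the replacement of previously missing ports is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define PORT_TABLE_SIZE 4	/* arbitrary size chosen for this sketch */

/* One slot of the fixed-size table; users == 0 marks a free slot. */
struct port_entry {
	uint16_t port;
	unsigned int users;
};

struct port_table {
	struct port_entry entries[PORT_TABLE_SIZE];
};

/*
 * Take a reference on 'port'.
 * Returns 1 if a new slot was claimed (the device would need programming),
 * 0 if only the reference count was bumped, -1 if the table is full.
 */
static int port_table_add(struct port_table *t, uint16_t port)
{
	struct port_entry *free_slot = NULL;
	size_t i;

	for (i = 0; i < PORT_TABLE_SIZE; i++) {
		struct port_entry *e = &t->entries[i];

		if (e->users && e->port == port) {
			e->users++;
			return 0;
		}
		if (!e->users && !free_slot)
			free_slot = e;
	}
	if (!free_slot)
		return -1;

	free_slot->port = port;
	free_slot->users = 1;
	return 1;
}

/*
 * Drop a reference on 'port'.
 * Returns 1 if the last user went away (the device entry would be removed),
 * 0 if the port is still in use, -1 if the port was not in the table.
 */
static int port_table_del(struct port_table *t, uint16_t port)
{
	size_t i;

	for (i = 0; i < PORT_TABLE_SIZE; i++) {
		struct port_entry *e = &t->entries[i];

		if (e->users && e->port == port) {
			e->users--;
			return e->users == 0;
		}
	}
	return -1;
}
```

Because the array is fixed-size, neither function allocates memory, which is the property the commit message cites for atomic contexts.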
Jakub Kicinski [Fri, 10 Jul 2020 00:42:45 +0000 (17:42 -0700)]
udp_tunnel: re-number the offload tunnel types
Make it possible to use tunnel types as flags more easily.
There doesn't appear to be any user using the type as an
array index, so this should make no difference.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
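To illustrate why re-numbering the tunnel types helps, here is a minimal sketch in which each type occupies its own bit, so a table can advertise a set of accepted types as a single mask. The enum and helper names are hypothetical and chosen for this example; they are not claimed to match the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunnel types: one bit each, so they can be OR-ed together. */
enum tun_type {
	TUN_TYPE_VXLAN     = 1u << 0,
	TUN_TYPE_GENEVE    = 1u << 1,
	TUN_TYPE_VXLAN_GPE = 1u << 2,
};

/* Returns true if a table accepting the types in 'mask' can hold 'type'. */
static bool tun_table_accepts(uint32_t mask, enum tun_type type)
{
	return (mask & (uint32_t)type) != 0;
}
```

With sequential values (0, 1, 2) the same check would need a shift or a lookup for every test, and such values could be mistaken for array indexes, which the commit notes no user relies on.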
Jakub Kicinski [Fri, 10 Jul 2020 00:42:44 +0000 (17:42 -0700)]
debugfs: make sure we can remove u32_array files cleanly
debugfs_create_u32_array() allocates a small structure to wrap
the data and size information about the array. If users ever
try to remove the file this leads to a leak since nothing ever
frees this wrapper.
That said there are no upstream users of debugfs_create_u32_array()
that'd remove a u32 array file (we only have one u32 array user in
CMA), so there is no real bug here.
Make callers pass a wrapper they allocated. This way the lifetime
management of the wrapper is on the caller, and we can avoid the
potential leak in debugfs.
CC: Chucheng Luo <luochucheng@vivo.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 10 Jul 2020 20:09:41 +0000 (13:09 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Small update, a few more merge window bugs and normal driver bug
fixes:
- Two merge window regressions in mlx5: an error path bug found by
syzkaller and some lost code during a rework preventing ipoib from
working in some configurations
- Silence clang compilation warning in OPA related code
- Fix a long standing race condition in ib_nl for ACM
- Resolve when the HFI1 is shutdown"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mlx5: Set PD pointers for the error flow unwind
IB/mlx5: Fix 50G per lane indication
RDMA/siw: Fix reporting vendor_part_id
IB/sa: Resolv use-after-free in ib_nl_make_request()
IB/hfi1: Do not destroy link_wq when the device is shut down
IB/hfi1: Do not destroy hfi1_wq when the device is shut down
RDMA/mlx5: Fix legacy IPoIB QP initialization
IB/hfi1: Add explicit cast OPA_MTU_8192 to 'enum ib_mtu'
Linus Torvalds [Fri, 10 Jul 2020 17:15:37 +0000 (10:15 -0700)]
Merge tag 'linux-kselftest-fixes-5.8-rc5' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"TPM2 test changes to run on python3 and kselftest framework fix to
incorrect return type"
* tag 'linux-kselftest-fixes-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kselftest: ksft_test_num return type should be unsigned
selftests: tpm: upgrade TPM2 tests from Python 2 to Python 3
Linus Torvalds [Fri, 10 Jul 2020 16:57:57 +0000 (09:57 -0700)]
Merge tag 'io_uring-5.8-2020-07-10' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
- Fix memleak for error path in registered files (Yang)
- Export CQ overflow state in flags, necessary to fix a case where
liburing doesn't know if it needs to enter the kernel (Xiaoguang)
- Fix for a regression in when user memory is accounted as freed, causing
issues with back-to-back ring exit + init if the ulimit -l setting is
very tight.
* tag 'io_uring-5.8-2020-07-10' of git://git.kernel.dk/linux-block:
io_uring: account user memory freed when exit has been queued
io_uring: fix memleak in io_sqe_files_register()
io_uring: fix memleak in __io_sqe_files_update()
io_uring: export cq overflow status to userspace
Linus Torvalds [Fri, 10 Jul 2020 16:55:46 +0000 (09:55 -0700)]
Merge tag 'block-5.8-2020-07-10' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Fix for inflight accounting, which affects only dm (Ming)
- Fix documentation error for bfq (Yufen)
- Fix memory leak for nbd (Zheng)
* tag 'block-5.8-2020-07-10' of git://git.kernel.dk/linux-block:
nbd: Fix memory leak in nbd_add_socket
blk-mq: consider non-idle request as "inflight" in blk_mq_rq_inflight()
docs: block: update and fix tiny error for bfq
Linus Torvalds [Fri, 10 Jul 2020 16:45:15 +0000 (09:45 -0700)]
Merge tag 'cleanup-kernel_read_write' of git://git.infradead.org/users/hch/misc
Pull in-kernel read and write op cleanups from Christoph Hellwig:
"Cleanup in-kernel read and write operations
Reshuffle the (__)kernel_read and (__)kernel_write helpers, and ensure
all users of in-kernel file I/O use them if they don't use iov_iter
based methods already.
The new WARN_ONs in combination with syzkaller already found a missing
input validation in 9p. The fix should be on your way through the
maintainer ASAP".
[ This is prep-work for the real changes coming 5.9 ]
* tag 'cleanup-kernel_read_write' of git://git.infradead.org/users/hch/misc:
fs: remove __vfs_read
fs: implement kernel_read using __kernel_read
integrity/ima: switch to using __kernel_read
fs: add a __kernel_read helper
fs: remove __vfs_write
fs: implement kernel_write using __kernel_write
fs: check FMODE_WRITE in __kernel_write
fs: unexport __kernel_write
bpfilter: switch to kernel_write
autofs: switch to kernel_write
cachefiles: switch to kernel_write
Linus Torvalds [Fri, 10 Jul 2020 16:36:03 +0000 (09:36 -0700)]
Merge tag 'dma-mapping-5.8-5' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
- add a warning when the atomic pool is depleted (David Rientjes)
- protect the parameters of the new scatterlist helper macros (Marek
Szyprowski)
* tag 'dma-mapping-5.8-5' of git://git.infradead.org/users/hch/dma-mapping:
scatterlist: protect parameters of the sg_table related macros
dma-mapping: warn when coherent pool is depleted
Linus Torvalds [Fri, 10 Jul 2020 16:28:52 +0000 (09:28 -0700)]
Merge tag 'pinctrl-v5.8-3' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Fix an issue in the AMD driver for the UART0 group
- Fix a glitch issue in the Baytrail pin controller
* tag 'pinctrl-v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: baytrail: Fix pin being driven low for a while on gpiod_get(..., GPIOD_OUT_HIGH)
pinctrl: amd: fix npins for uart0 in kerncz_groups
Linus Torvalds [Fri, 10 Jul 2020 16:19:39 +0000 (09:19 -0700)]
Merge tag 'gpio-v5.8-2' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Some GPIO fixes, most of them for the PCA953x that Andy worked hard to
fix up.
- Fix two runtime PM errorpath problems in the Arizona GPIO driver.
- Fix three interrupt issues in the PCA953x driver.
- Fix the automatic address increment handling in the PCA953x driver
again.
- Add a quirk to the PCA953x that fixes a problem in the Intel
Galileo Gen 2"
* tag 'gpio-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: pca953x: Fix GPIO resource leak on Intel Galileo Gen 2
gpio: pca953x: disable regmap locking for automatic address incrementing
gpio: pca953x: Fix direction setting when configure an IRQ
gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2
gpio: pca953x: Synchronize interrupt handler properly
gpio: arizona: put pm_runtime in case of failure
gpio: arizona: handle pm_runtime_get_sync failure case
Linus Torvalds [Fri, 10 Jul 2020 15:53:21 +0000 (08:53 -0700)]
Merge tag 'gfs2-v5.8-rc4.fixes' of git://git./linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fixes from Andreas Gruenbacher:
"Fix gfs2 readahead deadlocks by adding a IOCB_NOIO flag that allows
gfs2 to use the generic file read iterator functions without having to
worry about being called back while holding locks".
* tag 'gfs2-v5.8-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Rework read and page fault locking
fs: Add IOCB_NOIO flag for generic_file_read_iter
Linus Torvalds [Fri, 10 Jul 2020 15:42:17 +0000 (08:42 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"An unfortunately large collection of arm64 fixes for -rc5.
Some of this is absolutely trivial, but the alternatives, vDSO and CPU
errata workaround fixes are significant. At least people are finding
and fixing these things, I suppose.
- Fix workaround for CPU erratum #1418040 to disable the compat vDSO
- Fix Oops when single-stepping with KGDB
- Fix memory attributes for hypervisor device mappings at EL2
- Fix memory leak in PSCI and remove useless variable assignment
- Fix up some comments and asm labels in our entry code
- Fix broken register table formatting in our generated html docs
- Fix missing NULL sentinel in CPU errata workaround list
- Fix patching of branches in alternative instruction sections"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/alternatives: don't patch up internal branches
arm64: Add missing sentinel to erratum_1463225
arm64: Documentation: Fix broken table in generated HTML
arm64: kgdb: Fix single-step exception handling oops
arm64: entry: Tidy up block comments and label numbers
arm64: Rework ARM_ERRATUM_1414080 handling
arm64: arch_timer: Disable the compat vdso for cores affected by ARM64_WORKAROUND_1418040
arm64: arch_timer: Allow an workaround descriptor to disable compat vdso
arm64: Introduce a way to disable the 32bit vdso
arm64: entry: Fix the typo in the comment of el1_dbg()
drivers/firmware/psci: Assign @err directly in hotplug_tests()
drivers/firmware/psci: Fix memory leakage in alloc_init_cpu_groups()
KVM: arm64: Fix definition of PAGE_HYP_DEVICE
Linus Torvalds [Fri, 10 Jul 2020 15:39:33 +0000 (08:39 -0700)]
Merge tag 's390-5.8-5' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
"This is mainly due to the fact that Gerald Schaefer's and also my old
email addresses currently do not work any longer. Therefore we decided
to switch to new email addresses and reflect that in the MAINTAINERS
file.
- Update email addresses in MAINTAINERS file and add .mailmap entries
for Gerald Schaefer and Heiko Carstens.
- Fix huge pte soft dirty copying"
* tag 's390-5.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
MAINTAINERS: update email address for Gerald Schaefer
MAINTAINERS: update email address for Heiko Carstens
s390/mm: fix huge pte soft dirty copying
Linus Torvalds [Fri, 10 Jul 2020 15:34:12 +0000 (08:34 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Two simple but important bugfixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: MIPS: Fix build errors for 32bit kernel
KVM: nVMX: fixes for preemption timer migration
Linus Torvalds [Fri, 10 Jul 2020 15:28:49 +0000 (08:28 -0700)]
Merge tag 'mmc-v5.8-rc1' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
- Override DLL_CONFIG only with valid values in sdhci-msm
- Get rid of of_match_ptr() macro to fix warning in owl-mmc
- Limit segments to 1 to fix meson-gx G12A/G12B SoCs
* tag 'mmc-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-msm: Override DLL_CONFIG only if the valid value is supplied
mmc: owl-mmc: Get rid of of_match_ptr() macro
mmc: meson-gx: limit segments to 1 when dram-access-quirk is needed
Jens Axboe [Fri, 10 Jul 2020 15:13:34 +0000 (09:13 -0600)]
io_uring: account user memory freed when exit has been queued
We currently account the memory after the exit work has been run, but
that leaves a gap between a process closing its ring and the
memory being accounted as freed. If the memlocked ulimit is
borderline, then that can introduce spurious setup errors returning
-ENOMEM because the free work hasn't been run yet.
Account this as freed when we close the ring, so as not to expose a tiny
gap where setting up a new ring can fail.
Fixes: 85faa7b8346e ("io_uring: punt final io_ring_ctx wait-and-free to workqueue")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Yang Yingliang [Fri, 10 Jul 2020 14:14:20 +0000 (14:14 +0000)]
io_uring: fix memleak in io_sqe_files_register()
I got a memleak report when doing some fuzz test:
BUG: memory leak
unreferenced object 0x607eeac06e78 (size 8):
comm "test", pid 295, jiffies 4294735835 (age 31.745s)
hex dump (first 8 bytes):
00 00 00 00 00 00 00 00 ........
backtrace:
[<00000000932632e6>] percpu_ref_init+0x2a/0x1b0
[<0000000092ddb796>] __io_uring_register+0x111d/0x22a0
[<00000000eadd6c77>] __x64_sys_io_uring_register+0x17b/0x480
[<00000000591b89a6>] do_syscall_64+0x56/0xa0
[<00000000864a281d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Call percpu_ref_exit() on error path to avoid
refcount memleak.
Fixes: 05f3fb3c5397 ("io_uring: avoid ring quiesce for fixed file set unregister and update")
Cc: stable@vger.kernel.org
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Gerald Schaefer [Fri, 10 Jul 2020 11:36:26 +0000 (13:36 +0200)]
MAINTAINERS: update email address for Gerald Schaefer
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Heiko Carstens [Thu, 9 Jul 2020 08:37:54 +0000 (10:37 +0200)]
MAINTAINERS: update email address for Heiko Carstens
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Huacai Chen [Fri, 10 Jul 2020 07:23:17 +0000 (15:23 +0800)]
KVM: MIPS: Fix build errors for 32bit kernel
Commit dc6d95b153e78ed70b1b2c04a ("KVM: MIPS: Add more MMIO load/store
instructions emulation") introduced some 64bit load/store instructions
emulation which are unavailable on 32bit platform, and it causes build
errors:
arch/mips/kvm/emulate.c: In function 'kvm_mips_emulate_store':
arch/mips/kvm/emulate.c:1734:6: error: right shift count >= width of type [-Werror]
((vcpu->arch.gprs[rt] >> 56) & 0xff);
^
arch/mips/kvm/emulate.c:1738:6: error: right shift count >= width of type [-Werror]
((vcpu->arch.gprs[rt] >> 48) & 0xffff);
^
arch/mips/kvm/emulate.c:1742:6: error: right shift count >= width of type [-Werror]
((vcpu->arch.gprs[rt] >> 40) & 0xffffff);
^
arch/mips/kvm/emulate.c:1746:6: error: right shift count >= width of type [-Werror]
((vcpu->arch.gprs[rt] >> 32) & 0xffffffff);
^
arch/mips/kvm/emulate.c:1796:6: error: left shift count >= width of type [-Werror]
(vcpu->arch.gprs[rt] << 32);
^
arch/mips/kvm/emulate.c:1800:6: error: left shift count >= width of type [-Werror]
(vcpu->arch.gprs[rt] << 40);
^
arch/mips/kvm/emulate.c:1804:6: error: left shift count >= width of type [-Werror]
(vcpu->arch.gprs[rt] << 48);
^
arch/mips/kvm/emulate.c:1808:6: error: left shift count >= width of type [-Werror]
(vcpu->arch.gprs[rt] << 56);
^
cc1: all warnings being treated as errors
make[3]: *** [arch/mips/kvm/emulate.o] Error 1
So, use #if defined(CONFIG_64BIT) && defined(CONFIG_KVM_MIPS_VZ) to
guard the 64bit load/store instructions emulation.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: dc6d95b153e78ed70b1b2c04a ("KVM: MIPS: Add more MMIO load/store instructions emulation")
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Message-Id: <1594365797-536-1-git-send-email-chenhc@lemote.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
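The class of build error quoted in the KVM: MIPS commit above (a shift count at least as wide as the type) and the guard that avoids it can be reproduced in a small userspace sketch. This is an assumption-laden illustration: the kernel fix uses CONFIG_64BIT and CONFIG_KVM_MIPS_VZ, whereas this example substitutes a ULONG_MAX check as a stand-in, and top_byte() is a hypothetical helper.

```c
#include <limits.h>

/*
 * Return the most significant byte of a 64-bit register value.
 * When 'unsigned long' is only 32 bits wide, "reg >> 56" would be an
 * invalid shift, so that code is compiled out and 0 is returned instead.
 */
static unsigned long top_byte(unsigned long reg)
{
#if ULONG_MAX > 0xffffffffUL
	/* 64-bit build: the shift count is smaller than the type width. */
	return (reg >> 56) & 0xff;
#else
	/* 32-bit build: no such byte exists; the 64-bit path is not built. */
	(void)reg;
	return 0;
#endif
}
```

The guard has to be a preprocessor conditional rather than a runtime "if", because the compiler diagnoses the oversized shift even in a branch that would never execute.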
Paolo Bonzini [Thu, 9 Jul 2020 17:12:09 +0000 (13:12 -0400)]
KVM: nVMX: fixes for preemption timer migration
Commit 850448f35aaf ("KVM: nVMX: Fix VMX preemption timer migration",
2020-06-01) accidentally broke nVMX live migration from older versions
by changing the userspace ABI. Restore it and, while at it, ensure
that vmx->nested.has_preemption_timer_deadline is always initialized
according to the KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE flag.
Cc: Makarand Sonare <makarandsonare@google.com>
Fixes: 850448f35aaf ("KVM: nVMX: Fix VMX preemption timer migration")
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Roi Dayan [Tue, 30 Jun 2020 12:40:37 +0000 (15:40 +0300)]
net/mlx5e: CT: Fix releasing ft entries
Before this commit, on ft flush, ft entries were not removed
from the ct_tuple hashtables. Fix it.
Fixes: ac991b48d43c ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Saeed Mahameed [Mon, 4 May 2020 22:53:06 +0000 (15:53 -0700)]
net/mlx5e: CT: Remove unused function param
"flow" parameter is not used in __mlx5_tc_ct_flow_offload_clear(),
remove it.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Saeed Mahameed [Mon, 4 May 2020 22:52:14 +0000 (15:52 -0700)]
net/mlx5e: CT: Return err_ptr from internal functions
Instead of having to deal with converting between int and ERR_PTR for
return values in mlx5_tc_ct_flow_offload(), make the internal helper
functions return a ptr to mlx5_flow_handle instead of passing it as
output param, this will also avoid gcc confusion and false alarms,
thus we remove the redundant ERR_PTR rule initialization.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Paul Blakey [Tue, 5 May 2020 13:41:02 +0000 (16:41 +0300)]
net/mlx5e: CT: Expand tunnel register mappings
Reg_c1 is 32 bits wide. Originally, 24 bits were allocated for the tuple_id,
6 bits for tunnel mapping and 2 bits for tunnel options mappings.
Restoring the ct state from zone lookup instead of tuple id requires
reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel
mappings.
Expand tunnel and tunnel options register mappings to 12 bits each.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
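The reg_c1 split described in the commit above (8 bits for the zone mapping, 12 bits each for the tunnel and tunnel options mappings, 32 bits in total) can be sketched as plain bit packing. The field order and the helper names below are assumptions made for illustration; the commit message only gives the field widths.

```c
#include <stdint.h>

/* Field widths from the commit message: 8 + 12 + 12 = 32 bits. */
#define ZONE_BITS	8
#define TUN_OPTS_BITS	12
#define TUN_BITS	12

/* Assumed placement (not stated in the source): zone lowest, tunnel highest. */
#define ZONE_SHIFT	0
#define TUN_OPTS_SHIFT	(ZONE_SHIFT + ZONE_BITS)
#define TUN_SHIFT	(TUN_OPTS_SHIFT + TUN_OPTS_BITS)

#define FIELD_MASK(bits)	((1u << (bits)) - 1u)

/* Pack the three mapping ids into one 32-bit register value. */
static uint32_t reg_c1_pack(uint32_t zone, uint32_t tun, uint32_t tun_opts)
{
	return ((zone & FIELD_MASK(ZONE_BITS)) << ZONE_SHIFT) |
	       ((tun_opts & FIELD_MASK(TUN_OPTS_BITS)) << TUN_OPTS_SHIFT) |
	       ((tun & FIELD_MASK(TUN_BITS)) << TUN_SHIFT);
}

/* Extract the tunnel mapping id again. */
static uint32_t reg_c1_tun(uint32_t reg)
{
	return (reg >> TUN_SHIFT) & FIELD_MASK(TUN_BITS);
}

/* Extract the zone mapping id again. */
static uint32_t reg_c1_zone(uint32_t reg)
{
	return (reg >> ZONE_SHIFT) & FIELD_MASK(ZONE_BITS);
}
```

Going from 6 to 12 bits raises the number of distinct tunnel mappings from 64 to 4096, and going from 2 to 12 bits raises tunnel options mappings from 4 to 4096.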
Paul Blakey [Tue, 5 May 2020 13:37:22 +0000 (16:37 +0300)]
net/mlx5e: CT: Use mapping for zone restore register
Use a single byte mapping for zone restore register (zone matching
remains 16 bit).
This makes room for using the freed 8 bits on register C1 for
mapping more tunnels and tunnel options.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Paul Blakey [Tue, 18 Feb 2020 08:24:07 +0000 (10:24 +0200)]
net/mlx5e: CT: Re-use tuple modify headers for identical modify actions
After removing the tupleid register which changed per tuple,
tuple modify headers set the ct_state, zone, mark, and label registers.
For non-natted tuples going through the same tc rules path, their values
will be the same, and all their modify headers will be the same.
Re-use tuple modify header when possible, by adding each new modify
header to a hashtable, and looking up identical ones before creating
a new one.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
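The re-use scheme in the commit above (store each modify header in a hashtable and look for an identical one before creating a new one) can be sketched in userspace C as follows. All names here (mod_hdr, mod_hdr_get(), mod_hdr_put()) are hypothetical, the actions are treated as an opaque byte string, and the FNV-1a hash and bucket count are arbitrary choices for the example rather than what the driver uses.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 16	/* arbitrary for this sketch */

/* A shared modify header: identical action lists map to one object. */
struct mod_hdr {
	struct mod_hdr *next;
	unsigned int refcnt;
	size_t len;
	unsigned char actions[];
};

static struct mod_hdr *buckets[NBUCKETS];

/* FNV-1a over the action bytes, reduced to a bucket index. */
static unsigned int hash_bytes(const unsigned char *p, size_t len)
{
	uint32_t h = 2166136261u;

	while (len--) {
		h ^= *p++;
		h *= 16777619u;
	}
	return h % NBUCKETS;
}

/* Return an existing header with identical actions, or create a new one. */
static struct mod_hdr *mod_hdr_get(const void *actions, size_t len)
{
	unsigned int b = hash_bytes(actions, len);
	struct mod_hdr *mh;

	for (mh = buckets[b]; mh; mh = mh->next) {
		if (mh->len == len && !memcmp(mh->actions, actions, len)) {
			mh->refcnt++;
			return mh;
		}
	}

	mh = malloc(sizeof(*mh) + len);
	if (!mh)
		return NULL;
	mh->refcnt = 1;
	mh->len = len;
	memcpy(mh->actions, actions, len);
	mh->next = buckets[b];
	buckets[b] = mh;
	return mh;
}

/* Drop a reference; unlink and free the header when the last user is gone. */
static void mod_hdr_put(struct mod_hdr *mh)
{
	struct mod_hdr **pp;

	if (--mh->refcnt)
		return;
	for (pp = &buckets[hash_bytes(mh->actions, mh->len)]; *pp; pp = &(*pp)->next) {
		if (*pp == mh) {
			*pp = mh->next;
			break;
		}
	}
	free(mh);
}
```

Two tuples that set the same ct_state, zone, mark and label values would pass identical bytes to mod_hdr_get() and receive the same object, so only one hardware object would be needed for both.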
Paul Blakey [Tue, 17 Mar 2020 14:32:21 +0000 (16:32 +0200)]
net/mlx5e: Export sharing of mod headers to a new file
Refactor sharing of mod headers to new file and while there,
remove spin lock and flows list, as this is only used for warn on.
Use the generic API in the next patch to re-use tuple modify headers
for identical modify actions.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Paul Blakey [Sun, 29 Mar 2020 10:50:47 +0000 (13:50 +0300)]
net/mlx5e: CT: Restore ct state from lookup in zone instead of tupleid
Remove tupleid, and replace it with zone_restore, which is the zone an
established tuple sets after match. On miss, use this zone + tuple
taken from the skb, to lookup the ct entry and restore it.
This improves flow insertion rate by avoiding the allocation of a header
rewrite context.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Paul Blakey [Wed, 22 Apr 2020 15:00:25 +0000 (18:00 +0300)]
net/mlx5e: CT: Don't offload tuple rewrites for established tuples
Next patches will remove the tupleid register that is used
to restore the ct state on miss, and instead use the tuple on
the missed packet to lookup which state to restore.
Disable tuple rewrites after connection tracking.
For tuple rewrites, inject a ct_state=-trk match so it won't
change the tuple for established flows (+trk) that passed connection
tracking, and instead miss to software.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Oz Shlomo [Mon, 1 Jun 2020 17:08:55 +0000 (17:08 +0000)]
net/mlx5e: Use netdev_info instead of pr_info
The next patch will pass the mlx5e_priv struct to the
modify_header_match_supported method. Use this opportunity to refactor
the existing pr_info call to a netdev_info call.
Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Paul Blakey [Sun, 3 May 2020 13:45:02 +0000 (16:45 +0300)]
net/mlx5e: CT: Allow header rewrite of 5-tuple and ct clear action
With ct clear we don't jump to the ct tables, so header rewrite
of 5-tuple can be done in place (and not moved to after the CT action).
Check for ct clear action, and if so, allow 5-tuple header
rewrite.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Paul Blakey [Sun, 29 Mar 2020 10:07:49 +0000 (13:07 +0300)]
net/mlx5e: CT: Save ct entries tuples in hashtables
Save original tuple and natted tuple in two new hashtables.
This is a pre-step for restoring ct state after hw miss by performing a
5-tuple lookup on the hash tables.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Parav Pandit [Wed, 24 Jun 2020 10:56:25 +0000 (05:56 -0500)]
net/mlx5: E-switch, When eswitch is unsupported, return -EOPNOTSUPP
When eswitch is unsupported, currently -EPERM error code is returned
instead of -EOPNOTSUPP.
Due to this, the VF device's devlink virtual port is not enumerated because
port_function_get() callback returned -EPERM instead of -EOPNOTSUPP.
Hence, return the error code -EOPNOTSUPP when eswitch is unsupported.
Fixes: bd93975353d5 ("net/mlx5: E-switch, Introduce and use eswitch support check helper")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Jakub Bogusz [Thu, 9 Jul 2020 22:57:23 +0000 (15:57 -0700)]
libbpf: Fix libbpf hashmap on (I)LP32 architectures
On ILP32, 64-bit result was shifted by value calculated for 32-bit long type
and returned value was much outside hashmap capacity.
As advised by Andrii Nakryiko, this patch uses a different hashing variant for
architectures with size_t shorter than long long.
Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap")
Signed-off-by: Jakub Bogusz <qboosh@pld-linux.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200709225723.1069937-1-andriin@fb.com
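The idea behind the libbpf hashmap fix above can be sketched as follows, assuming multiplicative hashing with golden-ratio constants. The function name hash_bits_sketch() and the exact preprocessor conditions are illustrative assumptions, not a quotation of the patch.

```c
#include <stddef.h>

/* Multiplicative hashing: multiply by a golden-ratio constant and keep the
 * top `bits` bits. The shift has to be derived from the width of the type
 * the multiplication is carried out in, which is the point of the fix.
 * Requires 0 < bits < width of that type. */
static size_t hash_bits_sketch(size_t h, int bits)
{
#if __SIZEOF_SIZE_T__ == __SIZEOF_LONG_LONG__
	/* 64-bit size_t: 64-bit multiply, shift relative to 64 bits. */
	return (h * 11400714819323198485llu) >> (__SIZEOF_LONG_LONG__ * 8 - bits);
#elif __SIZEOF_SIZE_T__ <= __SIZEOF_LONG__
	/* Narrower size_t (e.g. ILP32): 32-bit constant, shift relative to long. */
	return (h * 2654435769lu) >> (__SIZEOF_LONG__ * 8 - bits);
#else
#error "unsupported size_t size"
#endif
}
```

With a matching multiply width and shift, the result always fits in `bits` bits, so it stays inside a table of 2^bits buckets.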
Eli Britstein [Sun, 28 Jun 2020 12:42:26 +0000 (15:42 +0300)]
net/mlx5e: CT: Fix memory leak in cleanup
CT entries are deleted via a workqueue from netfilter. If removing the
module before that, the rules are cleaned by the driver itself, but the
memory entries for them are not freed. Fix that.
Fixes: ac991b48d43c ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Eran Ben Elisha [Mon, 22 Jun 2020 06:03:31 +0000 (09:03 +0300)]
net/mlx5e: Fix port buffers cell size value
The device unit for port buffer size, xoff_threshold and xon_threshold is
cells. Fix a bug in the driver where the cell size was hard-coded to
128 bytes, which is wrong for some hardware versions.
Make the driver read the cell size from the SBCAM register and translate
bytes to cell units accordingly.
In order to fix the bug, this patch exposes the SBCAM (Shared buffer
capabilities mask) layout and defines.
If SBCAM.cap_cell_size is valid, use it for all bytes to cells
calculations. If it is not valid, fall back to 128.
The cell size does not change on the fly per device. Instead of issuing an
SBCAM access reg command every time such a translation is needed, cache it in
mlx5e_dcbx as part of mlx5e_dcbnl_initialize(). Pass dcbx.port_buff_cell_sz
as a param to every function that needs bytes to cells translation.
While fixing the bug, move MLX5E_BUFFER_CELL_SHIFT macro to
en_dcbnl.c, as it is only used by that file.
Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
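The bytes-to-cells translation described in the port buffers commit above can be sketched as follows. This is a user-space illustration only: the helper names are hypothetical, and it assumes that the capability value is reported in bytes, that zero means "not valid", and that the division rounds down.

```c
#include <stdint.h>

#define DEFAULT_CELL_SIZE 128u	/* fallback named in the commit message */

/* Pick the cell size: use the capability value when it is valid (assumed
 * here to mean non-zero), otherwise fall back to 128 bytes. */
static uint32_t port_buff_cell_size(uint32_t cap_cell_size)
{
	return cap_cell_size ? cap_cell_size : DEFAULT_CELL_SIZE;
}

/* Translate a byte count into cells, rounding down (an assumption). */
static uint32_t bytes_to_cells(uint32_t bytes, uint32_t cell_size)
{
	return bytes / cell_size;
}
```

Computing the cell size once and passing it to bytes_to_cells() mirrors the caching approach the commit message describes, where the value is read at initialization and handed to each function that needs it.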
Aya Levin [Mon, 15 Jun 2020 09:48:47 +0000 (12:48 +0300)]
net/mlx5e: Fix 50G per lane indication
Some released FW versions mistakenly don't set the capability that 50G
per lane link-modes are supported for VFs (ptys_extended_ethernet
capability bit). When the capability is unset, read
PTYS.ext_eth_proto_capability (always reliable).
If PTYS.ext_eth_proto_capability is valid (has a non-zero value)
conclude that the HCA supports 50G per lane. Otherwise, conclude that
the HCA doesn't support 50G per lane.
Fixes: a08b4ed1373d ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
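The decision described in the 50G per lane commit above reduces to a small predicate. The function name below is hypothetical, and the two parameters stand in for the capability bit and the PTYS register field, which the real driver reads from firmware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decide whether 50G per lane link modes are usable.
 * If the capability bit is set, trust it. If it is unset, fall back to the
 * PTYS.ext_eth_proto_capability field, which is treated as valid when it
 * is non-zero. */
static bool supports_50g_per_lane(bool ptys_extended_ethernet_cap,
				  uint32_t ext_eth_proto_capability)
{
	if (ptys_extended_ethernet_cap)
		return true;
	return ext_eth_proto_capability != 0;
}
```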
Aya Levin [Wed, 20 May 2020 07:37:42 +0000 (10:37 +0300)]
net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash
After function reload, CPU mapping used by aRFS RX is broken, leading to
a kernel panic. Fix by moving initialization of rx_cpu_rmap from
netdev_init to netdev_attach. The IRQ table is re-allocated on mlx5_load,
but the netdev is not re-initialized.
Trace of the panic:
[ 22.055672] general protection fault, probably for non-canonical address 0x785634120000ff1c: 0000 [#1] SMP PTI
[ 22.065010] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.7.0-rc2-for-upstream-perf-2020-04-21_16-34-03-31 #1
[ 22.067967] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 22.071174] RIP: 0010:get_rps_cpu+0x267/0x300
[ 22.075692] RSP: 0018:ffffc90000244d60 EFLAGS: 00010202
[ 22.076888] RAX: ffff888459b0e400 RBX: 0000000000000000 RCX: 0000000000000007
[ 22.078364] RDX: 0000000000008884 RSI: ffff888467cb5b00 RDI: 0000000000000000
[ 22.079815] RBP: 00000000ff342b27 R08: 0000000000000007 R09: 0000000000000003
[ 22.081289] R10: ffffffffffffffff R11: 00000000000070cc R12: ffff888454900000
[ 22.082767] R13: ffffc90000e5a950 R14: ffffc90000244dc0 R15: 0000000000000007
[ 22.084190] FS: 0000000000000000(0000) GS:ffff88846fc80000(0000) knlGS:0000000000000000
[ 22.086161] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 22.087427] CR2: ffffffffffffffff CR3: 0000000464426003 CR4: 0000000000760ee0
[ 22.088888] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 22.090336] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 22.091764] PKRU: 55555554
[ 22.092618] Call Trace:
[ 22.093442] <IRQ>
[ 22.094211] ? kvm_clock_get_cycles+0xd/0x10
[ 22.095272] netif_receive_skb_list_internal+0x258/0x2a0
[ 22.096460] gro_normal_list.part.137+0x19/0x40
[ 22.097547] napi_complete_done+0xc6/0x110
[ 22.098685] mlx5e_napi_poll+0x190/0x670 [mlx5_core]
[ 22.099859] net_rx_action+0x2a0/0x400
[ 22.100848] __do_softirq+0xd8/0x2a8
[ 22.101829] irq_exit+0xa5/0xb0
[ 22.102750] do_IRQ+0x52/0xd0
[ 22.103654] common_interrupt+0xf/0xf
[ 22.104641] </IRQ>
Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Wed, 24 Jun 2020 16:04:03 +0000 (19:04 +0300)]
net/mlx5e: Fix VXLAN configuration restore after function reload
When detaching netdev, remove vxlan port configuration using
udp_tunnel_drop_rx_info. During function reload, configuration will be
restored using udp_tunnel_get_rx_info. This ensures sync between
firmware and driver. Use udp_tunnel_get_rx_info even if its physical
interface is down.
Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Vlad Buslov [Wed, 17 Jun 2020 14:51:53 +0000 (17:51 +0300)]
net/mlx5e: Fix usage of rcu-protected pointer
In mlx5e_configure_flower() the flow pointer is protected by the rcu read lock.
However, after the cited commit the pointer is used outside of the rcu read
block. Extend the block to protect all pointer accesses.
Fixes: 553f9328385d ("net/mlx5e: Support tc block sharing for representors")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Vlad Buslov [Wed, 17 Jun 2020 14:26:33 +0000 (17:26 +0300)]
net/mxl5e: Verify that rpriv is not NULL
In the helper function is_flow_rule_duplicate_allowed(), verify that the rpriv
pointer is not NULL before dereferencing it. A NULL rpriv can occur when the
device is in NIC mode, and leads to the following crash:
[90444.046419] BUG: kernel NULL pointer dereference, address: 0000000000000000
[90444.048149] #PF: supervisor read access in kernel mode
[90444.049781] #PF: error_code(0x0000) - not-present page
[90444.051386] PGD 80000003d35a4067 P4D 80000003d35a4067 PUD 3d35a3067 PMD 0
[90444.053051] Oops: 0000 [#1] SMP PTI
[90444.054683] CPU: 16 PID: 31736 Comm: tc Not tainted 5.8.0-rc1+ #1157
[90444.056340] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[90444.058079] RIP: 0010:mlx5e_configure_flower+0x3aa/0x9b0 [mlx5_core]
[90444.059753] Code: 24 50 49 8b 95 08 02 00 00 48 b8 00 08 00 00 04 00 00 00 48 21 c2 48 39 c2 74 0a 41 f6 85 0d 02 00 00 20 74 16 48 8b 44 24 20 <48> 8b 00 66 83 78 20 ff 74 07 4d 89 aa e0 00 00 00 48 83 7d 28 00
[90444.063232] RSP: 0018:ffffabe9c61ff768 EFLAGS: 00010246
[90444.065014] RAX: 0000000000000000 RBX: ffff9b13c4c91e80 RCX: 00000000000093fa
[90444.066784] RDX: 0000000400000800 RSI: 0000000000000000 RDI: 000000000002d5e0
[90444.068533] RBP: ffff9b174d308468 R08: 0000000000000000 R09: ffff9b17d63003f0
[90444.070285] R10: ffff9b17ea288600 R11: 0000000000000000 R12: ffffabe9c61ff878
[90444.072032] R13: ffff9b174d300000 R14: ffffabe9c61ffbb8 R15: ffff9b174d300880
[90444.073760] FS: 00007f3c23775480(0000) GS:ffff9b13efc80000(0000) knlGS:0000000000000000
[90444.075492] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[90444.077266] CR2: 0000000000000000 CR3: 00000003e2a60002 CR4: 00000000001606e0
[90444.079024] Call Trace:
[90444.080753] tc_setup_cb_add+0xca/0x1e0
[90444.082415] fl_hw_replace_filter+0x15f/0x1f0 [cls_flower]
[90444.084119] fl_change+0xa59/0x13dc [cls_flower]
[90444.085772] ? wait_for_completion+0xa8/0xf0
[90444.087364] tc_new_tfilter+0x3f5/0xa60
[90444.088960] rtnetlink_rcv_msg+0xeb/0x360
[90444.090514] ? __d_lookup_done+0x76/0xe0
[90444.092034] ? proc_alloc_inode+0x16/0x70
[90444.093560] ? prep_new_page+0x8c/0xf0
[90444.095048] ? _cond_resched+0x15/0x30
[90444.096483] ? rtnl_calcit.isra.0+0x110/0x110
[90444.097907] netlink_rcv_skb+0x49/0x110
[90444.099289] netlink_unicast+0x191/0x230
[90444.100629] netlink_sendmsg+0x243/0x480
[90444.101984] sock_sendmsg+0x5e/0x60
[90444.103305] ____sys_sendmsg+0x1f3/0x260
[90444.104597] ? copy_msghdr_from_user+0x5c/0x90
[90444.105916] ? __mod_lruvec_state+0x3c/0xe0
[90444.107210] ___sys_sendmsg+0x81/0xc0
[90444.108484] ? do_filp_open+0xa5/0x100
[90444.109732] ? handle_mm_fault+0x117b/0x1e00
[90444.110970] ? __check_object_size+0x46/0x147
[90444.112205] ? __check_object_size+0x136/0x147
[90444.113402] __sys_sendmsg+0x59/0xa0
[90444.114587] do_syscall_64+0x4d/0x90
[90444.115782] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[90444.116953] RIP: 0033:0x7f3c2393b7b8
[90444.118101] Code: Bad RIP value.
[90444.119240] RSP: 002b:00007ffc6ad8e6c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[90444.120408] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3c2393b7b8
[90444.121583] RDX: 0000000000000000 RSI: 00007ffc6ad8e740 RDI: 0000000000000003
[90444.122750] RBP: 000000005eea0c3a R08: 0000000000000001 R09: 00007ffc6ad8e68c
[90444.123928] R10: 0000000000404fa8 R11: 0000000000000246 R12: 0000000000000001
[90444.125073] R13: 0000000000000000 R14: 00007ffc6ad92a00 R15: 00000000004866a0
[90444.126221] Modules linked in: act_skbedit act_tunnel_key act_mirred bonding vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel mlxfw kvm act_ct nf_flow_table nf_nat nf_conntrack irqbypass crct10dif_pclmul nf_defrag_ipv6 igb ipmi_ssif libcrc32c crc32_pclmul crc32c_intel ipmi_si nf_defrag_ipv4 ptp ghash_clmulni_intel mei_me ses iTCO_wdt i2c_i801 pps_core ioatdma iTCO_vendor_support joydev mei enclosure intel_cstate i2c_smbus wmi dca ipmi_devintf intel_uncore lpc_ich ipmi_msghandler pcspkr acpi_pad acpi_power_meter ast i2c_algo_bit drm_vram_helper drm_kms_helper drm_ttm_helper ttm drm mpt3sas raid_class scsi_transport_sas
[90444.136253] CR2: 0000000000000000
[90444.137621] ---[ end trace 924af62aa2b151bd ]---
Fixes: 553f9328385d ("net/mlx5e: Support tc block sharing for representors")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Vu Pham [Wed, 17 Jun 2020 22:11:24 +0000 (15:11 -0700)]
net/mlx5: E-Switch, Fix vlan or qos setting in legacy mode
The refactoring of the eswitch ingress acl code accidentally inserted an extra
memset to zero that clears the vlan and/or qos setting in legacy mode.
Fixes: 07bab9502641 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes")
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Eran Ben Elisha [Sun, 14 Jun 2020 14:31:26 +0000 (17:31 +0300)]
net/mlx5: Fix eeprom support for SFP module
Fix eeprom SFP query support by setting i2c_addr, offset and page number
correctly. Unlike QSFP modules, SFP eeprom params are as follows:
- i2c_addr is 0x50 for offset 0 - 255 and 0x51 for offset 256 - 511.
- Page number is always zero.
- Page offset is always relative to zero.
As part of eeprom query, query the module ID (SFP / QSFP*) via helper
function to set the params accordingly.
In addition, change mlx5_qsfp_eeprom_page() input type to be u16 to avoid
unnecessary casting.
Fixes: a708fb7b1f8d ("net/mlx5e: ethtool, Add support for EEPROM high pages query")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
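The SFP addressing rules listed in the eeprom commit above can be sketched as a small mapping function. The struct, macro and function names are hypothetical. The sketch also reads "page offset is always relative to zero" as meaning that the offset sent to the module is counted from the start of the selected i2c address, which is an interpretation rather than something the commit message spells out.

```c
#include <stdint.h>

#define EEPROM_PAGE_LENGTH 256u
#define I2C_ADDR_LOW  0x50	/* offsets 0 - 255 */
#define I2C_ADDR_HIGH 0x51	/* offsets 256 - 511 */

/* Hypothetical container for the parameters of one SFP eeprom access. */
struct sfp_eeprom_params {
	uint8_t  i2c_addr;
	uint8_t  page;
	uint16_t offset;
};

/* Map a flat eeprom offset (0 - 511) to SFP access parameters. */
static struct sfp_eeprom_params sfp_eeprom_params_for(uint16_t offset)
{
	struct sfp_eeprom_params p = {
		.i2c_addr = I2C_ADDR_LOW,
		.page = 0,		/* page number is always zero for SFP */
		.offset = offset,
	};

	if (offset >= EEPROM_PAGE_LENGTH) {
		p.i2c_addr = I2C_ADDR_HIGH;
		p.offset = offset - EEPROM_PAGE_LENGTH;
	}
	return p;
}
```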
Linus Torvalds [Fri, 10 Jul 2020 01:20:19 +0000 (18:20 -0700)]
Merge tag 'drm-fixes-2020-07-10' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"I've been off most of the week, but some fixes have piled up. Seems a
bit busier than last week, but they are pretty spread out across a
bunch of drivers, none of them seem that big or worried me too much.
amdgpu:
- Fix a suspend/resume issue with PSP
- Backlight fix for Renoir
- Fix for gpu recovery debugging
radeon:
- Fix a double free in error path
i915:
- fbc fencing fix
- debugfs panic fix
- gem vma construction fix
- gem pin under vm->mutex fix
nouveau:
- SVM fixes
- display fixes
meson:
- OSD burst length fixes
hibmc:
- runtime warning fix
mediatek:
- cmdq, mmsys fixes
- visibility check fixes"
* tag 'drm-fixes-2020-07-10' of git://anongit.freedesktop.org/drm/drm: (24 commits)
drm/amdgpu: don't do soft recovery if gpu_recovery=0
drm/radeon: fix double free
drm/amd/display: add dmcub check on RENOIR
drm/amdgpu: add TMR destory function for psp
drm/amdgpu: asd function needs to be unloaded in suspend phase
drm/hisilicon/hibmc: Move drm_fbdev_generic_setup() down to avoid the splat
drm/nouveau/nouveau: fix page fault on device private memory
drm/nouveau/svm: fix migrate page regression
drm/nouveau/i2c/g94-: increase NV_PMGR_DP_AUXCTL_TRANSACTREQ timeout
drm/nouveau/kms/nv50-: bail from nv50_audio_disable() early if audio not enabled
drm/i915/gt: Pin the rings before marking active
drm/i915: Also drop vm.ref along error paths for vma construction
drm/i915: Drop vm.ref for duplicate vma on construction
drm/i915/fbc: Fix fence_y_offset handling
drm/i915: Skip stale object handle for debugfs per-file-stats
drm/mediatek: mtk_hdmi: Remove debug messages for function calls
drm/mediatek: mtk_mt8173_hdmi_phy: Remove unnused const variables
drm/mediatek: Delete not used of_device_get_match_data
drm/mediatek: Remove unnecessary conversion to bool
drm/meson: viu: fix setting the OSD burst length in VIU_OSD1_FIFO_CTRL_STAT
...
Cesar Eduardo Barros [Thu, 9 Jul 2020 22:11:02 +0000 (19:11 -0300)]
Restore gcc check in mips asm/unroll.h
While raising the gcc version requirement to 4.9, the compile-time check
in the unroll macro was accidentally changed from being used on gcc and
clang to being used on clang only.
Restore the gcc check, changing it from "gcc >= 4.7" to "all gcc".
[ We should probably remove this all entirely: if we remove the check
for CLANG, then the check for GCC can go away. Older versions of clang
are not really appropriate or supported for kernel builds - Linus ]
Fixes: 6ec4476ac825 ("Raise gcc version requirement to 4.9")
Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.eti.br>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rikard Falkeborn [Wed, 8 Jul 2020 19:07:56 +0000 (21:07 +0200)]
kbuild: Move -Wtype-limits to W=2
-Wtype-limits is included in -Wextra which is added at W=1. It warns
(among other things) that a comparison of an unsigned variable with `< 0` is
always false. This causes noisy warnings, especially when used in
macros, hence it is more suitable for W=2.
Link: https://lore.kernel.org/lkml/CAHk-=wiKCXEWKJ9dWUimGbrVRo_N2RosESUw8E7m9AEtyZcu=w@mail.gmail.com/
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
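A short illustration of the kind of noise the -Wtype-limits commit above refers to. The macro and function names are hypothetical. The comparison here is `>= 0` rather than `< 0`, which belongs to the same class of diagnostic: a generic macro that is correct for signed arguments produces an always-true comparison when handed an unsigned one.

```c
/* Generic range check, written without knowing the argument's type. */
#define IN_RANGE(x, lo, hi) ((x) >= (lo) && (x) <= (hi))

static int is_valid_index(unsigned int idx)
{
	/* With -Wtype-limits enabled, GCC flags the `idx >= 0` half of the
	 * expansion as always true for an unsigned idx, even though the
	 * macro itself is correct. */
	return IN_RANGE(idx, 0, 15);
}
```

The code behaves correctly either way, which is why the warning is mostly noise in macro-heavy code.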
Cong Wang [Thu, 9 Jul 2020 23:28:44 +0000 (16:28 -0700)]
cgroup: Fix sock_cgroup_data on big-endian.
In order for no_refcnt and is_data to be the lowest order two
bits in the 'val' we have to pad out the bitfield of the u8.
Fixes: ad0f75e5f57c ("cgroup: fix cgroup_sk_alloc() for sk_clone_lock()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
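A user-space sketch of the layout issue in the sock_cgroup_data commit above. The struct name is hypothetical and the field list is a simplified assumption. It relies on the GCC/Clang convention that bit-fields are allocated from the least significant bit on little-endian targets and from the most significant bit on big-endian targets, so the big-endian variant needs explicit padding bits declared before the two flags.

```c
#include <stdint.h>

struct sock_cgroup_data_sketch {
	union {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
		struct {
			uint8_t  is_data : 1;
			uint8_t  no_refcnt : 1;
			uint8_t  unused : 6;	/* pads the flags out to a full byte */
			uint8_t  padding;
			uint16_t prioidx;
			uint32_t classid;
		};
#else
		struct {
			uint32_t classid;
			uint16_t prioidx;
			uint8_t  padding;
			uint8_t  unused : 6;	/* declared first so the flags land in the low bits */
			uint8_t  no_refcnt : 1;
			uint8_t  is_data : 1;
		};
#endif
		uint64_t val;	/* overlays all fields; flags are bits 0 and 1 */
	};
};
```

Without the `unused` bits on big-endian, the two one-bit flags would occupy the top of their byte and no longer line up with the two lowest-order bits of `val`.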
Lorenz Bauer [Thu, 9 Jul 2020 11:51:51 +0000 (12:51 +0100)]
selftests: bpf: Fix detach from sockmap tests
Fix sockmap tests which rely on old bpf_prog_dispatch behaviour.
In the first case, the tests check that detaching without giving
a program succeeds. Since these are not the desired semantics,
invert the condition. In the second case, the clean up code doesn't
supply the necessary program fds.
Fixes: bb0de3131f4c ("bpf: sockmap: Require attach_bpf_fd when detaching a program")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20200709115151.75829-1-lmb@cloudflare.com
Dave Airlie [Thu, 9 Jul 2020 21:02:02 +0000 (07:02 +1000)]
Merge tag 'amd-drm-fixes-5.8-2020-07-09' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.8-2020-07-09:
amdgpu:
- Fix a suspend/resume issue with PSP
- Backlight fix for Renoir
- Fix for gpu recovery debugging
radeon:
- Fix a double free in error path
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200709185221.44895-1-alexander.deucher@amd.com
Dave Airlie [Thu, 9 Jul 2020 21:01:24 +0000 (07:01 +1000)]
Merge tag 'drm-intel-fixes-2020-07-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
One display's fbc patch fixing fence_y_offset calculation
from Ville and 4 patches from Chris on GEM: 1 fixing a debugfs
panic and others fixing vma construction and pin under vm->mutex.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708190654.GA3924867@intel.com
Dave Airlie [Thu, 9 Jul 2020 20:59:05 +0000 (06:59 +1000)]
Merge branch 'linux-5.8' of git://github.com/skeggsb/linux into drm-fixes
- SVM fixes
- display fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/
Dave Airlie [Thu, 9 Jul 2020 20:46:47 +0000 (06:46 +1000)]
Merge tag 'drm-misc-fixes-2020-07-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
* meson: OSD burst-length fixes
* hibmc: fix runtime warning by setting up generic fbdev after
registering device
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708121050.GA29420@linux-uq9g
Dave Airlie [Thu, 9 Jul 2020 20:43:31 +0000 (06:43 +1000)]
Merge tag 'mediatek-drm-fixes-5.8' of https://git./linux/kernel/git/chunkuang.hu/linux into drm-fixes
Mediatek DRM Fixes for Linux 5.8
This includes fixups for cmdq, mmsys, visibility checking and some refinements.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200707153944.604-1-chunkuang.hu@kernel.org
David S. Miller [Thu, 9 Jul 2020 20:15:30 +0000 (13:15 -0700)]
Merge branch 'Expose-port-split-attributes'
Ido Schimmel says:
====================
Expose port split attributes
Danielle says:
Currently, user space has no way of knowing if a port can be split and
into how many ports. Among other things, this makes it impossible to
write generic tests for port split functionality.
Therefore, this set exposes two new devlink port attributes to user
space: Number of lanes and whether the port can be split or not.
Patch set overview:
Patches #1-#4 cleanup 'struct devlink_port_attrs' and reduce the number
of parameters passed between drivers and devlink via
devlink_port_attrs_set()
Patch #5 adds devlink port lanes attributes
Patches #6-#7 add devlink port splittable attribute
Patch #8 exploits the fact that devlink is now aware of port's number of
lanes and whether the port can be split or not and moves some checks
from drivers to devlink
Patch #9 adds a port split test
Changes since v2:
* Remove some local variables from patch #3
* Reword function description in patch #5
* Fix a bug in patch #8
* Add a test for the splittable attribute in patch #9
Changes since v1:
* Rename 'width' attribute to 'lanes'
* Add 'splittable' attribute
* Move checks from drivers to devlink
====================
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Danielle Ratson [Thu, 9 Jul 2020 13:18:22 +0000 (16:18 +0300)]
selftests: net: Add port split test
Test port split configuration using previously added number of port lanes
attribute.
Check that all the splittable ports are successfully split to their maximum
number of lanes and below, and that those which are not splittable fail to
be split.
Test output example:
TEST: swp4 is unsplittable [ OK ]
TEST: split port swp53 into 4 [ OK ]
TEST: Unsplit port pci/0000:03:00.0/25 [ OK ]
TEST: split port swp53 into 2 [ OK ]
TEST: Unsplit port pci/0000:03:00.0/25 [ OK ]
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Danielle Ratson [Thu, 9 Jul 2020 13:18:21 +0000 (16:18 +0300)]
devlink: Move input checks from driver to devlink
Currently, all the input checks are done in the driver.
After adding the split capability to the devlink port, move the checks to
devlink.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
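A sketch of the kind of input check that can live in common code once the lanes and splittable attributes are known, as the commit above describes. The struct and function names are hypothetical, and the specific rules (count of at least 2, a power of two, not larger than the number of lanes, -EINVAL on failure) are assumptions for illustration rather than a copy of the devlink implementation.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical subset of the port attributes relevant to splitting. */
struct port_attrs_sketch {
	unsigned int lanes;
	bool splittable;
};

static bool is_power_of_two(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Validate a split request before it reaches the driver. */
static int port_split_check(const struct port_attrs_sketch *attrs,
			    unsigned int count)
{
	if (!attrs->splittable)
		return -EINVAL;
	if (count < 2 || !is_power_of_two(count) || count > attrs->lanes)
		return -EINVAL;
	return 0;
}
```

Keeping the check in one place means every driver gets the same validation and only has to report its lane count and split ability.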
Danielle Ratson [Thu, 9 Jul 2020 13:18:20 +0000 (16:18 +0300)]
devlink: Add a new devlink port split ability attribute and pass to netlink
Add a new attribute that indicates the split ability of devlink port.
Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Danielle Ratson [Thu, 9 Jul 2020 13:18:19 +0000 (16:18 +0300)]
mlxsw: Set port split ability attribute in driver
Currently, port attributes like flavour, port number and whether the port
was split are set when initializing a port.
Set the split ability of the port as well, based on port_mapping->width
field and split attribute of devlink port in spectrum, so that it could be
easily passed to devlink in the next patch.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>