Jakub Kicinski [Thu, 7 Oct 2021 18:18:46 +0000 (11:18 -0700)]
eth: platform: add a helper for loading netdev->dev_addr
Commit
406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.
There is a handful of drivers which pass netdev->dev_addr as
the destination buffer to eth_platform_get_mac_address().
Add a helper which takes a dev pointer instead, so it can call
an appropriate helper.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 18:18:45 +0000 (11:18 -0700)]
ethernet: un-export nvmem_get_mac_address()
nvmem_get_mac_address() is only called from of_net.c
we don't need the export.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Oct 2021 13:31:01 +0000 (14:31 +0100)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-10-07
Michal Swiatkowski says:
The following patch series introduces basic switchdev model
support in ice driver. Implement the following blocks of
switchdev framework:
- VF port representors creation
- control plane VSI definition
- exception path (a. k. a. "slow-path") - to allow a virtual
switch or linux bridge to receive any packet that doesn't match
any hw filter
- link state management of virtual ports
- query virtual port statistics
Hardware offload support in switchdev mode is out of scope of
this patchset. Devlink interface is used to toggle between
switchdev and legacy (the default) modes of the driver.
---
Note: This series includes the use enum ice_status, however, we have
patches in our queue to remove it from the driver [1]. We are working
through the patches that precede the removal series.
[1] https://patchwork.ozlabs.org/project/intel-wired-lan/list/?series=265957
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 22:24:06 +0000 (15:24 -0700)]
Merge git://git./linux/kernel/git/netdev/net
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 7 Oct 2021 21:11:40 +0000 (14:11 -0700)]
Merge tag 'nfsd-5.15-3' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
"Bug fixes for NFSD error handling paths"
* tag 'nfsd-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: Keep existing listeners on portlist error
SUNRPC: fix sign error causing rpcsec_gss drops
nfsd: Fix a warning for nfsd_file_close_inode
nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zero
nfsd: fix error handling of register_pernet_subsys() in init_nfsd()
Linus Torvalds [Thu, 7 Oct 2021 21:01:29 +0000 (14:01 -0700)]
Merge tag 'armsoc-fixes-5.15' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"This is a larger than normal update for Arm SoC specific code, most of
it in device trees, but also drivers and the omap and at91/sama7
platforms:
- There are four new entries to the MAINTAINERS file: Sven Peter and
Alyssa Rosenzweig for Apple M1, Romain Perier for Mstar/sigmastar,
and Vignesh Raghavendra for TI K3
- Build fixes to address randconfig warnings in sharpsl, dove, omap1,
and qcom platforms as well as the scmi and op-tee subsystems
- Regression fixes for missing CONFIG_FB and other options for
several defconfigs
- Several bug fixes for the newly added Microchip SAMA7 platform,
mostly regarding power management
- Missing SMP barriers to protect accesses to SCMI virtio device
- Regression fixes for TI OMAP, including a boot-time hang on am335x.
- Lots of bug fixes for NXP i.MX, mostly addressing incorrect
settings in devicetree files, and one revert for broken suspend.
- Fixes for ARM Juno/Vexpress devicetree files, addressing a couple
of schema warnings.
- Regression fixes for qualcomm SoC specific drivers and devicetree
files, reverting an mdt_loader change and at least pastially
reverting some of the 5.15 DTS changes, plus some minor bugfixes"
* tag 'armsoc-fixes-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (64 commits)
MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer
MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer
firmware: arm_scmi: Add proper barriers to scmi virtio device
firmware: arm_scmi: Simplify spinlocks in virtio transport
ARM: dts: omap3430-sdp: Fix NAND device node
bus: ti-sysc: Use CLKDM_NOAUTO for dra7 dcan1 for errata i893
ARM: sharpsl_param: work around -Wstringop-overread warning
ARM: defconfig: gemini: Restore framebuffer
ARM: dove: mark 'putc' as inline
ARM: omap1: move omap15xx local bus handling to usb.c
MAINTAINERS: Add Vignesh to TI K3 platform maintainership
arm64: dts: imx8m*-venice-gw7902: fix M2_RST# gpio
ARM: imx6: disable the GIC CPU interface before calling stby-poweroff sequence
arm64: dts: ls1028a: fix eSDHC2 node
arm64: dts: imx8mm-kontron-n801x-som: do not allow to switch off buck2
ARM: dts: at91: sama7g5ek: to not touch slew-rate for SDMMC pins
ARM: dts: at91: sama7g5ek: use proper slew-rate settings for GMACs
ARM: at91: pm: preload base address of controllers in tlb
ARM: at91: pm: group constants and addresses loading
ARM: dts: at91: sama7g5ek: add suspend voltage for ddr3l rail
...
Arnd Bergmann [Thu, 7 Oct 2021 19:14:12 +0000 (21:14 +0200)]
Merge tag 'asahi-soc-fixes-5.15' of https://github.com/AsahiLinux/linux into arm/fixes
Apple SoC fixes for 5.15; just two MAINTAINERS updates.
- MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer
- MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer
* tag 'asahi-soc-fixes-5.15' of https://github.com/AsahiLinux/linux:
MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer
MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer
Link: https://lore.kernel.org/r/a50a9015-0e62-c451-4d0d-668233b35b85@marcan.st
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Thu, 7 Oct 2021 19:14:03 +0000 (21:14 +0200)]
Merge tag 'scmi-fixes-5.15' of git://git./linux/kernel/git/sudeep.holla/linux into arm/fixes
SCMI fixes for v5.15
A few fixes addressing:
- Kconfig dependency between VIRTIO and ARM_SCMI_PROTOCOL
- Link-time error with __exit annotation for virtio_scmi_exit
- Unnecessary nested irqsave/irqrestore spinlocks in virtio transport
- Missing SMP barriers to protect accesses to SCMI virtio device
* tag 'scmi-fixes-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_scmi: Add proper barriers to scmi virtio device
firmware: arm_scmi: Simplify spinlocks in virtio transport
firmware: arm_scmi: Remove __exit annotation
firmware: arm_scmi: Fix virtio transport Kconfig dependency
Link: https://lore.kernel.org/r/20211007102822.27886-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Thu, 7 Oct 2021 19:13:57 +0000 (21:13 +0200)]
Merge tag 'omap-for-v5.15/fixes-rc4' of git://git./linux/kernel/git/tmlind/linux-omap into arm/fixes
Fixes for omaps for v5.15
Few regression fixes for omaps for the v5.15-rc cycle. There is a fix
for boot time hangs that can happen on some am335x devices that started
when the pruss devicetree nodes were added. The other fixes are less
critical:
- Fix compiler warning for sysc_init_soc() that got recently introduced
- Fix external abort for am335x pruss as otherwise some am335x will hang
- Use CLKDM_NOAUTO quirk also for dra7 dcan1
- Fix older NAND device node regression for omap3-sdp
* tag 'omap-for-v5.15/fixes-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap3430-sdp: Fix NAND device node
bus: ti-sysc: Use CLKDM_NOAUTO for dra7 dcan1 for errata i893
soc: ti: omap-prm: Fix external abort for am335x pruss
bus: ti-sysc: Add break in switch statement in sysc_init_soc()
Link: https://lore.kernel.org/r/pull-1633609552-789682@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Linus Torvalds [Thu, 7 Oct 2021 18:20:08 +0000 (11:20 -0700)]
Merge tag 'misc-fixes-
20211007' of git://git./linux/kernel/git/dhowells/linux-fs
Pull netfslib, cachefiles and afs fixes from David Howells:
- Fix another couple of oopses in cachefiles tracing stemming from the
possibility of passing in a NULL object pointer
- Fix netfs_clear_unread() to set READ on the iov_iter so that source
it is passed to doesn't do the wrong thing (some drivers look at the
flag on iov_iter rather than other available information to determine
the direction)
- Fix afs_launder_page() to write back at the correct file position on
the server so as not to corrupt data
* tag 'misc-fixes-
20211007' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Fix afs_launder_page() to set correct start file position
netfs: Fix READ/WRITE confusion when calling iov_iter_xarray()
cachefiles: Fix oops with cachefiles_cull() due to NULL object
Linus Torvalds [Thu, 7 Oct 2021 17:58:42 +0000 (10:58 -0700)]
Merge tag 'perf-tools-fixes-for-v5.15-2021-10-07' of git://git./linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix plugin static linking with libopencsd on ARM and ARM64
- Add missing -lstdc++ when linking with libopencsd
- Add missing topdown metrics events to 'perf test attr'
- Plug leak sys_event_tables list after processing JSON vendor events
entries
- Sync sound/asound.h copy with the kernel sources
* tag 'perf-tools-fixes-for-v5.15-2021-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf tests attr: Add missing topdown metrics events
tools include UAPI: Sync sound/asound.h copy with the kernel sources
perf build: Fix plugin static linking with libopencsd on ARM and ARM64
perf build: Add missing -lstdc++ when linking with libopencsd
perf jevents: Free the sys_event_tables list after processing entries
Wojciech Drewek [Fri, 20 Aug 2021 00:08:59 +0000 (17:08 -0700)]
ice: add port representor ethtool ops and stats
Introduce the following ethtool operations for VF's representor:
-get_drvinfo
-get_strings
-get_ethtool_stats
-get_sset_count
-get_link
In all cases, existing operations were used with minor
changes which allow us to detect if ethtool op was called for
representor. Only VF VSI stats will be available for representor.
Implement ndo_get_stats64 for port representor. This will update
VF VSI stats and read them.
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 20 Aug 2021 00:08:58 +0000 (17:08 -0700)]
ice: switchdev slow path
Slow path means allowing packet to go from uplink to representor
and from representor to correct VF on Rx site and from VF to
representor and to uplink on Tx site.
To accomplish this driver, has to set correct Tx descriptor. When
packet is sent from representor to VF, destination should be
set to VF VSI. When packet is sent from uplink port destination
should be uplink to bypass switch infrastructure and send packet
outside.
On Rx site driver should check source VSI field from Rx descriptor
and based on that forward packed to correct netdev. To allow
this there is a target netdevs table in control plane VSI
struct.
Co-developed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 20 Aug 2021 00:08:57 +0000 (17:08 -0700)]
ice: rebuild switchdev when resetting all VFs
As resetting all VFs behaves mostly like creating new VFs also
eswitch infrastructure has to be recreated. The easiest way to
do that is to rebuild eswitch after resetting VFs.
Implement helper functions to start and stop all representors
queues. This is used to disable traffic on port representors.
In rebuild path:
- NAPI has to be disabled
- eswitch environment has to be set up
- new port representors have to be created, because the old
one had pointer to not existing VFs
- new control plane VSI ring should be remapped
- NAPI hast to be enabled
- rxdid has to be set to FLEX_NIC_2, because this descriptor id
support source_vsi, which is needed on control plane VSI queues
- port representors queues have to be started
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 20 Aug 2021 00:08:56 +0000 (17:08 -0700)]
ice: enable/disable switchdev when managing VFs
Only way to enable switchdev is to create VFs when the eswitch
mode is set to switchdev. Check if correct mode is set and
enable switchdev in function which creating VFs.
Disable switchdev when user change number of VFs to 0. Changing
eswitch mode back to legacy when VFs are created in switchdev
mode isn't allowed.
As switchdev takes care of managing filter rules, adding new
rules on VF is blocked.
In case of resetting VF driver has to update pointer in ice_repr
struct, because after reset VSI related things can change.
Co-developed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 20 Aug 2021 00:08:55 +0000 (17:08 -0700)]
ice: introduce new type of VSI for switchdev
New type of VSI has to be defined for switchdev control plane
VSI. Number of allocated Tx and Rx queue has to be equal to
amount of VFs, because each port representor should have one
Tx and Rx queue.
Also to not increase number of used irqs too much, control plane
VSI uses only one q_vector and handle all queues in one irq.
To allow handling all queues in one irq , new function to clean
msix for eswitch was introduced. This function will schedule napi
for each representor instead of scheduling it only for one like in
normal clean irq function.
Only one additional msix has to be requested. Always try to request
it in ice_ena_msix_range function.
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 20 Aug 2021 00:08:54 +0000 (17:08 -0700)]
ice: set and release switchdev environment
Switchdev environment has to be set up when user create VFs
and eswitch mode is switchdev. Release is done when user
delete all VFs.
Data path in this implementation is based on control plane VSI.
This VSI is used to pass traffic from port representors to
corresponding VFs and vice versa. Default TX rule has to be
added to forward packet to control plane VSI. This will redirect
packets from VFs which don't match other rules to control plane
VSI.
On RX side default rule is added on uplink VSI to receive all
traffic that doesn't match other rules. When setting switchdev
environment all other rules from VFs should be removed. Packet to
VFs will be forwarded by control plane VSI.
As VF without any mac rules can't send any packet because of
antispoof mechanism, VSI antispoof should be turned off on each VFs.
To send packet from representor to correct VSI, destination VSI
field in TX descriptor will have to be filled. Allow that by
setting destination override bit in control plane VSI security config.
Packet from VFs will be received on control plane VSI. Driver
should decide to which netdev forward the packet. Decision is
made based on src_vsi field from descriptor. There is a target
netdev list in control plane VSI struct which choose netdev
based on src_vsi number.
Co-developed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Swiatkowski [Fri, 20 Aug 2021 00:08:53 +0000 (17:08 -0700)]
ice: allow changing lan_en and lb_en on dflt rules
There is no way to change default lan_en and lb_en flags while
adding new rule. Add function that allows changing these flags
on ICE_SW_LKUP_DFLT recipe and any rule id.
lan_en allows packet to go outside if rule is matched. Clearing
this bit will block packet from sending it outside.
lb_en allows packet to be forwarded to other VSI. Clearing
this bit will block packet from forwarding it to other VSI.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Swiatkowski [Fri, 20 Aug 2021 00:08:52 +0000 (17:08 -0700)]
ice: manage VSI antispoof and destination override
Implement functions to make setting VSI security config easier.
Main function ice_update_security fills security section field and
checks against error in updating VSI. Reset functions are responsible
for correct filling config according to user expectations.
This helper is needed because destination override is located in
this section. Driver has to set this bit to allow strering Tx packet
on VSI based on value in Tx descriptors.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Swiatkowski [Fri, 20 Aug 2021 00:08:51 +0000 (17:08 -0700)]
ice: allow process VF opcodes in different ways
In switchdev driver shouldn't add MAC, VLAN and promisc
filters on iavf demand but should return success to not
break normal iavf flow.
Achieve that by creating table of functions pointer with
default functions used to parse iavf command. While parse
iavf command, call correct function from table instead of
calling function direct.
When port representors are being created change functions
in table to new one that behaves correctly for switchdev
puprose (ignoring new filters).
Change back to default ops when representors are being
removed.
Co-developed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Swiatkowski [Fri, 20 Aug 2021 00:08:50 +0000 (17:08 -0700)]
ice: introduce VF port representor
Port representor is used to manage VF from host side. To allow
it each created representor registers netdevice with random hw
address. Also devlink port is created for all representors.
Port representor name is created based on switch id or managed
by devlink core if devlink port was registered with success.
Open and stop ndo ops are implemented to allow managing the VF
link state. Link state is tracked in VF struct.
Struct ice_netdev_priv is extended by pointer to representor
field. This is needed to get correct representor from netdev
struct mostly used in ndo calls.
Implement helper functions to check if given netdev is netdev of
port representor (ice_is_port_repr_netdev) and to get representor
from netdev (ice_netdev_to_repr).
As driver mostly will create or destroy port representors on all
VFs instead of on single one, write functions to add and remove
representor for each VF.
Representor struct contains pointer to source VSI, which is VSI
configured on VF, backpointer to VF, backpointer to netdev,
q_vector pointer and metadata_dst which will be used in data path.
Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Wojciech Drewek [Fri, 20 Aug 2021 00:08:49 +0000 (17:08 -0700)]
ice: Move devlink port to PF/VF struct
Keeping devlink port inside VSI data structure causes some issues.
Since VF VSI is released during reset that means that we have to
unregister devlink port and register it again every time reset is
triggered. With the new changes in devlink API it
might cause deadlock issues. After calling
devlink_port_register/devlink_port_unregister devlink API is going to
lock rtnl_mutex. It's an issue when VF reset is triggered in netlink
operation context (like setting VF MAC address or VLAN),
because rtnl_lock is already taken by netlink. Another call of
rtnl_lock from devlink API results in dead-lock.
By moving devlink port to PF/VF we avoid creating/destroying it
during reset. Since this patch, devlink ports are created during
ice_probe, destroyed during ice_remove for PF and created during
ice_repr_add, destroyed during ice_repr_rem for VF.
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Swiatkowski [Fri, 20 Aug 2021 00:08:48 +0000 (17:08 -0700)]
ice: support basic E-Switch mode control
Write set and get eswitch mode functions used by devlink
ops. Use new pf struct member eswitch_mode to track current
eswitch mode in driver.
Changing eswitch mode is only allowed when there are no
VFs created.
Create new file for eswitch related code.
Add config flag ICE_SWITCHDEV to allow user to choose if
switchdev support should be enabled or disabled.
Use case examples:
- show current eswitch mode ('legacy' is the default one)
[root@localhost]# devlink dev eswitch show pci/0000:03:00.1
pci/0000:03:00.1: mode legacy
- move to 'switchdev' mode
[root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode
switchdev
[root@localhost]# devlink dev eswitch show pci/0000:03:00.1
pci/0000:03:00.1: mode switchdev
- create 2 VFs
[root@localhost]# echo 2 > /sys/class/net/ens4f1/device/sriov_numvfs
- unsuccessful attempt to change eswitch mode while VFs are created
[root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode legacy
devlink answers: Operation not supported
- destroy VFs
[root@localhost]# echo 0 > /sys/class/net/ens4f1/device/sriov_numvfs
- restore 'legacy' mode
[root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode legacy
[root@localhost]# devlink dev eswitch show pci/0000:03:00.1
pci/0000:03:00.1: mode legacy
Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Linus Torvalds [Thu, 7 Oct 2021 16:50:31 +0000 (09:50 -0700)]
Merge tag 'net-5.15-rc5' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from xfrm, bpf, netfilter, and wireless.
Current release - regressions:
- xfrm: fix XFRM_MSG_MAPPING ABI breakage caused by inserting a new
value in the middle of an enum
- unix: fix an issue in unix_shutdown causing the other end
read/write failures
- phy: mdio: fix memory leak
Current release - new code bugs:
- mlx5e: improve MQPRIO resiliency against bad configs
Previous releases - regressions:
- bpf: fix integer overflow leading to OOB access in map element
pre-allocation
- stmmac: dwmac-rk: fix ethernet on rk3399 based devices
- netfilter: conntrack: fix boot failure with
nf_conntrack.enable_hooks=1
- brcmfmac: revert using ISO3166 country code and 0 rev as fallback
- i40e: fix freeing of uninitialized misc IRQ vector
- iavf: fix double unlock of crit_lock
Previous releases - always broken:
- bpf, arm: fix register clobbering in div/mod implementation
- netfilter: nf_tables: correct issues in netlink rule change event
notifications
- dsa: tag_dsa: fix mask for trunked packets
- usb: r8152: don't resubmit rx immediately to avoid soft lockup on
device unplug
- i40e: fix endless loop under rtnl if FW fails to correctly respond
to capability query
- mlx5e: fix rx checksum offload coexistence with ipsec offload
- mlx5: force round second at 1PPS out start time and allow it only
in supported clock modes
- phy: pcs: xpcs: fix incorrect CL37 AN sequence, EEE disable
sequence
Misc:
- xfrm: slightly rejig the new policy uAPI to make it less cryptic"
* tag 'net-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
net: prefer socket bound to interface when not in VRF
iavf: fix double unlock of crit_lock
i40e: Fix freeing of uninitialized misc IRQ vector
i40e: fix endless loop under rtnl
dt-bindings: net: dsa: marvell: fix compatible in example
ionic: move filter sync_needed bit set
gve: report 64bit tx_bytes counter from gve_handle_report_stats()
gve: fix gve_get_stats()
rtnetlink: fix if_nlmsg_stats_size() under estimation
gve: Properly handle errors in gve_assign_qpl
gve: Avoid freeing NULL pointer
gve: Correct available tx qpl check
unix: Fix an issue in unix_shutdown causing the other end read/write failures
net: stmmac: trigger PCS EEE to turn off on link down
net: pcs: xpcs: fix incorrect steps on disable EEE
netlink: annotate data races around nlk->bound
net: pcs: xpcs: fix incorrect CL37 AN sequence
net: sfp: Fix typo in state machine debug string
net/sched: sch_taprio: properly cancel timer from taprio_destroy()
net: bridge: fix under estimation in br_get_linkxstats_size()
...
Linus Torvalds [Thu, 7 Oct 2021 16:44:48 +0000 (09:44 -0700)]
Merge tag 'hyperv-fixes-signed-
20211007' of git://git./linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Replace uuid.h with types.h in a header (Andy Shevchenko)
- Avoid sleeping in atomic context in PCI driver (Long Li)
- Avoid sending IPI to self when it shouldn't (Vitaly Kuznetsov)
* tag 'hyperv-fixes-signed-
20211007' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Avoid erroneously sending IPI to 'self'
hyper-v: Replace uuid.h with types.h
PCI: hv: Fix sleep while in non-sleep context when removing child devices from the bus
Sven Peter [Thu, 7 Oct 2021 05:34:30 +0000 (07:34 +0200)]
MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer
Hector suggested I should add myself to help him maintain the
platform.
Acked-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Sven Peter <sven@svenpeter.dev>
Alyssa Rosenzweig [Mon, 23 Aug 2021 15:17:38 +0000 (11:17 -0400)]
MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer
Add myself as a reviewer for Asahi Linux (Apple M1) patches.
I would like to be CC'ed on Asahi Linux patches for review and testing.
I am also collecting Asahi Linux patches downstream, rebasing on
linux-next periodically, and would like to be notified of what to
cherry-pick from lists.
Cc: Hector Martin <marcan@marcan.st>
Cc: Sven Peter <sven@svenpeter.dev>
Acked-by: Hector Martin <marcan@marcan.st>
Acked-by: Sven Peter <sven@svenpeter.dev>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Gustavo A. R. Silva [Wed, 6 Oct 2021 18:11:15 +0000 (13:11 -0500)]
ethernet: ti: cpts: Use devm_kcalloc() instead of devm_kzalloc()
Use 2-factor multiplication argument form devm_kcalloc() instead
of devm_kzalloc().
Link: https://github.com/KSPP/linux/issues/162
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20211006181115.GA913499@embeddedor
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gustavo A. R. Silva [Wed, 6 Oct 2021 18:09:44 +0000 (13:09 -0500)]
net: stmmac: selftests: Use kcalloc() instead of kzalloc()
Use 2-factor multiplication argument form kcalloc() instead
of kzalloc().
Link: https://github.com/KSPP/linux/issues/162
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20211006180944.GA913477@embeddedor
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gustavo A. R. Silva [Wed, 6 Oct 2021 18:09:27 +0000 (13:09 -0500)]
net: mana: Use kcalloc() instead of kzalloc()
Use 2-factor multiplication argument form kcalloc() instead
of kzalloc().
Link: https://github.com/KSPP/linux/issues/162
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/20211006180927.GA913456@embeddedor
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gustavo A. R. Silva [Wed, 6 Oct 2021 18:08:43 +0000 (13:08 -0500)]
net: broadcom: bcm4908_enet: use kcalloc() instead of kzalloc()
Use 2-factor multiplication argument form kcalloc() instead
of kzalloc().
Link: https://github.com/KSPP/linux/issues/162
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20211006180843.GA913399@embeddedor
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mike Manning [Tue, 5 Oct 2021 13:03:42 +0000 (14:03 +0100)]
net: prefer socket bound to interface when not in VRF
The commit
6da5b0f027a8 ("net: ensure unbound datagram socket to be
chosen when not in a VRF") modified compute_score() so that a device
match is always made, not just in the case of an l3mdev skb, then
increments the score also for unbound sockets. This ensures that
sockets bound to an l3mdev are never selected when not in a VRF.
But as unbound and bound sockets are now scored equally, this results
in the last opened socket being selected if there are matches in the
default VRF for an unbound socket and a socket bound to a dev that is
not an l3mdev. However, handling prior to this commit was to always
select the bound socket in this case. Reinstate this handling by
incrementing the score only for bound sockets. The required isolation
due to choosing between an unbound socket and a socket bound to an
l3mdev remains in place due to the device match always being made.
The same approach is taken for compute_score() for stream sockets.
Fixes:
6da5b0f027a8 ("net: ensure unbound datagram socket to be chosen when not in a VRF")
Fixes:
e78190581aff ("net: ensure unbound stream socket to be chosen when not in a VRF")
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/cf0a8523-b362-1edf-ee78-eef63cbbb428@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 7 Oct 2021 14:11:32 +0000 (07:11 -0700)]
Merge https://git./linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2021-10-07
We've added 7 non-merge commits during the last 8 day(s) which contain
a total of 8 files changed, 38 insertions(+), 21 deletions(-).
The main changes are:
1) Fix ARM BPF JIT to preserve caller-saved regs for DIV/MOD JIT-internal
helper call, from Johan Almbladh.
2) Fix integer overflow in BPF stack map element size calculation when
used with preallocation, from Tatsuhiko Yasumatsu.
3) Fix an AF_UNIX regression due to added BPF sockmap support related
to shutdown handling, from Jiang Wang.
4) Fix a segfault in libbpf when generating light skeletons from objects
without BTF, from Kumar Kartikeya Dwivedi.
5) Fix a libbpf memory leak in strset to free the actual struct strset
itself, from Andrii Nakryiko.
6) Dual-license bpf_insn.h similarly as we did for libbpf and bpftool,
with ACKs from all contributors, from Luca Boccassi.
====================
Link: https://lore.kernel.org/r/20211007135010.21143-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Thu, 7 Oct 2021 12:42:40 +0000 (13:42 +0100)]
Merge tag 'wireless-drivers-next-2021-10-07' of git://git./linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for v5.16
First set of patches for v5.16. ath11k getting most of new features
this time. Other drivers also have few new features, and of course the
usual set of fixes and cleanups all over.
Major changes:
rtw88
* support adaptivity for ETSI/JP DFS region
* 8821c: support RFE type4 wifi NIC
brcmfmac
* DMI nvram filename quirk for Cyberbook T116 tablet
ath9k
* load calibration data and pci init values via nvmem subsystem
ath11k
* include channel rx and tx time in survey dump statistics
* support for setting fixed Wi-Fi 6 rates from user space
* support for 80P80 and 160 MHz bandwidths
* spectral scan support for QCN9074
* support for calibration data files per radio
* support for calibration data via eeprom
* support for rx decapsulation offload (data frames in 802.3 format)
* support channel 2 in 6 GHz band
ath10k
* include frame time stamp in beacon and probe response frames
wcn36xx
* enable Idle Mode Power Save (IMPS) to reduce power consumption during idle
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 7 Oct 2021 12:39:52 +0000 (13:39 +0100)]
Merge branch 'dev_addr-fw-helpers'
Jakub Kicinski says:
====================
net: add a helpers for loading netdev->dev_addr from FW
We're trying to make all writes to netdev->dev_addr go via helpers.
A lot of places pass netdev->dev_addr to of_get_ethdev_address() and
device_get_ethdev_addr() so this set adds new functions which wrap
the functionality.
v2 performs suggested code moves, adds a couple additional clean ups
on the device property side, and an extra patch converting drivers
which can benefit from device_get_ethdev_address().
v3 removes OF_NET and corrects kdoc.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 01:07:02 +0000 (18:07 -0700)]
ethernet: make more use of device_get_ethdev_address()
Convert a few drivers to device_get_ethdev_address(),
saving a few LoC.
The check if addr is valid in netsec is superfluous,
device_get_ethdev_addr() already checks that (in
fwnode_get_mac_addr()).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 01:07:01 +0000 (18:07 -0700)]
ethernet: use device_get_ethdev_address()
Use the new device_get_ethdev_address() helper for the cases
where dev->dev_addr is passed in directly as the destination.
@@
expression dev, np;
@@
- device_get_mac_address(np, dev->dev_addr, ETH_ALEN)
+ device_get_ethdev_address(np, dev)
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 01:07:00 +0000 (18:07 -0700)]
eth: fwnode: add a helper for loading netdev->dev_addr
Commit
406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.
There is a handful of drivers which pass netdev->dev_addr as
the destination buffer to device_get_mac_address(). Add a helper
which takes a dev pointer instead, so it can call an appropriate
helper.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 01:06:59 +0000 (18:06 -0700)]
eth: fwnode: remove the addr len from mac helpers
All callers pass in ETH_ALEN and the function itself
will return -EINVAL for any other address length.
Just assume it's ETH_ALEN like all other mac address
helpers (nvm, of, platform).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 01:06:58 +0000 (18:06 -0700)]
eth: fwnode: change the return type of mac address helpers
fwnode_get_mac_address() and device_get_mac_address()
return a pointer to the buffer that was passed to them
on success or NULL on failure. None of the callers
care about the actual value, only if it's NULL or not.
These semantics differ from of_get_mac_address() which
returns an int so to avoid confusion make the device
helpers return an errno.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 01:06:57 +0000 (18:06 -0700)]
device property: move mac addr helpers to eth.c
Move the mac address helpers out, eth.c already contains
a bunch of similar helpers.
Suggested-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 01:06:56 +0000 (18:06 -0700)]
ethernet: use of_get_ethdev_address()
Use the new of_get_ethdev_address() helper for the cases
where dev->dev_addr is passed in directly as the destination.
@@
expression dev, np;
@@
- of_get_mac_address(np, dev->dev_addr)
+ of_get_ethdev_address(np, dev)
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 01:06:55 +0000 (18:06 -0700)]
of: net: add a helper for loading netdev->dev_addr
Commit
406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.
There are roughly 40 places where netdev->dev_addr is passed
as the destination to a of_get_mac_address() call. Add a helper
which takes a dev pointer instead, so it can call an appropriate
helper.
Note that of_get_mac_address() already assumes the address is
6 bytes long (ETH_ALEN) so use eth_hw_addr_set().
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 01:06:54 +0000 (18:06 -0700)]
of: net: move of_net under net/
Rob suggests to move of_net.c from under drivers/of/ somewhere
to the networking code.
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 7 Oct 2021 12:35:10 +0000 (13:35 +0100)]
Merge branch 'nfc-pn533-const'
Rikard Falkeborn says:
====================
nfc: pn533: Constify ops-structs
Constify a couple of ops-structs. This allows the compiler to put the
static structs in read-only memory.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Rikard Falkeborn [Wed, 6 Oct 2021 22:47:38 +0000 (00:47 +0200)]
nfc: pn533: Constify pn533_phy_ops
Neither the driver or the core modifies the pn533_phy_ops struct, so
make them const to allow the compiler to put the static structs in
read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rikard Falkeborn [Wed, 6 Oct 2021 22:47:37 +0000 (00:47 +0200)]
nfc: pn533: Constify serdev_device_ops
The only usage of pn532_serdev_ops is to pass its address to
serdev_device_set_client_ops(), which takes a pointer to const
serdev_device_ops as argument. Make it const to allow the compiler to
put it in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 7 Oct 2021 11:44:41 +0000 (12:44 +0100)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/
ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2021-10-07
1) Fix a sysbot reported shift-out-of-bounds in xfrm_get_default.
From Pavel Skripkin.
2) Fix XFRM_MSG_MAPPING ABI breakage. The new XFRM_MSG_MAPPING
messages were accidentally not paced at the end.
Fix by Eugene Syromiatnikov.
3) Fix the uapi for the default policy, use explicit field and macros
and make it accessible to userland.
From Nicolas Dichtel.
4) Fix a missing rcu lock in xfrm_notify_userpolicy().
From Nicolas Dichtel.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 7 Oct 2021 11:38:15 +0000 (12:38 +0100)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/net-
queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-10-06
This series contains updates to i40e and iavf drivers.
Jiri Benc expands an error check to prevent infinite loop for i40e.
Sylwester prevents freeing of uninitialized IRQ vector to resolve a
kernel oops for i40e.
Stefan Assmann fixes a double mutex unlock for iavf.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 7 Oct 2021 01:26:36 +0000 (18:26 -0700)]
Merge tag 'devicetree-fixes-for-5.15-3' of git://git./linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Add another allowed address for TI sn65dsi86
- Drop more redundant minItems/maxItems
- Fix more graph 'unevaluatedProperties' warnings in media bindings
* tag 'devicetree-fixes-for-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: drm/bridge: ti-sn65dsi86: Fix reg value
dt-bindings: Drop more redundant 'maxItems/minItems'
dt-bindings: media: Fix more graph 'unevaluatedProperties' related warnings
Jakub Kicinski [Thu, 7 Oct 2021 00:49:21 +0000 (17:49 -0700)]
Merge branch 'add-mdiobus_modify_changed-helper'
Russell King says:
====================
Add mdiobus_modify_changed() helper
Sean Anderson's recent patch series is introducing more read-write
operations on the MDIO bus that only need to happen if a change is
being made.
We have similar logic in __mdiobus_modify_changed(), but we didn't
add its correponding locked variant mdiobus_modify_changed() as we
had very few users. Now that we are getting more, let's add the
helper.
====================
Link: https://lore.kernel.org/r/YV2UIa2eU+UjmWaE@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Wed, 6 Oct 2021 12:19:25 +0000 (13:19 +0100)]
net: phylink: use mdiobus_modify_changed() helper
Use the mdiobus_modify_changed() helper in the C22 PCS advertisement
helper.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Wed, 6 Oct 2021 12:19:20 +0000 (13:19 +0100)]
net: mdio: add mdiobus_modify_changed()
Add mdiobus_modify_changed() helper to reflect the phylib and similar
equivalents. This will avoid this functionality being open-coded, as
has already happened in phylink, and it looks like other users will be
appearing soon.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 7 Oct 2021 00:47:52 +0000 (17:47 -0700)]
Merge branch 'ethtool-add-ability-to-control-transceiver-modules-power-mode'
Ido Schimmel says:
====================
ethtool: Add ability to control transceiver modules' power mode
This patchset extends the ethtool netlink API to allow user space to
control transceiver modules. Two specific APIs are added, but the plan
is to extend the interface with more APIs in the future (see "Future
plans").
This submission is a complete rework of a previous submission [1] that
tried to achieve the same goal by allowing user space to write to the
EEPROMs of these modules. It was rejected as it could have enabled user
space binary blob drivers.
However, the main issue is that by directly writing to some pages of
these EEPROMs, we are interfering with the entity that is controlling
the modules (kernel / device firmware). In addition, some functionality
cannot be implemented solely by writing to the EEPROM, as it requires
the assertion / de-assertion of hardware signals (e.g., "ResetL" pin in
SFF-8636).
Motivation
==========
The kernel can currently dump the contents of module EEPROMs to user
space via the ethtool legacy ioctl API or the new netlink API. These
dumps can then be parsed by ethtool(8) according to the specification
that defines the memory map of the EEPROM. For example, SFF-8636 [2] for
QSFP and CMIS [3] for QSFP-DD.
In addition to read-only elements, these specifications also define
writeable elements that can be used to control the behavior of the
module. For example, controlling whether the module is put in low or
high power mode to limit its power consumption.
The CMIS specification even defines a message exchange mechanism (CDB,
Command Data Block) on top of the module's memory map. This allows the
host to send various commands to the module. For example, to update its
firmware.
Implementation
==============
The ethtool netlink API is extended with two new messages,
'ETHTOOL_MSG_MODULE_SET' and 'ETHTOOL_MSG_MODULE_GET', that allow user
space to set and get transceiver module parameters. Specifically, the
'ETHTOOL_A_MODULE_POWER_MODE_POLICY' attribute allows user space to
control the power mode policy of the module in order to limit its power
consumption. See detailed description in patch #1.
The user API is designed to be generic enough so that it could be used
for modules with different memory maps (e.g., SFF-8636, CMIS).
The only implementation of the device driver API in this series is for a
MAC driver (mlxsw) where the module is controlled by the device's
firmware, but it is designed to be generic enough so that it could also
be used by implementations where the module is controlled by the kernel.
Testing and introspection
=========================
See detailed description in patches #1 and #5.
Patchset overview
=================
Patch #1 adds the initial infrastructure in ethtool along with the
ability to control transceiver modules' power mode.
Patches #2-#3 add required device registers in mlxsw.
Patch #4 implements in mlxsw the ethtool operations added in patch #1.
Patch #5 adds extended link states in order to allow user space to
troubleshoot link down issues related to transceiver modules.
Patch #6 adds support for these extended states in mlxsw.
Future plans
============
* Extend 'ETHTOOL_MSG_MODULE_SET' to control Tx output among other
attributes.
* Add new ethtool message(s) to update firmware on transceiver modules.
* Extend ethtool(8) to parse more diagnostic information from CMIS
modules. No kernel changes required.
[1] https://lore.kernel.org/netdev/
20210623075925.2610908-1-idosch@idosch.org/
[2] https://members.snia.org/document/dl/26418
[3] http://www.qsfp-dd.com/wp-content/uploads/2021/05/CMIS5p0.pdf
Previous versions:
[4] https://lore.kernel.org/netdev/
20211003073219.1631064-1-idosch@idosch.org/
[5] https://lore.kernel.org/netdev/
20210824130344.1828076-1-idosch@idosch.org/
[6] https://lore.kernel.org/netdev/
20210818155202.1278177-1-idosch@idosch.org/
[7] https://lore.kernel.org/netdev/
20210809102152.719961-1-idosch@idosch.org/
====================
Link: https://lore.kernel.org/r/20211006104647.2357115-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Wed, 6 Oct 2021 10:46:47 +0000 (13:46 +0300)]
mlxsw: Add support for transceiver module extended state
Add support for the transceiver module extended state and sub-state
added in previous patch. The extended state is meant to describe link
issues related to transceiver modules.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Wed, 6 Oct 2021 10:46:46 +0000 (13:46 +0300)]
ethtool: Add transceiver module extended state
Add an extended state and sub-state to describe link issues related to
transceiver modules.
The 'ETHTOOL_LINK_EXT_SUBSTATE_MODULE_CMIS_NOT_READY' extended sub-state
tells user space that port is unable to gain a carrier because the CMIS
Module State Machine did not reach the ModuleReady (Fully Operational)
state. For example, if the module is stuck at ModuleLowPwr or
ModuleFault state. In case of the latter, user space can read the fault
reason from the module's EEPROM and potentially reset it.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Wed, 6 Oct 2021 10:46:45 +0000 (13:46 +0300)]
mlxsw: Add ability to control transceiver modules' power mode
Implement support for ethtool_ops::.get_module_power_mode and
ethtool_ops::set_module_power_mode.
The get operation is implemented using the Management Cable IO and
Notifications (MCION) register that reports the operational power mode
of the module and its presence. In case a module is not present, its
operational power mode is not reported to ethtool and user space. If not
set before, the power mode policy is reported as "high", which is the
default on Mellanox systems.
The set operation is implemented using the Port Module Memory Map
Properties (PMMP) register. The register instructs the device's firmware
to transition a plugged-in module to / out of low power mode by writing
to its memory map.
When the power mode policy is set to 'auto', a module will not
transition to low power mode as long as any ports using it are
administratively up. Example:
# devlink port split swp11 count 4
# ethtool --set-module swp11s0 power-mode-policy auto
$ ethtool --show-module swp11s0
Module parameters for swp11s0:
power-mode-policy auto
power-mode low
# ip link set dev swp11s0 up
# ip link set dev swp11s1 up
$ ethtool --show-module swp11s0
Module parameters for swp11s0:
power-mode-policy auto
power-mode high
# ip link set dev swp11s1 down
$ ethtool --show-module swp11s0
Module parameters for swp11s0:
power-mode-policy auto
power-mode high
# ip link set dev swp11s0 down
$ ethtool --show-module swp11s0
Module parameters for swp11s0:
power-mode-policy auto
power-mode low
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Wed, 6 Oct 2021 10:46:44 +0000 (13:46 +0300)]
mlxsw: reg: Add Management Cable IO and Notifications register
Add the Management Cable IO and Notifications register. It will be used
to retrieve the power mode status of a module in subsequent patches and
whether a module is present in a cage or not.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Wed, 6 Oct 2021 10:46:43 +0000 (13:46 +0300)]
mlxsw: reg: Add Port Module Memory Map Properties register
Add the Port Module Memory Map Properties register. It will be used to
set the power mode of a module in subsequent patches.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Wed, 6 Oct 2021 10:46:42 +0000 (13:46 +0300)]
ethtool: Add ability to control transceiver modules' power mode
Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and
'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver
modules parameters and retrieve their status.
The first parameter to control is the power mode of the module. It is
only relevant for paged memory modules, as flat memory modules always
operate in low power mode.
When a paged memory module is in low power mode, its power consumption
is reduced to the minimum, the management interface towards the host is
available and the data path is deactivated.
User space can choose to put modules that are not currently in use in
low power mode and transition them to high power mode before putting the
associated ports administratively up. This is useful for user space that
favors reduced power consumption and lower temperatures over reduced
link up times. In QSFP-DD modules the transition from low power mode to
high power mode can take a few seconds and this transition is only
expected to get longer with future / more complex modules.
User space can control the power mode of the module via the power mode
policy attribute ('ETHTOOL_A_MODULE_POWER_MODE_POLICY'). Possible
values:
* high: Module is always in high power mode.
* auto: Module is transitioned by the host to high power mode when the
first port using it is put administratively up and to low power mode
when the last port using it is put administratively down.
The operational power mode of the module is available to user space via
the 'ETHTOOL_A_MODULE_POWER_MODE' attribute. The attribute is not
reported to user space when a module is not plugged-in.
The user API is designed to be generic enough so that it could be used
for modules with different memory maps (e.g., SFF-8636, CMIS).
The only implementation of the device driver API in this series is for a
MAC driver (mlxsw) where the module is controlled by the device's
firmware, but it is designed to be generic enough so that it could also
be used by implementations where the module is controlled by the CPU.
CMIS testing
============
# ethtool -m swp11
Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
...
Module State : 0x03 (ModuleReady)
LowPwrAllowRequestHW : Off
LowPwrRequestSW : Off
The module is not in low power mode, as it is not forced by hardware
(LowPwrAllowRequestHW is off) or by software (LowPwrRequestSW is off).
The power mode can be queried from the kernel. In case
LowPwrAllowRequestHW was on, the kernel would need to take into account
the state of the LowPwrRequestHW signal, which is not visible to user
space.
$ ethtool --show-module swp11
Module parameters for swp11:
power-mode-policy high
power-mode high
Change the power mode policy to 'auto':
# ethtool --set-module swp11 power-mode-policy auto
Query the power mode again:
$ ethtool --show-module swp11
Module parameters for swp11:
power-mode-policy auto
power-mode low
Verify with the data read from the EEPROM:
# ethtool -m swp11
Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
...
Module State : 0x01 (ModuleLowPwr)
LowPwrAllowRequestHW : Off
LowPwrRequestSW : On
Put the associated port administratively up which will instruct the host
to transition the module to high power mode:
# ip link set dev swp11 up
Query the power mode again:
$ ethtool --show-module swp11
Module parameters for swp11:
power-mode-policy auto
power-mode high
Verify with the data read from the EEPROM:
# ethtool -m swp11
Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
...
Module State : 0x03 (ModuleReady)
LowPwrAllowRequestHW : Off
LowPwrRequestSW : Off
Put the associated port administratively down which will instruct the
host to transition the module to low power mode:
# ip link set dev swp11 down
Query the power mode again:
$ ethtool --show-module swp11
Module parameters for swp11:
power-mode-policy auto
power-mode low
Verify with the data read from the EEPROM:
# ethtool -m swp11
Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
...
Module State : 0x01 (ModuleLowPwr)
LowPwrAllowRequestHW : Off
LowPwrRequestSW : On
SFF-8636 testing
================
# ethtool -m swp13
Identifier : 0x11 (QSFP28)
...
Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled
Power set : Off
Power override : On
...
Transmit avg optical power (Channel 1) : 0.7733 mW / -1.12 dBm
Transmit avg optical power (Channel 2) : 0.7649 mW / -1.16 dBm
Transmit avg optical power (Channel 3) : 0.7790 mW / -1.08 dBm
Transmit avg optical power (Channel 4) : 0.7837 mW / -1.06 dBm
Rcvr signal avg optical power(Channel 1) : 0.9302 mW / -0.31 dBm
Rcvr signal avg optical power(Channel 2) : 0.9079 mW / -0.42 dBm
Rcvr signal avg optical power(Channel 3) : 0.8993 mW / -0.46 dBm
Rcvr signal avg optical power(Channel 4) : 0.8778 mW / -0.57 dBm
The module is not in low power mode, as it is not forced by hardware
(Power override is on) or by software (Power set is off).
The power mode can be queried from the kernel. In case Power override
was off, the kernel would need to take into account the state of the
LPMode signal, which is not visible to user space.
$ ethtool --show-module swp13
Module parameters for swp13:
power-mode-policy high
power-mode high
Change the power mode policy to 'auto':
# ethtool --set-module swp13 power-mode-policy auto
Query the power mode again:
$ ethtool --show-module swp13
Module parameters for swp13:
power-mode-policy auto
power-mode low
Verify with the data read from the EEPROM:
# ethtool -m swp13
Identifier : 0x11 (QSFP28)
Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled
Power set : On
Power override : On
...
Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm
Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm
Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm
Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm
Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm
Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm
Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm
Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm
Put the associated port administratively up which will instruct the host
to transition the module to high power mode:
# ip link set dev swp13 up
Query the power mode again:
$ ethtool --show-module swp13
Module parameters for swp13:
power-mode-policy auto
power-mode high
Verify with the data read from the EEPROM:
# ethtool -m swp13
Identifier : 0x11 (QSFP28)
...
Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled
Power set : Off
Power override : On
...
Transmit avg optical power (Channel 1) : 0.7934 mW / -1.01 dBm
Transmit avg optical power (Channel 2) : 0.7859 mW / -1.05 dBm
Transmit avg optical power (Channel 3) : 0.7885 mW / -1.03 dBm
Transmit avg optical power (Channel 4) : 0.7985 mW / -0.98 dBm
Rcvr signal avg optical power(Channel 1) : 0.9325 mW / -0.30 dBm
Rcvr signal avg optical power(Channel 2) : 0.9034 mW / -0.44 dBm
Rcvr signal avg optical power(Channel 3) : 0.9086 mW / -0.42 dBm
Rcvr signal avg optical power(Channel 4) : 0.8885 mW / -0.51 dBm
Put the associated port administratively down which will instruct the
host to transition the module to low power mode:
# ip link set dev swp13 down
Query the power mode again:
$ ethtool --show-module swp13
Module parameters for swp13:
power-mode-policy auto
power-mode low
Verify with the data read from the EEPROM:
# ethtool -m swp13
Identifier : 0x11 (QSFP28)
...
Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled
Power set : On
Power override : On
...
Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm
Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm
Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm
Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm
Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm
Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm
Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm
Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Benjamin Coddington [Wed, 6 Oct 2021 17:20:44 +0000 (13:20 -0400)]
NFSD: Keep existing listeners on portlist error
If nfsd has existing listening sockets without any processes, then an error
returned from svc_create_xprt() for an additional transport will remove
those existing listeners. We're seeing this in practice when userspace
attempts to create rpcrdma transports without having the rpcrdma modules
present before creating nfsd kernel processes. Fix this by checking for
existing sockets before calling nfsd_destroy().
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stefan Assmann [Tue, 24 Aug 2021 10:06:39 +0000 (12:06 +0200)]
iavf: fix double unlock of crit_lock
The crit_lock mutex could be unlocked twice as reported here
https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-
20210823/025525.html
Remove the superfluous unlock. Technically the problem was already
present before
5ac49f3c2702 as that commit only replaced the locking
primitive, but no functional change.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes:
5ac49f3c2702 ("iavf: use mutexes for locking of critical sections")
Fixes:
bac8486116b0 ("iavf: Refactor the watchdog state machine")
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Sylwester Dziedziuch [Fri, 24 Sep 2021 09:40:41 +0000 (11:40 +0200)]
i40e: Fix freeing of uninitialized misc IRQ vector
When VSI set up failed in i40e_probe() as part of PF switch set up
driver was trying to free misc IRQ vectors in
i40e_clear_interrupt_scheme and produced a kernel Oops:
Trying to free already-free IRQ 266
WARNING: CPU: 0 PID: 5 at kernel/irq/manage.c:1731 __free_irq+0x9a/0x300
Workqueue: events work_for_cpu_fn
RIP: 0010:__free_irq+0x9a/0x300
Call Trace:
? synchronize_irq+0x3a/0xa0
free_irq+0x2e/0x60
i40e_clear_interrupt_scheme+0x53/0x190 [i40e]
i40e_probe.part.108+0x134b/0x1a40 [i40e]
? kmem_cache_alloc+0x158/0x1c0
? acpi_ut_update_ref_count.part.1+0x8e/0x345
? acpi_ut_update_object_reference+0x15e/0x1e2
? strstr+0x21/0x70
? irq_get_irq_data+0xa/0x20
? mp_check_pin_attr+0x13/0xc0
? irq_get_irq_data+0xa/0x20
? mp_map_pin_to_irq+0xd3/0x2f0
? acpi_register_gsi_ioapic+0x93/0x170
? pci_conf1_read+0xa4/0x100
? pci_bus_read_config_word+0x49/0x70
? do_pci_enable_device+0xcc/0x100
local_pci_probe+0x41/0x90
work_for_cpu_fn+0x16/0x20
process_one_work+0x1a7/0x360
worker_thread+0x1cf/0x390
? create_worker+0x1a0/0x1a0
kthread+0x112/0x130
? kthread_flush_work_fn+0x10/0x10
ret_from_fork+0x1f/0x40
The problem is that at that point misc IRQ vectors
were not allocated yet and we get a call trace
that driver is trying to free already free IRQ vectors.
Add a check in i40e_clear_interrupt_scheme for __I40E_MISC_IRQ_REQUESTED
PF state before calling i40e_free_misc_vector. This state is set only if
misc IRQ vectors were properly initialized.
Fixes:
c17401a1dd21 ("i40e: use separate state bit for miscellaneous IRQ setup")
Reported-by: PJ Waskiewicz <pwaskiewicz@jumptrading.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jiri Benc [Tue, 14 Sep 2021 08:54:42 +0000 (10:54 +0200)]
i40e: fix endless loop under rtnl
The loop in i40e_get_capabilities can never end. The problem is that
although i40e_aq_discover_capabilities returns with an error if there's
a firmware problem, the returned error is not checked. There is a check for
pf->hw.aq.asq_last_status but that value is set to I40E_AQ_RC_OK on most
firmware problems.
When i40e_aq_discover_capabilities encounters a firmware problem, it will
encounter the same problem on its next invocation. As the result, the loop
becomes endless. We hit this with I40E_ERR_ADMIN_QUEUE_TIMEOUT but looking
at the code, it can happen with a range of other firmware errors.
I don't know what the correct behavior should be: whether the firmware
should be retried a few times, or whether pf->hw.aq.asq_last_status should
be always set to the encountered firmware error (but then it would be
pointless and can be just replaced by the i40e_aq_discover_capabilities
return value). However, the current behavior with an endless loop under the
rtnl mutex(!) is unacceptable and Intel has not submitted a fix, although we
explained the bug to them 7 months ago.
This may not be the best possible fix but it's better than hanging the whole
system on a firmware bug.
Fixes:
56a62fc86895 ("i40e: init code and hardware support")
Tested-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Vitaly Kuznetsov [Wed, 6 Oct 2021 12:50:16 +0000 (14:50 +0200)]
x86/hyperv: Avoid erroneously sending IPI to 'self'
__send_ipi_mask_ex() uses an optimization: when the target CPU mask is
equal to 'cpu_present_mask' it uses 'HV_GENERIC_SET_ALL' format to avoid
converting the specified cpumask to VP_SET. This case was overlooked when
'exclude_self' parameter was added. As the result, a spurious IPI to
'self' can be send.
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Fixes:
dfb5c1e12c28 ("x86/hyperv: remove on-stack cpumask from hv_send_ipi_mask_allbutself")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20211006125016.941616-1-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Arnd Bergmann [Wed, 6 Oct 2021 15:36:33 +0000 (17:36 +0200)]
Merge tag 'imx-fixes-5.15-2' of git://git./linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 5.15, round 2:
- A couple of fixes from Haibo Chen to update SPI NOR TX bus width for
i.MX6 and i.MX8 boards. This becomes necessary because spi-nor driver
starts using the setting in DT.
- Mark buck2 always-on for i.MX8MM Kontron-n801x-som board to avoid the
core supply being turned off unexpectedly.
- Fix eSDHC2 device tree settings for LS1028A SoC.
- Disable GIC CPU interface before calling stby-poweroff sequence to fix
power-off failure on i.MX6.
- Fix M2_RST# GPIO pinmux on i.MX8M venice-gw7902 boards.
* tag 'imx-fixes-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
arm64: dts: imx8m*-venice-gw7902: fix M2_RST# gpio
ARM: imx6: disable the GIC CPU interface before calling stby-poweroff sequence
arm64: dts: ls1028a: fix eSDHC2 node
arm64: dts: imx8mm-kontron-n801x-som: do not allow to switch off buck2
arm64: dts: imx8: change the spi-nor tx
ARM: dts: imx: change the spi-nor tx
Link: https://lore.kernel.org/r/20211006125734.GA10197@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Colin Ian King [Wed, 6 Oct 2021 08:49:55 +0000 (09:49 +0100)]
qed: Fix spelling mistake "ctx_bsaed" -> "ctx_based"
There is a spelling mistake in a DP_VERBOSE message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 6 Oct 2021 07:33:47 +0000 (10:33 +0300)]
mlxsw: spectrum_buffers: silence uninitialized warning
Static checkers and runtime checkers such as KMSan will complain that
we do not initialize the last 6 bytes of "cb_priv". The caller only
uses the first two bytes so it doesn't cause a runtime issue. Still
worth fixing though.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcel Ziswiler [Wed, 6 Oct 2021 06:31:04 +0000 (08:31 +0200)]
dt-bindings: net: dsa: marvell: fix compatible in example
While the MV88E6390 switch chip exists, one is supposed to use a
compatible of "marvell,mv88e6190" for it. Fix this in the given example.
Signed-off-by: Marcel Ziswiler <marcel@ziswiler.com>
Fixes:
a3c53be55c95 ("net: dsa: mv88e6xxx: Support multiple MDIO busses")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gyeongun Kang [Wed, 6 Oct 2021 03:57:39 +0000 (03:57 +0000)]
gtp: use skb_dst_update_pmtu_no_confirm() instead of direct call
skb_dst_update_pmtu_no_confirm() is a just wrapper function of
->update_pmtu(). So, it doesn't change logic
Signed-off-by: Gyeongun Kang <kyeongun15@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jean Sacren [Wed, 6 Oct 2021 06:41:20 +0000 (00:41 -0600)]
net: tg3: fix obsolete check of !err
The err variable is checked for true or false a few lines above. When
!err is checked again, it always evaluates to true. Therefore we should
skip this check.
We should also group the adjacent statements together for readability.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 5 Oct 2021 23:11:05 +0000 (16:11 -0700)]
ionic: move filter sync_needed bit set
Move the setting of the filter-sync-needed bit to the error
case in the filter add routine to be sure we're checking the
live filter status rather than a copy of the pre-sync status.
Fixes:
969f84394604 ("ionic: sync the filters in the work task")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 6 Oct 2021 01:01:38 +0000 (18:01 -0700)]
gve: report 64bit tx_bytes counter from gve_handle_report_stats()
Each tx queue maintains a 64bit counter for bytes, there is
no reason to truncate this to 32bit (or this has not been
documented)
Fixes:
24aeb56f2d38 ("gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yangchun Fu <yangchun@google.com>
Cc: Kuo Zhao <kuozhao@google.com>
Cc: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 6 Oct 2021 00:30:30 +0000 (17:30 -0700)]
gve: fix gve_get_stats()
gve_get_stats() can report wrong numbers if/when u64_stats_fetch_retry()
returns true.
What is needed here is to sample values in temporary variables,
and only use them after each loop is ended.
Fixes:
f5cedc84a30d ("gve: Add transmit and receive support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Catherine Sullivan <csully@google.com>
Cc: Sagi Shahar <sagis@google.com>
Cc: Jon Olson <jonolson@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Luigi Rizzo <lrizzo@google.com>
Cc: Jeroen de Borst <jeroendb@google.com>
Cc: Tao Liu <xliutaox@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 5 Oct 2021 21:04:17 +0000 (14:04 -0700)]
rtnetlink: fix if_nlmsg_stats_size() under estimation
rtnl_fill_statsinfo() is filling skb with one mandatory if_stats_msg structure.
nlmsg_put(skb, pid, seq, type, sizeof(struct if_stats_msg), flags);
But if_nlmsg_stats_size() never considered the needed storage.
This bug did not show up because alloc_skb(X) allocates skb with
extra tailroom, because of added alignments. This could very well
be changed in the future to have deterministic behavior.
Fixes:
10c9ead9f3c6 ("rtnetlink: add new RTM_GETSTATS message to dump link stats")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 6 Oct 2021 14:08:12 +0000 (15:08 +0100)]
Merge branch 'RTL8366RB-enhancements'
Linus Walleij says:
====================
RTL8366RB enhancements
This patch set is a set of reasonably mature improvements
for the RTL8366RB switch, implemented after Vladimir
challenged me to dig deeper into the switch functions.
ChangeLog v4->v5:
- Drop dubious flood control patch: these registers probably
only deal with rate limiting, we will deal with this
another time if we can figure it out.
ChangeLog -> v4:
- Rebase earlier circulated patches on the now merged
VLAN set-up cleanups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Tue, 5 Oct 2021 19:47:04 +0000 (21:47 +0200)]
net: dsa: rtl8366rb: Support setting STP state
This adds support for setting the STP state to the RTL8366RB
DSA switch. This rids the following message from the kernel on
e.g. OpenWrt:
DSA: failed to set STP state 3 (-95)
Since the RTL8366RB has one STP state register per FID with
two bit per port in each, we simply loop over all the FIDs
and set the state on all of them.
Cc: Vladimir Oltean <olteanv@gmail.com>
Cc: Alvin Šipraga <alsi@bang-olufsen.dk>
Cc: Mauri Sandberg <sandberg@mailfence.com>
Cc: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Tue, 5 Oct 2021 19:47:03 +0000 (21:47 +0200)]
net: dsa: rtl8366rb: Support fast aging
This implements fast aging per-port using the special "security"
register, which will flush any learned L2 LUT entries on a port.
The vendor API just enabled setting and clearing this bit, so
we set it to age out any entries on the port and then we clear
it again.
Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Cc: Mauri Sandberg <sandberg@mailfence.com>
Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Tue, 5 Oct 2021 19:47:02 +0000 (21:47 +0200)]
net: dsa: rtl8366rb: Support disabling learning
The RTL8366RB hardware supports disabling learning per-port
so let's make use of this feature. Rename some unfortunately
named registers in the process.
Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Cc: Alvin Šipraga <alsi@bang-olufsen.dk>
Cc: Mauri Sandberg <sandberg@mailfence.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Catherine Sullivan [Wed, 6 Oct 2021 02:42:21 +0000 (19:42 -0700)]
gve: Properly handle errors in gve_assign_qpl
Ignored errors would result in crash.
Fixes:
ede3fcf5ec67f ("gve: Add support for raw addressing to the rx path")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tao Liu [Wed, 6 Oct 2021 02:42:20 +0000 (19:42 -0700)]
gve: Avoid freeing NULL pointer
Prevent possible crashes when cleaning up after unsuccessful
initializations.
Fixes:
893ce44df5658 ("gve: Add basic driver framework for Compute Engine Virtual NIC")
Signed-off-by: Tao Liu <xliutaox@google.com>
Signed-off-by: Catherine Sully <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Catherine Sullivan [Wed, 6 Oct 2021 02:42:19 +0000 (19:42 -0700)]
gve: Correct available tx qpl check
The qpl_map_size is rounded up to a multiple of sizeof(long), but the
number of qpls doesn't have to be.
Fixes:
f5cedc84a30d2 ("gve: Add transmit and receive support")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiang Wang [Mon, 4 Oct 2021 23:25:28 +0000 (23:25 +0000)]
unix: Fix an issue in unix_shutdown causing the other end read/write failures
Commit
94531cfcbe79 ("af_unix: Add unix_stream_proto for sockmap") sets
unix domain socket peer state to TCP_CLOSE in unix_shutdown. This could
happen when the local end is shutdown but the other end is not. Then,
the other end will get read or write failures which is not expected.
Fix the issue by setting the local state to shutdown.
Fixes:
94531cfcbe79 ("af_unix: Add unix_stream_proto for sockmap")
Reported-by: Casey Schaufler <casey@schaufler-ca.com>
Suggested-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211004232530.2377085-1-jiang.wang@bytedance.com
Andy Shevchenko [Fri, 1 Oct 2021 13:55:44 +0000 (16:55 +0300)]
hyper-v: Replace uuid.h with types.h
There is no user of anything in uuid.h in the hyperv.h. Replace it with
more appropriate types.h.
Fixes:
f081bbb3fd03 ("hyper-v: Remove internal types from UAPI header")
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/20211001135544.1823-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
David S. Miller [Wed, 6 Oct 2021 10:18:27 +0000 (11:18 +0100)]
Merge branch 'stmmac-eee-fix'
Wong Vee Khee says:
====================
net: stmmac: Turn off EEE on MAC link down
This patch series ensure PCS EEE is turned off on the event of MAC
link down.
Tested on Intel AlderLake-S (STMMAC + MaxLinear GPY211 PHY).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Wong Vee Khee [Tue, 5 Oct 2021 11:51:00 +0000 (19:51 +0800)]
net: stmmac: trigger PCS EEE to turn off on link down
The current implementation enable PCS EEE feature in the event of link
up, but PCS EEE feature is not disabled on link down.
This patch makes sure PCE EEE feature is disabled on link down.
Fixes:
656ed8b015f1 ("net: stmmac: fix EEE init issue when paired with EEE capable PHYs")
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wong Vee Khee [Tue, 5 Oct 2021 11:50:59 +0000 (19:50 +0800)]
net: pcs: xpcs: fix incorrect steps on disable EEE
When Energy-Efficient Ethernet(EEE) is disable from the MAC side,
we need to clear the DW_VR_MII_EEE_TRN_LPI bit of DW_VR_MII_EEE_MCTRL1
register.
Fixes:
7617af3d1a5e ("net: pcs: Introducing support for DWC xpcs Energy Efficient Ethernet")
Cc: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cristian Marussi [Thu, 16 Sep 2021 10:33:36 +0000 (11:33 +0100)]
firmware: arm_scmi: Add proper barriers to scmi virtio device
Only one single SCMI Virtio device is currently supported by this driver
and it is referenced using a static global variable which is initialized
once for all during probing and nullified at virtio device removal.
Add proper SMP barriers to protect accesses to such device reference to
ensure that the initialzation state of such device is correctly observed by
all PEs at any time.
Return -EBUSY, instead of -EINVAL, and a descriptive error message if more
than one SCMI Virtio device is ever found and probed.
Link: https://lore.kernel.org/r/20210916103336.7243-3-cristian.marussi@arm.com
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cristian Marussi [Thu, 16 Sep 2021 10:33:35 +0000 (11:33 +0100)]
firmware: arm_scmi: Simplify spinlocks in virtio transport
Remove unneeded nested irqsave/irqrestore spinlocks.
Add also a few descriptive comments to explain better the system behaviour
at shutdown time.
Link: https://lore.kernel.org/r/20210916103336.7243-2-cristian.marussi@arm.com
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Roger Quadros [Thu, 2 Sep 2021 09:58:28 +0000 (12:58 +0300)]
ARM: dts: omap3430-sdp: Fix NAND device node
Nand is on CS1 so reg properties first field should be 1 not 0.
Fixes:
44e4716499b8 ("ARM: dts: omap3: Fix NAND device nodes")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tony Lindgren [Wed, 8 Sep 2021 05:49:36 +0000 (08:49 +0300)]
bus: ti-sysc: Use CLKDM_NOAUTO for dra7 dcan1 for errata i893
Commit
94f6345712b3 ("bus: ti-sysc: Implement quirk handling for
CLKDM_NOAUTO") should have also added the quirk for dra7 dcan1 in
addition to dcan2 for errata i893 handling.
Let's also pass the quirk flag for legacy mode booting for if "ti,hwmods"
dts property is used with related dcan hwmod data. This should be only
needed if anybody needs to git bisect earlier stable trees though.
Fixes:
94f6345712b3 ("bus: ti-sysc: Implement quirk handling for CLKDM_NOAUTO")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tony Lindgren [Wed, 6 Oct 2021 04:55:44 +0000 (07:55 +0300)]
Merge branch 'pruss-fix' into fixes
Merge in a fix for pruss reset issue caused by enabling pruss for am335x.
Stephen Rothwell [Wed, 6 Oct 2021 01:23:15 +0000 (12:23 +1100)]
ethernet: fix up ps3_gelic_net.c for "ethernet: use eth_hw_addr_set()"
Another case needing a u8 * cast.
Fixes:
a96d317fb1a3 ("ethernet: use eth_hw_addr_set()")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/r/20211006122315.4e04fb87@canb.auug.org.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Bauer [Tue, 5 Oct 2021 22:54:01 +0000 (00:54 +0200)]
net: phy: at803x: add QCA9561 support
Add support for the embedded fast-ethernet PHY found on the QCA9561
WiSoC platform. It supports the usual Atheros PHY featureset including
the cable tester.
Tested on a Xiaomi MiRouter 4Q (QCA9561)
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David Bauer <mail@david-bauer.net>
Link: https://lore.kernel.org/r/20211005225401.10653-1-mail@david-bauer.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kan Liang [Thu, 30 Sep 2021 19:52:46 +0000 (12:52 -0700)]
perf tests attr: Add missing topdown metrics events
The Topdown metrics events were added as 'perf stat' default events
since commit
42641d6f4d15e6db ("perf stat: Add Topdown metrics events as
default events").
However, the perf attr tests were not updated
accordingly.
The perf attr test fails on the platform which supports Topdown metrics.
# perf test 17
17: Setup struct perf_event_attr :FAILED!
Add Topdown metrics events into perf attr test cases. Make them optional
since they are only available on newer platforms.
Fixes:
42641d6f4d15e6db ("perf stat: Add Topdown metrics events as default events")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/1633031566-176517-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Tue, 5 Oct 2021 17:52:53 +0000 (10:52 -0700)]
Merge tag 'warning-fixes-
20211005' of git://git./linux/kernel/git/dhowells/linux-fs
Pull misc fs warning fixes from David Howells:
"The first four patches fix kerneldoc warnings in fscache, afs, 9p and
nfs - they're mostly just comment changes, though there's one place in
9p where a comment got detached from the function it was attached to
(v9fs_fid_add) and has to switch places with a function that got
inserted between (__add_fid).
The patch on the end removes an unused symbol in fscache"
* tag 'warning-fixes-
20211005' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
fscache: Remove an unused static variable
fscache: Fix some kerneldoc warnings shown up by W=1
9p: Fix a bunch of kerneldoc warnings shown up by W=1
afs: Fix kerneldoc warning shown up by W=1
nfs: Fix kerneldoc warning shown up by W=1
Arnaldo Carvalho de Melo [Wed, 12 Feb 2020 14:04:23 +0000 (11:04 -0300)]
tools include UAPI: Sync sound/asound.h copy with the kernel sources
Picking the changes from:
09d23174402da0f1 ("ALSA: rawmidi: introduce SNDRV_RAWMIDI_IOCTL_USER_PVERSION")
Which entails no changes in the tooling side as it doesn't introduce new
SNDRV_PCM_IOCTL_ ioctls.
To silence this perf tools build warning:
Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Branislav Rankov [Wed, 21 Jul 2021 10:32:58 +0000 (11:32 +0100)]
perf build: Fix plugin static linking with libopencsd on ARM and ARM64
Filter out -static flag when building plugins as they are always built
as dynamic libraries and -static and -dynamic don't work well together
on arm and arm64.
Signed-off-by: Branislav Rankov <branislav.rankov@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Brown <Mark.Brown@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: coresight@lists.linaro.org
Cc: nd@arm.com
Link: https://lore.kernel.org/r/e88952b3-2470-da96-dee9-e247a1759cd0@arm.com
Signed-off-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Branislav Rankov [Wed, 21 Jul 2021 10:32:58 +0000 (11:32 +0100)]
perf build: Add missing -lstdc++ when linking with libopencsd
Add -lstdc++ to perf when linking libopencsd as it is a dependency. It
does not hurt to add it when dynamic linking.
Signed-off-by: Branislav Rankov <branislav.rankov@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Brown <Mark.Brown@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: coresight@lists.linaro.org
Cc: nd@arm.com
Link: https://lore.kernel.org/r/e88952b3-2470-da96-dee9-e247a1759cd0@arm.com
Signed-off-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like Xu [Tue, 28 Sep 2021 10:29:38 +0000 (18:29 +0800)]
perf jevents: Free the sys_event_tables list after processing entries
The compiler reports that free_sys_event_tables() is dead code.
But according to the semantics, the "LIST_HEAD(sys_event_tables)" should
also be released, just like we do with 'arch_std_events' in main().
Fixes:
e9d32c1bf0cd7a98 ("perf vendor events: Add support for arch standard events")
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210928102938.69681-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>