review.tizen.org Git - platform/kernel/linux-starfive.git/log

net: pcs: lynx: consolidate sgmii and 1000base-x config code

Consolidate lynx_pcs_config_1000basex() and lynx_pcs_config_sgmii() into
a single function. The differences between these two are:

- The value that the link timer is set to.
- The value of the IF_MODE register.

Everything else is identical.

This patch depends on "net: phylink: add QSGMII support to
phylink_mii_c22_pcs_encode_advertisement()".

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phylink: add QSGMII support to phylink_mii_c22_pcs_encode_advertisement()

The QSGMII MAC-to-PHY reply is the same as the SGMII MAC-to-PHY reply.
Add support for this to phylink_mii_c22_pcs_encode_advertisement().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: lan743x: Use correct variable in lan743x_sgmii_config()

There is a copy and paste bug in lan743x_sgmii_config() so it checks
if (ret < 0) instead of if (mii_ctl < 0).

Fixes: 46b777ad9a8c ("net: lan743x: Add support to SGMII 1G and 2.5G")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/YrRry7K66BzKezl8@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'mlxsw-unified-bridge-conversion-part-3'

Ido Schimmel says:

====================
mlxsw: Unified bridge conversion - part 3/6

This is the third part of the conversion of mlxsw to the unified bridge
model.

Like the second part, this patchset does not begin the conversion, but
instead prepares the FID code for it. The individual changes are
relatively small and self-contained with detailed description and
motivation in the commit message.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Change mlxsw_sp_rif_vlan_fid_op() to be dedicated for FID RIFs

The function was designed to configure both VLAN and FID RIFs, but
currently the driver does not use VLAN RIFs. Instead, it emulates VLAN
RIFs using FID RIFs.

As part of the conversion to the unified bridge model, the driver will
need to use VLAN RIFs, but they will be configured differently from FID
RIFs.

As a preparation for this change, rename the function to reflect the
fact that it is specific to FID RIFs and do not pass the RIF type as an
argument.

This leaves mlxsw_reg_ritr_fid_set() unused, so remove it.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Rename MLXSW_SP_RIF_TYPE_VLAN

Currently, the driver emulates 802.1Q FIDs using 802.1D FIDs. As such,
the RIFs configured on top of these FIDs are FID RIFs and not VLAN RIFs.

As part of converting the driver to the unified bridge model, 802.1Q
FIDs and VLAN RIFs will be used.

As a preparation for this change, rename the emulated VLAN RIFs from
'MLXSW_SP_RIF_TYPE_VLAN' to 'MLXSW_SP_RIF_TYPE_VLAN_EMU'. After the
conversion the emulated VLAN RIFs will be removed.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Use different arrays of FID families per-ASIC type

Egress VID for layer 2 multicast is determined from two tables, the MPE
and PGT tables. The MPE table is a two dimensional table indexed by local
port and SMPE index, which should be thought of as a FID index.

In Spectrum-1 the SMPE index is derived from the PGT entry, whereas in
Spectrum-2 and newer ASICs the SMPE index is a FID attribute configured
via the SFMR register.

The validity of the SMPE index in SFMR is influenced from two factors:
1. FID family. SMPE index is reserved for rFIDs, as their flooding is
handled by firmware.
2. ASIC generation. SMPE index is always reserved for Spectrum-1.

As such, the validity of the SMPE index should be an attribute of the FID
family and have different arrays of FID families per-ASIC type.

As a preparation for SMPE index configuration, create separate arrays of
FID families for different ASICs.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_fid: Pass FID structure to __mlxsw_sp_fid_port_vid_map()

The function configures {Port, VID}->FID classification entries using
the SVFA register. In the unified bridge model such entries will need to
be programmed with an ingress RIF parameter, which is a FID attribute.

As a preparation for this change, pass the FID structure itself to the
function.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_fid: Pass FID structure to mlxsw_sp_fid_op()

The function gets several arguments derived from the FID structure
itself. In the future, it will need to be extended to configure
additional FID attributes.

Prepare for that change and reduce the arguments list by passing the FID
structure itself.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_fid: Rename mlxsw_sp_fid_vni_op()

After the previous patch, all the callers of the function pass arguments
extracted from the FID structure itself. Reduce the arguments list by
simply passing the FID structure itself.

This makes the function more generic as it can be easily extended to
edit any FID attributes. Rename it to mlxsw_sp_fid_edit_op() to reflect
that.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_fid: Update FID structure prior to device configuration

Currently, the only FID attributes that are edited after FID creation
are its VNI and NVE tunnel flood pointer. This is achieved by eventually
invoking mlxsw_sp_fid_vni_op() with an updated set of arguments.

In the future, more FID attributes will need to be edited, such as the
ingress RIF configured on top of the FID.

Therefore, it makes sense to encapsulate all the FID edit logic in one
function that will perform the edit based on an updated FID structure.

To that end, update the FID structure before invoking the various edit
operations that eventually call into mlxsw_sp_fid_vni_op(). Use the
updated structure as the sole argument of the edit operations.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_fid: Maintain {port, VID}->FID mappings

In the unified bridge model, FID classification mappings (e.g., {Port,
VID}->FID) and layer 3 egress VID classification mappings (i.e., {eRIF,
ePort}->VID) will need to be updated when a RIF is configured on top of
a FID. This requires the driver to be aware of all the {Port, VID} pairs
mapped to a FID.

To that end, extend the FID structure with a linked list of {Port, VID}
pairs. Add an entry to the list when a {Port, VID} is mapped to a FID
and remove it upon unmap.

Keep the list sorted by local port as it will be useful for {eRIF,
ePort}->VID mappings via REIV register in the future.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ipmr-remove-rwlocks'

Eric Dumazet says:

====================
ipmr: get rid of rwlocks

We need to get rid of rwlocks in networking stacks,
if read_lock() is (ab)used from softirq context.

As discussed recently [1], rwlock are unfair by design in this case,
and writers can starve and trigger soft lockups.

This series convert ipmr code (both IPv4 and IPv6 families)
to RCU and spinlocks.

[1] https://lkml.org/lkml/2022/6/17/272

v2: fixed two typos, and resent because patch 19/19
did not make it to patchwork.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ip6mr: convert mrt_lock to a spinlock

mrt_lock is only held in write mode, from process context only.

We can switch to a mere spinlock, and avoid blocking BH.

Also, vif_dev_read() is always called under standard rcu_read_lock().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: convert mrt_lock to a spinlock

mrt_lock is only held in write mode, from process context only.

We can switch to a mere spinlock, and avoid blocking BH.

Also, vif_dev_read() is always called under standard rcu_read_lock().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: convert /proc handlers to rcu_read_lock()

We can use standard rcu_read_lock(), to get rid
of last read_lock(&mrt_lock) call points.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: adopt rcu_read_lock() in mr_dump()

We no longer need to acquire mrt_lock() in mr_dump,
using rcu_read_lock() is enough.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6mr: switch ip6mr_get_route() to rcu_read_lock()

Like ipmr_get_route(), we can use standard RCU here.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6mr: do not acquire mrt_lock while calling ip6_mr_forward()

ip6_mr_forward() uses standard RCU protection already.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6mr: do not acquire mrt_lock before calling ip6mr_cache_unresolved

rcu_read_lock() protection is good enough.

ip6mr_cache_unresolved() uses a dedicated spinlock (mfc_unres_lock)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6mr: do not acquire mrt_lock in ioctl(SIOCGETMIFCNT_IN6)

rcu_read_lock() protection is good enough.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6mr: do not acquire mrt_lock in pim6_rcv()

rcu_read_lock() protection is more than enough.

vif_dev_read() supports either mrt_lock or rcu_read_lock().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6mr: ip6mr_cache_report() changes

ip6mr_cache_report() first argument can be marked const, and we change
the caller convention about which lock needs to be held.

Instead of read_lock(&mrt_lock), we can use rcu_read_lock().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: do not acquire mrt_lock in ipmr_get_route()

mr_fill_mroute() uses standard rcu_read_unlock(),
no need to grab mrt_lock anymore.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: do not acquire mrt_lock while calling ip_mr_forward()

ip_mr_forward() uses standard RCU protection already.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: do not acquire mrt_lock before calling ipmr_cache_unresolved()

rcu_read_lock() protection is good enough.

ipmr_cache_unresolved() uses a dedicated spinlock (mfc_unres_lock)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: do not acquire mrt_lock in ioctl(SIOCGETVIFCNT)

rcu_read_lock() protection is good enough.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: do not acquire mrt_lock in __pim_rcv()

rcu_read_lock() protection is more than enough.

vif_dev_read() supports either mrt_lock or rcu_read_lock().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: ipmr_cache_report() changes

ipmr_cache_report() first argument can be marked const, and we change
the caller convention about which lock needs to be held.

Instead of read_lock(&mrt_lock), we can use rcu_read_lock().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: change igmpmsg_netlink_event() prototype

igmpmsg_netlink_event() first argument can be marked const.

igmpmsg_netlink_event() reads mrt->net and mrt->id,
both being set once in mr_table_alloc().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipmr: add rcu protection over (struct vif_device)->dev

We will soon use RCU instead of rwlock in ipmr & ip6mr

This preliminary patch adds proper rcu verbs to read/write
(struct vif_device)->dev

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6mr: do not get a device reference in pim6_rcv()

pim6_rcv() is called under rcu_read_lock(), there is
no need to use dev_hold()/dev_put() pair.

IPv4 side was handled in commit 55747a0a73ea
("ipmr: __pim_rcv() is called under rcu_read_lock")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dsa-microchip-common-spi-probe'

Arun Ramadoss says:

====================
net: dsa: microchip: common spi probe for the ksz series switches - part 2

This patch series aims to refactor the ksz_switch_register routine to have the
common flow for the ksz series switch. And this is the follow up patch series.

First, it tries moves the common implementation in the setup from individual
files to ksz_setup. Then implements the common dsa_switch_ops structure instead
of independent registration. And then moves the ksz_dev_ops to ksz_common.c,
it allows the dynamic detection of which ksz_dev_ops to be used based on
the switch detection function.

Finally, the patch updates the ksz_spi probe function to be same for all the
ksz_switches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: common ksz_spi_probe for ksz switches

As of now, there are two spi probes, one ksz8795_spi.c and other
ksz9477_spi.c. This patch combines two files into single ksz_spi.c. The
difference between the two are regmap config and struct ksz8. The regmap
config is assigned based on the platform data. And struct ksz8 is left
untouched, as it is used only ksz8795.c. It can be used for all
other switches also in future.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: remove the ksz8/ksz9477_switch_register

This patch delete the ksz8_switch_register and ksz9477_switch_register
since both are calling the ksz_switch_register function. Instead the
ksz_switch_register is called from the probe function.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: move ksz_dev_ops to ksz_common.c

This patch move the ksz_dev_ops from individual files to ksz_common.c.
And the dev_ops is assigned to ksz_device based on the switch detect.
This reduces the redundant function and allows to reuse the
functionality for LAN937x which has similar register set.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: common menuconfig for ksz series switch

This patch replaces the two different menuconfig for ksz9477 and ksz8795
to single ksz_common. so that it can be extended for the other switch
like lan937x. And removes the export_symbols for the extern functions in
the ksz_common.h.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: ksz9477: separate phylink mode from switch register

As per 'commit 3506b2f42dff ("net: dsa: microchip: call
phy_remove_link_mode during probe")' phy_remove_link_mode is added in
the switch_register function after dsa_switch_register. In order to have
the common switch register function, moving this phylink validation to
phylink_get_caps validation hook.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: common dsa_switch_ops for ksz switches

At present, ksz8795.c and ksz9477.c have separate dsa_switch_ops
structure initialization. This patch modifies the files such a way that
ksz switches has common dsa_switch_ops in the ksz_common.c file.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: move start of switch to ksz_setup

This patch move the setting the start bit from the individual switch
configuration to ksz_setup

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: move multicast enable to ksz_setup

This patch moves the enabling the multicast storm protection from
individual setup function to ksz_setup function.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: move broadcast rate limit to ksz_setup

This patch move the 10% broadcast protection from the individual setup
to ksz_setup. In the ksz9477, broadcast protection is updated in reset
function.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: move setup function to ksz_common

This patch move the common initialization of switches to ksz_setup and
perform the switch specific initialization using the ksz_dev_ops
function pointer.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: add the enable_stp_addr pointer in ksz_dev_ops

In order to transmit the STP BPDU packet to the CPU port, the STP
address 01-80-c2-00-00-00 has to be added to static alu table for
ksz8795 series switch. For the ksz9477 switch, there is reserved
multicast table which handles forwarding the particular set of
multicast address to cpu port. So enabling the multicast reserved table
and updated the cpu port index. The stp addr is enabled during the setup
phase using the enable_stp_addr pointer in struct ksz_dev_ops.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: add config_cpu_port to struct ksz_dev_ops

To have the common set of initialization in ksz_setup, introduced the
new config_cpu_port member to ksz_dev_ops. Since both the ksz8795.c and
ksz9477.c configuring the cpu port in the setup function, introduced the
member.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: rename shutdown to reset in ksz_dev_ops

This patch renames the shutdown to reset in ksz_dev_ops in order to use
the reset dev_ops in the ksz_setup.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bonding-per-port-priorities'

Hangbin Liu says:

====================
Bonding: add per-port priority support

This patch set add per-port priority for bonding failover re-selection.

The first patch add a new filed for bond_opt_value so we can set slave
value easier. I will update the bond_option_queue_id_set() setting
in later patch.

The second patch add the per-port priority for bonding. I defined
it as s32 to compatible with team prio option, which also use a s32
value.

v3: store slave_dev in bond_opt_value directly to simplify setting
values for slave.

v2: using the extant bonding options management stuff instead setting
slave prio in bond_slave_changelink() directly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Bonding: add per-port priority for failover re-selection

Add per port priority support for bonding active slave re-selection during
failover. A higher number means higher priority in selection. The primary
slave still has the highest priority. This option also follows the
primary_reselect rules.

This option could only be configured via netlink.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: add slave_dev field for bond_opt_value

Currently, bond_opt_value are mostly used for bonding option settings. If
we want to set a value for slave, we need to re-alloc a string to store
both slave name and vlaue, like bond_option_queue_id_set() does, which
is complex and dumb.

As Jon suggested, let's add a union field slave_dev for bond_opt_value,
which will be benefit for future slave option setting. In function
__bond_opt_init(), we will always check the extra field and set it
if it's not NULL.

Suggested-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4/cxgb4vf: Fix typo in comments

Remove the repeated word 'and' from comments

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Link: https://lore.kernel.org/r/20220622144841.21274-1-jiangjian@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bnxt: Fix typo in comments

Remove the repeated word 'and' from comments

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Link: https://lore.kernel.org/r/20220622144526.20659-1-jiangjian@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: mxl-gpy: add temperature sensor

The GPY115 and GPY2xx PHYs contain an integrated temperature sensor. It
accuracy is +/- 5°C. Add support for it.

Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220622141716.3517645-1-michael@walle.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-use-new-hwmon_sanitize_name'

Michael Walle says:

====================
net: use new hwmon_sanitize_name()

These are the remaining patches of my former series [1] which were hold
back because it would have required a stable branch between the subsystems.

[1] https://lore.kernel.org/r/20220405092452.4033674-1-michael@walle.cc/
====================

Link: https://lore.kernel.org/r/20220622123543.3463209-1-michael@walle.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: nxp-tja11xx: use devm_hwmon_sanitize_name()

Instead of open-coding the bad characters replacement in the hwmon name,
use the new devm_hwmon_sanitize_name().

Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: sfp: use hwmon_sanitize_name()

Instead of open-coding the bad characters replacement in the hwmon name,
use the new hwmon_sanitize_name().

Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'broadcom-ptp-phy-support'

Jonathan Lemon says:

====================
Broadcom PTP PHY support

This adds PTP support for the Broadcom PHY BCM54210E (and the
specific variant BCM54213PE that the rpi-5.15 branch uses).

This has only been tested on the RPI CM4, which has one port.

There are other Broadcom chips which may benefit from using the
same framework here, although with different register sets.
====================

Link: https://lore.kernel.org/r/20220622050454.878052-1-jonathan.lemon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: Add support for 1PPS out and external timestamps

The perout function is used to generate a 1PPS signal, synchronized
to the PHC. This is accomplished by a using the hardware oneshot
functionality, which is reset by a timer.

The external timestamp function is set up for a 1PPS input pulse,
and uses a timer to poll for temestamps.

Both functions use the SYNC_OUT/SYNC_IN1 pin, so cannot run
simultaneously.

Co-developed-by: Lasse Johnsen <l@ssejohnsen.me>
Signed-off-by: Lasse Johnsen <l@ssejohnsen.me>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: broadcom: Add PTP support for some Broadcom PHYs.

This adds PTP support for BCM54210E Broadcom PHYs, in particular,
the BCM54213PE, as used in the Rasperry PI CM4. It has only been
tested on that hardware.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: broadcom: Add Broadcom PTP hooks to bcm-phy-lib

Add 'struct bcm_ptp_private' to bcm54xx_phy_priv which points to
an optional PTP structure attached to the PHY. This is allocated
on probe if PHY PTP support is configured, and if the driver supports
PTP for the specified PHY.

Add the bcm_ptp_probe(), bcm_ptp_config_init() and bcn_ptp_stop()
API functions to the bcm-phy library.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-dsa-mv88e6xxx-get-rid-of-speed_max'

Russell King says:

====================
net: dsa: mv88e6xxx: get rid of SPEED_MAX

This series does two things:

1. it gets rid of mv88e6065_port_set_speed_duplex() which is completely
   unused (do we support this device? I couldn't find it in the tables
   in chip.c) This has a max speed of 200Mbps which we don't support.

2. get rid of the SPEED_MAX constant, which is used to configure a DSA
   or CPU port to their maximum speed during initialisation. We no
   longer need this as we can derive the maximum port speed from the
   mac_capabilities instead.

The reason for making this change is in preparation for phylink to be
used by DSA for CPU ports. This omission has come back to bite us with
the conversion of DSA drivers to phylink_pcs, since phylink_pcs won't
get used unless phylink is being used. Particularly with this driver,
it is very common for DT descriptions to omit the fixed-link details
which means "use maximum speed".

It will eventually be necessary to hoist the selection of "max speed"
into the DSA layer (trivial) and also have a way for the DSA driver
to tell the DSA layer which interface it should be using for these
ports.
====================

Link: https://lore.kernel.org/r/YrGQBssOvQBZiDS4@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: mv88e6xxx: get rid of SPEED_MAX setting

Currently, all the device specific speed setting functions convert
SPEED_MAX to the actual speed of the port. Rather than having each
of the mv88e6xxx chip specifics handling SPEED_MAX, derive it from
the mac_capabilities instead.

This is only needed for CPU and DSA ports, so move the logic up into
mv88e6xxx_setup_port() - which allows us to kill off all users of
SPEED_MAX throughout the driver.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: mv88e6xxx: remove mv88e6065 dead code

Remove mv88e6065_port_set_speed_duplex() - this is never called, and
thus is completely redundant.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge git://git./linux/kernel/git/netdev/net

No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-5.19-rc4' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
"Including fixes from bpf and netfilter.

  Current release - regressions:

   - netfilter: cttimeout: fix slab-out-of-bounds read in
     cttimeout_net_exit

Current release - new code bugs:

   - bpf: ftrace: keep address offset in ftrace_lookup_symbols

   - bpf: force cookies array to follow symbols sorting

  Previous releases - regressions:

   - ipv4: ping: fix bind address validity check

   - tipc: fix use-after-free read in tipc_named_reinit

   - eth: veth: add updating of trans_start

  Previous releases - always broken:

   - sock: redo the psock vs ULP protection check

   - netfilter: nf_dup_netdev: fix skb_under_panic

   - bpf: fix request_sock leak in sk lookup helpers

   - eth: igb: fix a use-after-free issue in igb_clean_tx_ring

   - eth: ice: prohibit improper channel config for DCB

   - eth: at803x: fix null pointer dereference on AR9331 phy

   - eth: virtio_net: fix xdp_rxq_info bug after suspend/resume

  Misc:

   - eth: hinic: replace memcpy() with direct assignment"

* tag 'net-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
  net: openvswitch: fix parsing of nw_proto for IPv6 fragments
  sock: redo the psock vs ULP protection check
  Revert "net/tls: fix tls_sk_proto_close executed repeatedly"
  virtio_net: fix xdp_rxq_info bug after suspend/resume
  igb: Make DMA faster when CPU is active on the PCIe link
  net: dsa: qca8k: reduce mgmt ethernet timeout
  net: dsa: qca8k: reset cpu port on MTU change
  MAINTAINERS: Add a maintainer for OCP Time Card
  hinic: Replace memcpy() with direct assignment
  Revert "drivers/net/ethernet/neterion/vxge: Fix a use-after-free bug in vxge-main.c"
  net: phy: smsc: Disable Energy Detect Power-Down in interrupt mode
  ice: ethtool: Prohibit improper channel config for DCB
  ice: ethtool: advertise 1000M speeds properly
  ice: Fix switchdev rules book keeping
  ice: ignore protocol field in GTP offload
  netfilter: nf_dup_netdev: add and use recursion counter
  netfilter: nf_dup_netdev: do not push mac header a second time
  selftests: netfilter: correct PKTGEN_SCRIPT_PATHS in nft_concat_range.sh
  net/tls: fix tls_sk_proto_close executed repeatedly
  erspan: do not assume transport header is always set
  ...

Merge tag 'mmc-v5.19-rc2' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:

- mtk-sd: Fix dma hang issues

- sdhci-pci-o2micro: Fix card detect by dealing with debouncing

* tag 'mmc-v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: mediatek: wait dma stop bit reset to 0
mmc: sdhci-pci-o2micro: Fix card detect by dealing with debouncing

Merge tag 'sound-5.19-rc4' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"All small changes, mostly device-specific:

   - A regression fix for PCM WC-page allocation on x86

   - A regression fix for i915 audio component binding

   - Fixes for (longstanding) beep handling bug

   - Runtime PM fixes for Intel LPE HDMI audio

   - A couple of pending FireWire fixes

   - Usual HD-audio and USB-audio quirks, new Intel dspconf entries"

* tag 'sound-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: Add quirk for Clevo NS50PU
  ALSA: hda: Fix discovery of i915 graphics PCI device
  ALSA: hda/via: Fix missing beep setup
  ALSA: hda/conexant: Fix missing beep setup
  ALSA: memalloc: Drop x86-specific hack for WC allocations
  ALSA: hda/realtek: Add quirk for Clevo PD70PNT
  ALSA: x86: intel_hdmi_audio: use pm_runtime_resume_and_get()
  ALSA: x86: intel_hdmi_audio: enable pm_runtime and set autosuspend delay
  ALSA: hda: intel-nhlt: remove use of __func__ in dev_dbg
  ALSA: hda: intel-dspcfg: use SOF for UpExtreme and UpExtreme11 boards
  firewire: convert sysfs sprintf/snprintf family to sysfs_emit
  firewire: cdev: fix potential leak of kernel stack due to uninitialized value
  ALSA: hda/realtek: Apply fixup for Lenovo Yoga Duet 7 properly
  ALSA: hda/realtek - ALC897 headset MIC no sound
  ALSA: usb-audio: US16x08: Move overflow check before array access
  ALSA: hda/realtek: Add mute LED quirk for HP Omen laptop

nfp: add 'ethtool --identify' support

Add support for ethtool -p|--identify
by enabling blinking of the panel LED if supported by the NIC firmware.

Signed-off-by: Sixiang Chen <sixiang.chen@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20220622083938.291548-1-simon.horman@corigine.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: openvswitch: fix parsing of nw_proto for IPv6 fragments

When a packet enters the OVS datapath and does not match any existing
flows installed in the kernel flow cache, the packet will be sent to
userspace to be parsed, and a new flow will be created. The kernel and
OVS rely on each other to parse packet fields in the same way so that
packets will be handled properly.

As per the design document linked below, OVS expects all later IPv6
fragments to have nw_proto=44 in the flow key, so they can be correctly
matched on OpenFlow rules. OpenFlow controllers create pipelines based
on this design.

This behavior was changed by the commit in the Fixes tag so that
nw_proto equals the next_header field of the last extension header.
However, there is no counterpart for this change in OVS userspace,
meaning that this field is parsed differently between OVS and the
kernel. This is a problem because OVS creates actions based on what is
parsed in userspace, but the kernel-provided flow key is used as a match
criteria, as described in Documentation/networking/openvswitch.rst. This
leads to issues such as packets incorrectly matching on a flow and thus
the wrong list of actions being applied to the packet. Such changes in
packet parsing cannot be implemented without breaking the userspace.

The offending commit is partially reverted to restore the expected
behavior.

The change technically made sense and there is a good reason that it was
implemented, but it does not comply with the original design of OVS.
If in the future someone wants to implement such a change, then it must
be user-configurable and disabled by default to preserve backwards
compatibility with existing OVS versions.

Cc: stable@vger.kernel.org
Fixes: fa642f08839b ("openvswitch: Derive IP protocol number for IPv6 later frags")
Link: https://docs.openvswitch.org/en/latest/topics/design/#fragments
Signed-off-by: Rosemarie O'Riorden <roriorden@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/20220621204845.9721-1-roriorden@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

sock: redo the psock vs ULP protection check

Commit 8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
has moved the inet_csk_has_ulp(sk) check from sk_psock_init() to
the new tcp_bpf_update_proto() function. I'm guessing that this
was done to allow creating psocks for non-inet sockets.

Unfortunately the destruction path for psock includes the ULP
unwind, so we need to fail the sk_psock_init() itself.
Otherwise if ULP is already present we'll notice that later,
and call tcp_update_ulp() with the sk_proto of the ULP
itself, which will most likely result in the ULP looping
its callbacks.

Fixes: 8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20220620191353.1184629-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Revert "net/tls: fix tls_sk_proto_close executed repeatedly"

This reverts commit 69135c572d1f84261a6de2a1268513a7e71753e2.

This commit was just papering over the issue, ULP should not
get ->update() called with its own sk_prot. Each ULP would
need to add this check.

Fixes: 69135c572d1f ("net/tls: fix tls_sk_proto_close executed repeatedly")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20220620191353.1184629-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2022-06-21

This series contains updates to i40e driver only.

Mateusz adds support for using the speed option in ethtool.

Minghao Chi removes unneeded synchronize_irq() calls.

Bernard Zhao removes unneeded NULL check.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  intel/i40e: delete if NULL check before dev_kfree_skb
  i40e: Remove unnecessary synchronize_irq() before free_irq()
  i40e: Add support for ethtool -s <interface> speed <speed in Mb>
====================

Link: https://lore.kernel.org/r/20220621225930.632741-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

virtio_net: fix xdp_rxq_info bug after suspend/resume

The following sequence currently causes a driver bug warning
when using virtio_net:

  # ip link set eth0 up
  # echo mem > /sys/power/state (or e.g. # rtcwake -s 10 -m mem)
  <resume>
  # ip link set eth0 down

  Missing register, driver bug
  WARNING: CPU: 0 PID: 375 at net/core/xdp.c:138 xdp_rxq_info_unreg+0x58/0x60
  Call trace:
   xdp_rxq_info_unreg+0x58/0x60
   virtnet_close+0x58/0xac
   __dev_close_many+0xac/0x140
   __dev_change_flags+0xd8/0x210
   dev_change_flags+0x24/0x64
   do_setlink+0x230/0xdd0
   ...

This happens because virtnet_freeze() frees the receive_queue
completely (including struct xdp_rxq_info) but does not call
xdp_rxq_info_unreg(). Similarly, virtnet_restore() sets up the
receive_queue again but does not call xdp_rxq_info_reg().

Actually, parts of virtnet_freeze_down() and virtnet_restore_up()
are almost identical to virtnet_close() and virtnet_open(): only
the calls to xdp_rxq_info_(un)reg() are missing. This means that
we can fix this easily and avoid such problems in the future by
just calling virtnet_close()/open() from the freeze/restore handlers.

Aside from adding the missing xdp_rxq_info calls the only difference
is that the refill work is only cancelled if netif_running(). However,
this should not make any functional difference since the refill work
should only be active if the network interface is actually up.

Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20220621114845.3650258-1-stephan.gerhold@kernkonzept.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-06-21

This series contains updates to ice driver only.

Marcin fixes GTP filters by allowing ignoring of the inner ethertype field.

Wojciech adds VSI handle tracking in order to properly distinguish similar
filters for removal.

Anatolii removes ability to set 1000baseT and 1000baseX fields
concurrently which caused link issues. He also disallows setting
channels to less than the number of Traffic Classes which would cause
NULL pointer dereference.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: ethtool: Prohibit improper channel config for DCB
  ice: ethtool: advertise 1000M speeds properly
  ice: Fix switchdev rules book keeping
  ice: ignore protocol field in GTP offload
====================

Link: https://lore.kernel.org/r/20220621224756.631765-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

raw: remove unused variables from raw6_icmp_error()

saddr and daddr are set but not used.

Fixes: ba44f8182ec2 ("raw: use more conventional iterators")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/r/20220622032303.159394-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

igb: Make DMA faster when CPU is active on the PCIe link

Intel I210 on some Intel Alder Lake platforms can only achieve ~750Mbps
Tx speed via iperf. The RR2DCDELAY shows around 0x2xxx DMA delay, which
will be significantly lower when 1) ASPM is disabled or 2) SoC package
c-state stays above PC3. When the RR2DCDELAY is around 0x1xxx the Tx
speed can reach to ~950Mbps.

According to the I210 datasheet "8.26.1 PCIe Misc. Register - PCIEMISC",
"DMA Idle Indication" doesn't seem to tie to DMA coalesce anymore, so
set it to 1b for "DMA is considered idle when there is no Rx or Tx AND
when there are no TLPs indicating that CPU is active detected on the
PCIe link (such as the host executes CSR or Configuration register read
or write operation)" and performing Tx should also fall under "active
CPU on PCIe link" case.

In addition to that, commit b6e0c419f040 ("igb: Move DMA Coalescing init
code to separate function.") seems to wrongly changed from enabling
E1000_PCIEMISC_LX_DECISION to disabling it, also fix that.

Fixes: b6e0c419f040 ("igb: Move DMA Coalescing init code to separate function.")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220621221056.604304-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: Add support for AQR113C EPHY

Add support multi-gigabit and single-port Ethernet
PHY transceiver (AQR113C).

Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Link: https://lore.kernel.org/r/20220621034027.56508-1-vbhadram@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: pcs: lynx: use mdiodev accessors

Use the recently introduced mdiodev accessors for the lynx PCS.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1o3Zd8-002yHI-G2@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: qca8k: reduce mgmt ethernet timeout

The current mgmt ethernet timeout is set to 100ms. This value is too
big and would slow down any mdio command in case the mgmt ethernet
packet have some problems on the receiving part.
Reduce it to just 5ms to handle case when some operation are done on the
master port that would cause the mgmt ethernet to not work temporarily.

Fixes: 5950c7c0a68c ("net: dsa: qca8k: add support for mgmt read/write in Ethernet packet")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20220621151633.11741-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: qca8k: reset cpu port on MTU change

It was discovered that the Documentation lacks of a fundamental detail
on how to correctly change the MAX_FRAME_SIZE of the switch.

In fact if the MAX_FRAME_SIZE is changed while the cpu port is on, the
switch panics and cease to send any packet. This cause the mgmt ethernet
system to not receive any packet (the slow fallback still works) and
makes the device not reachable. To recover from this a switch reset is
required.

To correctly handle this, turn off the cpu ports before changing the
MAX_FRAME_SIZE and turn on again after the value is applied.

Fixes: f58d2598cf70 ("net: dsa: qca8k: implement the port MTU callbacks")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20220621151122.10220-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

isdn: mISDN: hfcsusb: drop unexpected word "the" in the comments

there is an unexpected word "the" in the comments that need to be dropped

file: ./drivers/isdn/hardware/mISDN/hfcsusb.c
line: 1560
/* set USB_SIZE_I to match the the wMaxPacketSize for ISO transfers */
changed to
/* set USB_SIZE_I to match the wMaxPacketSize for ISO transfers */

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Link: https://lore.kernel.org/r/20220621114529.108079-1-jiangjian@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ipa: remove unexpected word "the"

there is an unexpected word "the" in the comments that need to be removed

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20220621085001.61320-1-jiangjian@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

cxgb4vf: remove unexpected word "the"

there is an unexpected word "the" in the comments that need to be removed

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Link: https://lore.kernel.org/r/20220621084537.58402-1-jiangjian@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

MAINTAINERS: Add a maintainer for OCP Time Card

I've been contributing and reviewing patches for ptp_ocp driver for
some time and I'm taking care of it's github mirror. On Jakub's
suggestion, I would like to step forward and become a maintainer for
this driver. This patch adds a dedicated entry to MAINTAINERS.

Signed-off-by: Vadim Fedorenko <vadfed@fb.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/r/20220621233131.21240-1-vfedorenko@novek.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

amt: remove unnecessary (void*) conversions

Remove unnecessary void* type castings.

Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Acked-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220621021648.2544-1-yuzhe@nfschina.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'linux-kselftest-fixes-5.19-rc4' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:
"Compile time fixes and run-time resources leaks:

   - Fix clang cross compilation

   - Fix resource leak when return error

   - fix compile error for dma_map_benchmark

   - Fix regression - make use of GUP_TEST_FILE macro"

* tag 'linux-kselftest-fixes-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: make use of GUP_TEST_FILE macro
  selftests: vm: Fix resource leak when return error
  selftests dma: fix compile error for dma_map_benchmark
  selftests: Fix clang cross compilation

hinic: Replace memcpy() with direct assignment

Under CONFIG_FORTIFY_SOURCE=y and CONFIG_UBSAN_BOUNDS=y, Clang is bugged
here for calculating the size of the destination buffer (0x10 instead of
0x14). This copy is a fixed size (sizeof(struct fw_section_info_st)), with
the source and dest being struct fw_section_info_st, so the memcpy should
be safe, assuming the index is within bounds, which is UBSAN_BOUNDS's
responsibility to figure out.

Avoid the whole thing and just do a direct assignment. This results in
no change to the executable code.

[This is a duplicate of commit 2c0ab32b73cf ("hinic: Replace memcpy()
with direct assignment") which was applied to net-next.]

Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://github.com/ClangBuiltLinux/linux/issues/1592
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build
Link: https://lore.kernel.org/r/20220616052312.292861-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ALSA: hda/realtek: Add quirk for Clevo NS50PU

Fixes headset detection on Clevo NS50PU.

Signed-off-by: Tim Crawford <tcrawford@system76.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220622150017.9897-1-tcrawford@system76.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Merge tag '9p-for-5.19-rc4' of https://github.com/martinetd/linux

Pull 9pfs fixes from Dominique Martinet:
"A couple of fid refcount and fscache fixes:

   - fid refcounting was incorrect in some corner cases and would leak
     resources, only freed at umount time. The first three commits fix
     three such cases

   - 'cache=loose' or fscache was broken when trying to write a partial
     page to a file with no read permission since the rework a few
     releases ago.

     The fix taken here is just to restore old behavior of using the
     special 'writeback_fid' for such reads, which is open as root/RDWR
     and such not get complains that we try to read on a WRONLY fid.

     Long-term it'd be nice to get rid of this and not issue the read at
     all (skip cache?) in such cases, but that direction hasn't
     progressed"

* tag '9p-for-5.19-rc4' of https://github.com/martinetd/linux:
  9p: fix EBADF errors in cached mode
  9p: Fix refcounting during full path walks for fid lookups
  9p: fix fid refcount leak in v9fs_vfs_get_link
  9p: fix fid refcount leak in v9fs_vfs_atomic_open_dotl

Revert "drivers/net/ethernet/neterion/vxge: Fix a use-after-free bug in vxge-main.c"

This reverts commit 8fc74d18639a2402ca52b177e990428e26ea881f.

BAR0 is the main (only?) register bank for this device. We most
obviously can't unmap it before the netdev is unregistered.
This was pointed out in review but the patch got reposted and
merged, anyway.

The author of the patch was only testing it with a QEMU model,
which I presume does not emulate enough for the netdev to be brought
up (author's replies are not visible in lore because they kept sending
their emails in HTML).

Link: https://lore.kernel.org/all/20220616085059.680dc215@kernel.org/
Fixes: 8fc74d18639a ("drivers/net/ethernet/neterion/vxge: Fix a use-after-free bug in vxge-main.c")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'af_unix-per-netns-socket-hash'

Kuniyuki Iwashima says:

====================
af_unix: Introduce per-netns socket hash table.

This series replaces unix_socket_table with a per-netns hash table and
reduces lock contention and time on iterating over the list.

Note the 3rd-6th patches can be a single patch, but for ease of review,
they are split into small changes without breakage.

Changes:
  v3:
    6th:
      * Remove unix_table_locks from comments.
      * Remove missed spin_unlock(&unix_table_locks) in
        unix_lookup_by_ino() (kernel test robot)

  v2: https://lore.kernel.org/netdev/20220620185151.65294-1-kuniyu@amazon.com/
    3rd:
      * Update changelog
      * Remove holes from per-netns hash table structure
      * Use kvmalloc_array() instead of kmalloc() (Eric Dumazet)
      * Remove unnecessary parts in af_unix_init() (Eric Dumazet)
      * Move `err_sysctl` label into ifdef block (kernel test robot)
      * Remove struct netns_unix from struct net if CONFIG_UNIX is disabled
    4th:
      * Use spin_lock_nested() (kernel test robot)

  v1: https://lore.kernel.org/netdev/20220616234714.4291-1-kuniyu@amazon.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

af_unix: Remove unix_table_locks.

unix_table_locks are to protect the global hash table, unix_socket_table.
The previous commit removed it, so let's clean up the unnecessary locks.

Here is a test result on EC2 c5.9xlarge where 10 processes run concurrently
in different netns and bind 100,000 sockets for each.

without this series : 1m 38s
with this series : 11s

It is ~10x faster because the global hash table is split into 10 netns in
this case.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_unix: Put a socket into a per-netns hash table.

This commit replaces the global hash table with a per-netns one and removes
the global one.

We now link a socket in each netns's hash table so we can save some netns
comparisons when iterating through a hash bucket.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_unix: Acquire/Release per-netns hash table's locks.

This commit adds extra spin_lock/spin_unlock() for a per-netns
hash table inside the existing ones for unix_table_locks.

As of this commit, sockets are still linked in the global hash
table. After putting sockets in a per-netns hash table and
removing the old one in the next patch, we remove the global
locks in the last patch.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_unix: Define a per-netns hash table.

This commit adds a per netns hash table for AF_UNIX, which size is fixed
as UNIX_HASH_SIZE for now.

The first implementation defines a per-netns hash table as a single array
of lock and list:

struct unix_hashbucket {
spinlock_t lock;
struct hlist_head head;
};

struct netns_unix {
struct unix_hashbucket *hash;
...
};

But, Eric pointed out memory cost that the structure has holes because of
sizeof(spinlock_t), which is 4 (or more if LOCKDEP is enabled). [0]  It
could be expensive on a host with thousands of netns and few AF_UNIX
sockets.  For this reason, a per-netns hash table uses two dense arrays.

struct unix_table {
spinlock_t *locks;
struct hlist_head *buckets;
};

struct netns_unix {
struct unix_table table;
...
};

Note the length of the list has a significant impact rather than lock
contention, so having shared locks can be an option.  But, per-netns
locks and lists still perform better than the global locks and per-netns
lists. [1]

Also, this patch adds a change so that struct netns_unix disappears from
struct net if CONFIG_UNIX is disabled.

[0]: https://lore.kernel.org/netdev/CANn89iLVxO5aqx16azNU7p7Z-nz5NrnM5QTqOzueVxEnkVTxyg@mail.gmail.com/
[1]: https://lore.kernel.org/netdev/20220617175215.1769-1-kuniyu@amazon.com/

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_unix: Include the whole hash table size in UNIX_HASH_SIZE.

Currently, the size of AF_UNIX hash table is UNIX_HASH_SIZE * 2,
the first half for bind()ed sockets and the second half for unbound
ones.  UNIX_HASH_SIZE * 2 is used to define the table and iterate
over it.

In some places, we use ARRAY_SIZE(unix_socket_table) instead of
UNIX_HASH_SIZE * 2.  However, we cannot use it anymore because we
will allocate the hash table dynamically.  Then, we would have to
add UNIX_HASH_SIZE * 2 in many places, which would be troublesome.

This patch adapts the UNIX_HASH_SIZE definition to include bound
and unbound sockets and defines a new UNIX_HASH_MOD macro to ease
calculations.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_unix: Clean up some sock_net() uses.

Some functions define a net pointer only for one-shot use. Others call
sock_net() redundantly even when a net pointer is available. Let's fix
these and make the code simpler.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlxsw-unified-bridge-conversion-part-2'

Ido Schimmel says:

====================
mlxsw: Unified bridge conversion - part 2/6

This is the second part of the conversion of mlxsw to the unified bridge
model. Part 1 was merged in commit 4336487e30c3 ("Merge branch
'mlxsw-unified-bridge-conversion-part-1'") which includes details about
the new model and the motivation behind the conversion.

This patchset does not begin the conversion, but rather prepares the code
base for it.

Patchset overview:

Patch #1 removes an unnecessary field from one of the FID families.

Patches #2-#7 make various improvements in the layer 2 multicast code,
making it more receptive towards upcoming changes.

Patches #8-#10 prepare the CONFIG_PROFILE command for the unified bridge
model. This command will be used to enable the new model in the last
patchset.

Patches #11-#13 perform small changes in the FID code, preparing it for
upcoming changes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_fid: Implement missing operations for rFID and dummy FID

rFID and dummy FID do not support FID->VNI mapping. Currently, these
families do not implement the vni_{set, clear}() operations. Instead, there
is a check if these functions are implemented.

Similarly, 'SFMR.nve_tunnel_flood_ptr' is not relevant for rFID and dummy
FID, therefore, these families do not implement
nve_flood_index_{set, clear}().

Align the behavior to other unsupported operations, implement the functions
and just return an error or warn. Then, checks like '!ops->vni_set' can be
removed.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_fid: Use 'fid->fid_offset' when setting VNI

The previous patch added 'fid_offset' field to FID structure. Now, this
field can be used when VNI is set using SFMR register. Currently
'fid_offset' is set to zero, instead, use the new field which is now set
to zero and in the future will be changed.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_fid: Save 'fid_offset' as part of FID structure

SFMR register contains a 'fid_offset' field which is used when flooding
tables of type FID offset are used.

Currently, the driver sets this field to zero, as flooding tables of type
FID are used.

Using unified bridge model, the driver will use FID offset flooding
tables. As preparation, add 'fid_offset' to 'struct mlxsw_sp_fid'. Then,
use this field instead of passing zero to the function that configures
SFMR.

Set the new field as part of 'ops->setup()', for that, implement this
function for dummy FID and rFID.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>