review.tizen.org Git - platform/kernel/linux-starfive.git/log

net: dcb: add new common function for set/del of app/rewr entries

In preparation for the DCB rewrite, add a new function for setting and
deleting both app and rewrite entries. Moving this into a separate
function reduces duplicate code, as both types of entries require the
same set of checks. The function iterates through a configurable
nested attribute (app or rewrite attr), validates each attribute and
calls the appropriate set or delete function.

Note that this function always checks for nla_len(attr_itr) <
sizeof(struct dcb_app), which was only done in dcbnl_ieee_set and not in
dcbnl_ieee_del prior to this patch. This means that any userspace tool
that used to shove in data < sizeof(struct dcb_app) will now receive
-ERANGE.
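
As a rough userspace illustration of the shared walk-validate-dispatch
pattern described above (not the kernel code): struct attr_hdr,
struct app_entry, for_each_entry() and count_entry() are hypothetical
stand-ins for the netlink attribute, struct dcb_app and the set/delete
callbacks.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, not the kernel's netlink or DCB types. */
struct app_entry {
	uint8_t selector;
	uint8_t priority;
	uint16_t protocol;
};

struct attr_hdr {
	uint16_t len;	/* payload length in bytes */
	uint16_t type;
};

typedef int (*entry_fn)(const struct app_entry *app, void *ctx);

/* Walk every attribute in buf, validate it, then call fn (set or delete). */
static int for_each_entry(const uint8_t *buf, size_t buflen,
			  uint16_t want_type, entry_fn fn, void *ctx)
{
	size_t off = 0;

	while (buflen - off >= sizeof(struct attr_hdr)) {
		struct attr_hdr hdr;
		struct app_entry app;
		int err;

		memcpy(&hdr, buf + off, sizeof(hdr));
		off += sizeof(hdr);
		if (hdr.len > buflen - off)
			return -EINVAL;	/* truncated attribute */

		if (hdr.type == want_type) {
			/* The same length check applies to set and delete. */
			if (hdr.len < sizeof(struct app_entry))
				return -ERANGE;
			memcpy(&app, buf + off, sizeof(app));
			err = fn(&app, ctx);
			if (err)
				return err;
		}
		off += hdr.len;
	}
	return 0;
}

/* Example callback: counts the entries it is handed. */
static int count_entry(const struct app_entry *app, void *ctx)
{
	(void)app;
	++*(int *)ctx;
	return 0;
}
```

Because the callback is a parameter, one walker can serve both the set
and the delete path, which is why the length check ends up applying to
both.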

Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dcb: modify dcb_app_add to take list_head ptr as parameter

In preparation for the DCB rewrite, modify dcb_app_add to take a new
struct list_head * as a parameter, to make the list used configurable.
This is done to allow reusing the function for adding rewrite entries to
the rewrite table, which is introduced in a later patch.

Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'lan9303-phylink'

Jerry Ray says:

====================
dsa: lan9303: Move to PHYLINK

This patch series moves the lan9303 driver from phylib to the phylink
api.

Migrating to phylink means removing the .adjust_link api. The
functionality from adjust_link is moved to the phylink_mac_link_up
api. The code being removed only affected the cpu port. The other
ports on the LAN9303 do not need anything from the phylink_mac_link_up
api.

Patches:
0001 - Whitespace only change aligning the dsa_switch_ops members.
No code changes.
0002 - Moves the Turbo bit initialization out of the adjust_link api and
places it in a driver initialization execution path. It only needs
to be initialized once, it is never changed, and it is not a
per-port flag.
0003 - Adds exception handling logic in the extremely unlikely event that
the read of the device fails.
0004 - Performance optimization that skips a slow register write if there
is no need to perform it.
0005 - Change the way we identify the xMII port as phydev will be NULL
when this logic is moved into phylink_mac_link_up.
0006 - Removes adjust_link and begins using the phylink dsa_switch_ops
apis.
0007 - Adds XMII port flow control settings in the phylink_mac_link_up()
api while cleaning up the ANEG / speed / duplex implementation.
---
v6->v7:
  - Moved the initialization of the Turbo bit into lan9303_setup().
  - Added a macro for determining if a port is an XMII port.
  - Added setting the XMII flow control in the phylink_mac_link_up() API.
  - removed unnecessary error handling and cleaned up the code flow in
    phylink_mac_link_up().
v5->v6:
  - Moved to using port number to identify xMII port for the LAN9303.
v4->v5:
  - Created prep patches to better show how things migrate.
  - cleaned up comments.
v3->v4:
  - Addressed whitespace issues as a separate patch.
  - Removed port_max_mtu api patch as it is unrelated to phylink migration.
  - Reworked the implementation to preserve the adjust_link functionality
    by including it in the phylink_mac_link_up api.
v2->v3:
  Added back in disabling Turbo Mode on the CPU MII interface.
  Removed the unnecessary clearing of the phy supported interfaces.
v1->v2:
  corrected the reported mtu size, removing ETH_HLEN and ETH_FCS_LEN
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

dsa: lan9303: Add flow ctrl in link_up

While the prior patch moved the adjust_link code into the
phylink_mac_link_up api, this patch cleans it up and adds setting of the
port's flow control based on the phylink_mac_link_up input parameters.

Signed-off-by: Jerry Ray <jerry.ray@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dsa: lan9303: Migrate to PHYLINK

This patch replaces the adjust_link api with the phylink apis that provide
equivalent functionality.

The remaining functionality from the adjust_link is now covered in the
phylink_mac_link_up api.

Removes:
.adjust_link
Adds:
.phylink_get_caps
.phylink_mac_link_up

Signed-off-by: Jerry Ray <jerry.ray@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dsa: lan9303: Port 0 is xMII port

In preparing to move the adjust_link logic into the phylink_mac_link_up
api, change the macro used to check for the cpu port. In
phylink_mac_link_up, the phydev pointer passed in for the CPU port is
NULL, so we can't keep using phy_is_pseudo_fixed_link(phydev).

Signed-off-by: Jerry Ray <jerry.ray@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dsa: lan9303: write reg only if necessary

As the regmap_write() is over a slow bus that will sleep, we can speed up
the boot-up time a bit by not bothering to clear a bit that is already
clear.
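
A minimal sketch of the read-then-write-only-if-needed idea, assuming a
mock register in place of the real regmap. struct mock_reg, mock_read(),
mock_write() and clear_bit_if_set() are hypothetical names, not driver
functions.

```c
#include <stdint.h>

/* Hypothetical mock of a register behind a slow bus; counts writes. */
struct mock_reg {
	uint32_t value;
	unsigned int writes;
};

static int mock_read(struct mock_reg *reg, uint32_t *val)
{
	*val = reg->value;
	return 0;
}

static int mock_write(struct mock_reg *reg, uint32_t val)
{
	reg->value = val;
	reg->writes++;
	return 0;
}

/* Clear the bits in mask, but skip the slow write if they are clear. */
static int clear_bit_if_set(struct mock_reg *reg, uint32_t mask)
{
	uint32_t val;
	int ret;

	ret = mock_read(reg, &val);
	if (ret)
		return ret;	/* propagate a read failure */
	if (!(val & mask))
		return 0;	/* already clear, nothing to write */
	return mock_write(reg, val & ~mask);
}
```

Calling clear_bit_if_set() twice in a row performs the write only the
first time; the second call returns after the read.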

Signed-off-by: Jerry Ray <jerry.ray@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dsa: lan9303: Add exception logic for read failure

While it is highly unlikely a read will ever fail, this code fragment is
now in a function that allows us to return an error code. A read failure
here will cause the lan9303_probe to fail.

Signed-off-by: Jerry Ray <jerry.ray@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dsa: lan9303: move Turbo Mode bit init

In preparing to remove the .adjust_link api, I am moving the one-time
initialization of the device's Turbo Mode bit into a different execution
path. This code clears (disables) the Turbo Mode bit which is never used
by this driver. Turbo Mode is a non-standard mode that would allow the
100Mbps RMII interface to run at 200Mbps.

Signed-off-by: Jerry Ray <jerry.ray@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dsa: lan9303: align dsa_switch_ops members

Whitespace preparatory patch, making the dsa_switch_ops table consistent.
No code is added or removed.

Signed-off-by: Jerry Ray <jerry.ray@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mlx5-updates-2023-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-01-18

1) From Rahul,
  1.1) extended range for PTP adjtime and adjphase
  1.2) adjphase function to support hardware-only offset control

2) From Roi, code cleanup to the TC module.

3) From Maor, TC support for Geneve and GRE with VF tunnel offload

4) Cleanups and minor updates.

* tag 'mlx5-updates-2023-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Use read lock for eswitch get callbacks
  net/mlx5e: Remove redundant allocation of spec in create indirect fwd group
  net/mlx5e: Support Geneve and GRE with VF tunnel offload
  net/mlx5: E-Switch, Fix typo for egress
  net/mlx5e: Warn when destroying mod hdr hash table that is not empty
  net/mlx5e: TC, Use common function allocating flow mod hdr or encap mod hdr
  net/mlx5e: TC, Add tc prefix to attach/detach hdr functions
  net/mlx5e: TC, Pass flow attr to attach/detach mod hdr functions
  net/mlx5e: Add warning when log WQE size is smaller than log stride size
  net/mlx5e: Fail with messages when params are not valid for XSK
  net/mlx5: E-switch, Remove redundant comment about meta rules
  net/mlx5: Add hardware extended range support for PTP adjtime and adjphase
  net/mlx5: Add adjphase function to support hardware-only offset control
  net/mlx5: Suppress error logging on UCTX creation
  net/mlx5e: Suppress Send WQEBB room warning for PAGE_SIZE >= 16KB
====================

Link: https://lore.kernel.org/r/20230118183602.124323-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: fix use of uninit variable when setting PLCA config

Coverity reported the following:

*** CID 1530573:    (UNINIT)
drivers/net/phy/phy-c45.c:1036 in genphy_c45_plca_set_cfg()
1030      return ret;
1031
1032      val = ret;
1033      }
1034
1035      if (plca_cfg->node_cnt >= 0)
vvv     CID 1530573:    (UNINIT)
vvv     Using uninitialized value "val".
1036      val = (val & ~MDIO_OATC14_PLCA_NCNT) |
1037            (plca_cfg->node_cnt << 8);
1038
1039      if (plca_cfg->node_id >= 0)
1040      val = (val & ~MDIO_OATC14_PLCA_ID) |
1041            (plca_cfg->node_id);
drivers/net/phy/phy-c45.c:1076 in genphy_c45_plca_set_cfg()
1070      return ret;
1071
1072      val = ret;
1073      }
1074
1075      if (plca_cfg->burst_cnt >= 0)
vvv     CID 1530573:    (UNINIT)
vvv     Using uninitialized value "val".
1076      val = (val & ~MDIO_OATC14_PLCA_MAXBC) |
1077            (plca_cfg->burst_cnt << 8);
1078
1079      if (plca_cfg->burst_tmr >= 0)
1080      val = (val & ~MDIO_OATC14_PLCA_BTMR) |
1081            (plca_cfg->burst_tmr);

This is not actually creating a real problem because the path leading to
'val' being used uninitialized will eventually override the full content
of that variable before actually using it for writing the register.
However, the fix is simple and comes at basically no cost.
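
The merge logic Coverity flagged can be sketched roughly as follows,
with the local initialised to zero. This is an illustration under
assumptions, not the actual patch: merge_ctrl1(), NCNT_MASK and ID_MASK
are hypothetical names with an illustrative field layout, and the
current register value is passed in rather than read over MDIO.

```c
#include <stdint.h>

#define NCNT_MASK 0xff00u	/* illustrative layout: node count, high byte */
#define ID_MASK   0x00ffu	/* illustrative layout: node ID, low byte */

/*
 * Negative node_cnt or node_id means "leave that field unchanged".
 * Returns 1 and stores the value to write in *out, or 0 if there is
 * nothing to write.
 */
static int merge_ctrl1(uint16_t cur_reg, int node_cnt, int node_id,
		       uint16_t *out)
{
	uint16_t val = 0;	/* initialised, so no path reads garbage */

	if (node_cnt < 0 && node_id < 0)
		return 0;

	/* Only read back the register when one field must be preserved. */
	if (node_cnt < 0 || node_id < 0)
		val = cur_reg;

	if (node_cnt >= 0)
		val = (uint16_t)((val & ~NCNT_MASK) | ((unsigned)node_cnt << 8));
	if (node_id >= 0)
		val = (uint16_t)((val & ~ID_MASK) | (unsigned)node_id);

	*out = val;
	return 1;
}
```

When both fields are supplied the register is not read at all, and both
halves of val are overwritten before use, which matches the observation
that the uninitialised read was harmless in practice.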

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Fixes: 493323416fed ("drivers/net/phy: add helpers to get/set PLCA configuration")
Signed-off-by: Piergiorgio Beruto <piergiorgio.beruto@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/f22f1864165a8dbac8b7a2277f341bc8e7a7b70d.1674056765.git.piergiorgio.beruto@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'devlink-linecard-and-reporters-locking-cleanup'

Jiri Pirko says:

====================
devlink: linecard and reporters locking cleanup

This patchset does not change functionality.

Patches 1-2 remove linecards lock and reference counting, converting
them to be protected by devlink instance lock as the rest of
the objects.

Patches 3-4 fix the mlx5 auxiliary device devlink locking scheme which is
needed for proper reporters lock conversion done in the following
patches.

Patches 5-8 remove reporters locks and reference counting, converting
them to be protected by devlink instance lock as the rest of
the objects.

Patches 9 and 10 convert linecards and reporters dumpit callbacks to
recently introduced devlink_nl_instance_iter_dump() infra.

Patch 11 removes no longer needed devlink_dump_for_each_instance_get()
helper.

The last patch adds assertion to devl_is_registered() as dependency on
other locks is removed.
====================

Link: https://lore.kernel.org/r/20230118152115.1113149-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: add instance lock assertion in devl_is_registered()

After region and linecard lock removals, this helper is always supposed
to be called with instance lock held. So put the assertion here and
remove the comment which is no longer accurate.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: remove devlink_dump_for_each_instance_get() helper

devlink_dump_for_each_instance_get() is currently called from
a single place in netlink.c. As there is no need to use
this helper anywhere else in the future, remove it and
call devlinks_xa_find_get() directly from while loop
in devlink_nl_instance_iter_dump(). Also remove redundant
idx clear on loop end as it is already done
in devlink_nl_instance_iter_dump().

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: convert reporters dump to devlink_nl_instance_iter_dump()

Benefit from recently introduced instance iteration and convert
reporters .dumpit generic netlink callback to use it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: convert linecards dump to devlink_nl_instance_iter_dump()

Benefit from recently introduced instance iteration and convert
linecards .dumpit generic netlink callback to use it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: remove reporter reference counting

As long as the reporter life time is protected by devlink instance
lock, the reference counting is no longer needed. Remove it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: remove devl*_port_health_reporter_destroy()

Remove port-specific health reporter destroy function as it is
currently the same as the instance one so no longer needed. Inline
__devlink_health_reporter_destroy() as it is no longer called from
multiple places.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: remove reporters_lock

Similar to other devlink objects, rely on devlink instance lock
and remove object specific reporters_lock.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: protect health reporter operation with instance lock

Similar to other devlink objects, protect the reporters list
by devlink instance lock. Alongside add unlocked versions
of health reporter create/destroy functions and use them in drivers
on call paths where the instance lock is held.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Remove MLX5E_LOCKED_FLOW flag

The MLX5E_LOCKED_FLOW flag is not checked anywhere now so remove it
entirely.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Create separate devlink instance for ethernet auxiliary device

The fact that devlink instance lock is held over mlx5 auxiliary devices
probe and remove routines brought a need to conditionally take devlink
instance lock there. The code is checking a MLX5E_LOCKED_FLOW flag
in mlx5 priv struct.

This is racy and may lead to accessing devlink objects without holding
the instance lock, or to a deadlock.

To avoid this, the only lock-wise sane solution is to make the
devlink entities created by the auxiliary device independent of
the original pci devlink instance. Create a devlink instance for the
auxiliary device and put the uplink port instance there alongside
the port health reporters.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: remove linecard reference counting

As long as the linecard life time is protected by devlink instance
lock, the reference counting is no longer needed. Remove it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: remove linecards lock

Similar to other devlink objects, convert the linecards list to be
protected by devlink instance lock. Alongside that, rename the
create/destroy() functions to devl_* to indicate the devlink instance
lock needs to be held while calling them.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ethernet: ti: am65-cpsw: Handle -EPROBE_DEFER for Serdes PHY

In the am65_cpsw_init_serdes_phy() function, the error handling for the
call to the devm_of_phy_get() function misses the case where the return
value of devm_of_phy_get() is ERR_PTR(-EPROBE_DEFER). Proceeding without
handling this case will result in a crash when the "phy" pointer with
this value is dereferenced by phy_init() in am65_cpsw_enable_phy().

Fix this by adding appropriate error handling code.
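
A userspace sketch of the error-pointer pattern involved, assuming
simplified re-implementations of ERR_PTR(), PTR_ERR() and IS_ERR().
struct phy and init_serdes() are hypothetical, and treating -ENODEV as
"no SerDes PHY present" is an assumption of this sketch rather than
something stated above.

```c
#include <errno.h>
#include <stdint.h>

#define EPROBE_DEFER 517	/* kernel-internal errno, defined for the sketch */
#define MAX_ERRNO    4095

/* Simplified userspace versions of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long err)
{
	return (void *)(intptr_t)err;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct phy {			/* hypothetical stand-in for a PHY handle */
	int inited;
};

static int init_serdes(struct phy *phy)
{
	if (PTR_ERR(phy) == -ENODEV)
		return 0;	/* assumption: the PHY is optional */
	if (IS_ERR(phy))
		return (int)PTR_ERR(phy);	/* covers -EPROBE_DEFER too */

	phy->inited = 1;	/* only reached with a real pointer */
	return 0;
}
```

Checking IS_ERR() before any dereference means an encoded
-EPROBE_DEFER is returned to the caller instead of being treated as a
valid pointer.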

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: dab2b265dd23 ("net: ethernet: ti: am65-cpsw: Add support for SERDES configuration")
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Link: https://lore.kernel.org/r/20230118112136.213061-1-s-vadapalli@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: microchip: ptp: Fix error code in ksz_hwtstamp_set()

We want to return negative error codes here but the copy_to/from_user()
functions return the number of bytes remaining to be copied.
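
To illustrate the return-value convention, here is a small userspace
sketch. mock_copy() is a hypothetical stand-in that, like
copy_to_user() and copy_from_user(), returns the number of bytes it
could NOT copy; set_wrong() and set_right() are illustrative names.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical mock: copies at most 'limit' bytes and returns the
 * number of bytes left uncopied (0 on full success).
 */
static unsigned long mock_copy(void *dst, const void *src,
			       unsigned long n, unsigned long limit)
{
	unsigned long done = n < limit ? n : limit;

	memcpy(dst, src, done);
	return n - done;
}

/* Buggy shape: leaks a positive byte count to the caller as the result. */
static int set_wrong(void *dst, const void *src, unsigned long n,
		     unsigned long limit)
{
	return (int)mock_copy(dst, src, n, limit);
}

/* Fixed shape: any shortfall becomes a negative error code. */
static int set_right(void *dst, const void *src, unsigned long n,
		     unsigned long limit)
{
	if (mock_copy(dst, src, n, limit))
		return -EFAULT;
	return 0;
}
```

With an 8-byte request and only 3 bytes copyable, set_wrong() yields a
positive remainder while set_right() yields -EFAULT.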

Fixes: c59e12a140fb ("net: dsa: microchip: ptp: Initial hardware time stamping support")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/Y8fJxSvbl7UNVHh/@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-sfp-cleanup-i2c-dt-acpi-fwnode-includes'

Russell King says:

====================
net: sfp: cleanup i2c / dt / acpi / fwnode / includes

This series cleans up the DT/fwnode/ACPI code in the SFP cage driver:

1. Use the newly introduced i2c_get_adapter_by_fwnode(), which removes
the need to know about ACPI handles to find the I2C device.

2. Use device_get_match_data() to get the match data, rather than
having to look up the matching DT device_id to get at the data.

3. Rename gpio_of_names, as this is not DT specific.

4. Remove acpi.h include which is no longer necessary.

5. Remove ctype.h include which, as far as I can tell, was never
necessary.
====================

Link: https://lore.kernel.org/r/Y8fH+Vqx6huYQFDU@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: sfp: remove unused ctype.h include

An include of linux/ctype.h was added in commit 1323061a018a
("net: phy: sfp: Add HWMON support for module sensors") but nothing
was used from this header file. Remove this unnecessary include.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: sfp: remove acpi.h include

Nothing in the sfp code now references anything from the ACPI header,
everything is done via fwnode APIs, so get rid of this header.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: sfp: rename gpio_of_names[]

There's nothing DT specific about the gpio_of_names array, let's drop
the _of infix.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: sfp: use device_get_match_data()

Rather than using of_match_node() to get the matching of_device_id
to then retrieve the match data, use device_get_match_data() instead
to avoid firmware specific functions, and free the driver from having
firmware specific code.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: sfp: use i2c_get_adapter_by_fwnode()

Use the newly introduced i2c_get_adapter_by_fwnode() API, so that we
can retrieve the I2C adapter in a firmware independent manner once we
have the fwnode handle for the adapter.

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-phy-remove-probe_capabilities'

Michael Walle says:

====================
net: phy: Remove probe_capabilities

With all the drivers which used .probe_capabilities converted to the
new c45 MDIO access methods, we can now decide based upon these whether
a bus driver supports c45 and we can get rid of the not widely used
probe_capabilities.

Unfortunately, due to a now broader support of c45 scans, this will
trigger a bug on some boards with a (c22-only) Micrel PHY. These PHYs
don't ignore c45 accesses correctly, thinking they are addressed
themselves, and disrupt the MDIO access. To avoid this, a blacklist
for c45 scans is introduced.
====================

Link: https://lore.kernel.org/r/20230116-net-next-remove-probe-capabilities-v2-0-15513b05e1f4@walle.cc
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: phy: Remove probe_capabilities

Whether to probe for PHYs using C45 is now determined by whether the bus
provides the C45 read method. This makes probe_capabilities redundant,
so remove it.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: phy: Decide on C45 capabilities based on presence of method

Some PHYs provide invalid IDs in C22 space. If C45 is supported on the
bus an attempt can be made to get the IDs from the C45 space. Decide
on this based on the presence of the C45 read method in the bus
structure. This will allow the unreliable probe_capabilities to be
removed.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: mdio: scan bus based on bus capabilities for C22 and C45

Now that all MDIO bus drivers which set probe_capabilities to
MDIOBUS_C22_C45 have been converted to use the new API for C45
transactions, perform the scanning of the bus based on which methods
the bus provides.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: mdio: Add workaround for Micrel PHYs which are not C45 compatible

After scanning the bus for C22 devices, check if any Micrel PHYs have
been found. They are known to do bad things if there are C45
transactions on the bus. Prevent the scanning of the bus using C45 if
such a PHY has been detected.
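
The resulting scan order can be sketched as below, assuming a mock bus.
struct mdio_mock, c45_unsafe(), scan_bus() and SKETCH_BAD_OUI are
hypothetical; in particular SKETCH_BAD_OUI is a made-up value and not a
real vendor OUI.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_ADDR       32
#define SKETCH_BAD_OUI 0x0abcu	/* made-up value, not a real vendor OUI */

/* Hypothetical mock bus; counts how many probes each phase performs. */
struct mdio_mock {
	uint32_t c22_id[NUM_ADDR];	/* 0 means no C22 device answers */
	bool has_c45_read;		/* bus provides a C45 read method */
	unsigned int c22_probes;
	unsigned int c45_probes;
};

/* True if the PHY ID belongs to a vendor known to misbehave on C45. */
static bool c45_unsafe(uint32_t id)
{
	return (id >> 10) == SKETCH_BAD_OUI;
}

static void scan_bus(struct mdio_mock *bus)
{
	bool skip_c45 = false;
	int addr;

	/* Phase 1: C22 scan of every address. */
	for (addr = 0; addr < NUM_ADDR; addr++) {
		bus->c22_probes++;
		if (bus->c22_id[addr] && c45_unsafe(bus->c22_id[addr]))
			skip_c45 = true;
	}

	/* Phase 2: C45 scan only if supported and no unsafe PHY was seen. */
	if (!bus->has_c45_read || skip_c45)
		return;
	for (addr = 0; addr < NUM_ADDR; addr++)
		bus->c45_probes++;
}
```

Finishing the C22 pass over all addresses before issuing any C45
transaction is what lets the quirk check veto the second pass.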

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: mdio: Rework scanning of bus ready for quirks

Some C22 PHYs do bad things when there are C45 transactions on the
bus. In order to handle this, the bus needs to be scanned first for
C22 at all addresses, and then C45 scanned for all addresses.

The Marvell pxa168 driver scans a specific address on the bus to find
its PHY. This is a C22 only device, so update it to use the c22
helper.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: mdio: Move mdiobus_scan() within file

No functional change, just place it earlier in preparation for some
refactoring.

While at it, correct the comment format and one typo.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'generic-implementation-of-phy-interface-and-fixed_phy-support-for-the-lan743x-device'

Pavithra Sathyanarayanan says:

====================
generic implementation of phy interface and fixed_phy support for the LAN743x device

This patch series includes the following changes:

- Remove the unwanted interface settings in the LAN743x driver as
  it is preset in EEPROM configurations.

- Handle generic implementation for the phy interfaces for different
  devices LAN7430/31 and pci11x1x.

- Add new feature for fixed_phy support at 1Gbps full duplex for the
  LAN7431 device if a phy is not found over MDIO. Includes support for
  communication between a MAC in a LAN7431 device and custom phys
  without an MDIO interface.
====================

Link: https://lore.kernel.org/r/20230117141614.4411-1-Pavithra.Sathyanarayanan@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: lan743x: add fixed phy support for LAN7431 device

Add fixed_phy support at 1Gbps full duplex for the lan7431 device
if a phy is not found over MDIO. Tested with a MAC to MAC connection
from LAN7431 to a KSZ9893 switch. This avoids the driver open error
in LAN743x. TX delay and internal CLK125 generation are already
enabled in EEPROM.

Signed-off-by: Pavithra Sathyanarayanan <Pavithra.Sathyanarayanan@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: lan743x: add generic implementation for phy interface selection

Add logic to read the Phy interface from MAC_CR register for LAN743x
driver.

Checks for the LAN7430/31 or pci11x1x devices and the adapter
interface is updated accordingly. For LAN7431, adapter interface is set
based on Bit 19 of MAC_CR register as MII or RGMII which removes the
forced RGMII/GMII configurations in lan743x_phy_open().

Signed-off-by: Pavithra Sathyanarayanan <Pavithra.Sathyanarayanan@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: lan743x: remove unwanted interface select settings

Remove the MII/RGMII Selection settings in driver as it is preset
by the EEPROM and has the required configurations before the driver
loads for LAN743x.

Signed-off-by: Pavithra Sathyanarayanan <Pavithra.Sathyanarayanan@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

selftests/net: mv bpf/nat6to4.c to net folder

There are some issues with the bpf/nat6to4.c building.

1. It uses TEST_CUSTOM_PROGS, which will add the nat6to4.o to the
   kselftest-list file, so it is run by the common run_tests.
2. When building the test via `make -C tools/testing/selftests/
   TARGETS="net"`, the nat6to4.o will be built in the selftests/net/bpf/
   folder. But the test udpgro_frglist.sh refers to ../bpf/nat6to4.o.
   The correct path should be ./bpf/nat6to4.o.
3. When building the test via `make -C tools/testing/selftests/ TARGETS="net"
   install`, the nat6to4.o will be installed to the kselftest_install/net/
   folder. Then udpgro_frglist.sh should refer to ./nat6to4.o.

To fix the confusing test path, let's just move the nat6to4.c to net folder
and build it as TEST_GEN_FILES.

Fixes: edae34a3ed92 ("selftests net: add UDP GRO fraglist + bpf self-tests")
Tested-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20230118020927.3971864-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'enetc-bd-ring-cleanup'

Vladimir Oltean says:

====================
ENETC BD ring cleanup

The highlights of this patch set are:

- Installing a BPF program and changing PTP RX timestamping settings are
  currently implemented through a port reconfiguration procedure which
  triggers an AN restart on the PHY, and these procedures are not
  generally guaranteed to leave the port in a sane state. Patches 9/12
  and 11/12 address that.

- Attempting to put the port down (or trying to reconfigure it) meets
  resistance from the driver if it's bombarded with RX traffic
  (it won't go down). Patch 12/12 addresses that.

The other 9 patches are just cleanup in the BD ring setup/teardown code,
which gradually led to bringing the driver into a position where resolving
those 2 issues was possible.
====================

Link: https://lore.kernel.org/r/20230117230234.2950873-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: prioritize ability to go down over packet processing

napi_synchronize() from enetc_stop() waits until the softirq has
finished execution and no longer wants to be rescheduled. However under
high traffic load, this will never happen, and the interface can never
be closed.

The problem is the fact that the NAPI poll routine is written to update
the consumer index which makes the device want to put more buffers in
the RX ring, which restarts the madness again.

Browsing around, it seems that some drivers like i40e keep a bit
(__I40E_VSI_DOWN) which they use as communication between the control
path and the data path. But that isn't my first choice, because
complications ensue - since the enetc hardirq may trigger while we are
in a theoretical ENETC_DOWN state, it may happen that enetc_msix() masks
it, but enetc_poll() never unmasks it. To prevent a stall in that case,
one would need to schedule all NAPI instances when ENETC_DOWN gets
cleared, to process what's pending.

I find it more desirable for the control path - enetc_stop() - to just
quiesce the RX ring and let the softirq finish what remains there,
without any explicit communication, just by making hardware not provide
any more packets.

This seems possible with the Enable bit of the RX BD ring (RBaMR[EN]).
I can't seem to find an exact definition of what this bit does, but when
the RX ring is disabled, the port seems to no longer update the producer
index, and not react to software updates of the consumer index.

In fact, the RBaMR[EN] bit is already toggled by the driver, but too
late for what we want:

enetc_close()
-> enetc_stop()
-> napi_synchronize()
-> enetc_clear_bdrs()
-> enetc_clear_rxbdr()

The enetc_clear_bdrs() function contains not only logic to disable the
RX and TX rings, but also logic to wait for the TX ring to stop being busy.

We split enetc_clear_bdrs() into enetc_disable_bdrs() and
enetc_wait_bdrs(). One needs to run before napi_synchronize() and the
other after (NAPI also processes TX completions, so we maximize our
chances of not waiting for the ENETC_TBSR_BUSY bit - unless a packet is
stuck for some reason, ofc).

We also split off enetc_enable_bdrs() from enetc_setup_bdrs(), and call
this from the mirror position in enetc_start() compared to enetc_stop(),
i.e. right before netif_tx_start_all_queues().
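
The intended ordering can be sketched with stub functions that only
record when they run. All names below (disable_bdrs(), napi_sync(),
wait_bdrs(), enable_bdrs(), start_queues(), sketch_stop(),
sketch_start()) are hypothetical stand-ins for the driver and core
functions named above; the stubs do no real work.

```c
#include <string.h>

/* Records the order in which the stub steps are called. */
static char trace[64];

static void step(const char *name)
{
	strcat(trace, name);
	strcat(trace, ";");
}

static void disable_bdrs(void) { step("disable"); }	/* clear ring enable bits */
static void napi_sync(void)    { step("sync"); }	/* wait for poll to finish */
static void wait_bdrs(void)    { step("wait"); }	/* wait for TX ring idle */
static void enable_bdrs(void)  { step("enable"); }
static void start_queues(void) { step("start_queues"); }

/* Stop path: quiesce hardware first so the poll loop can drain and exit. */
static void sketch_stop(void)
{
	trace[0] = '\0';
	disable_bdrs();
	napi_sync();
	wait_bdrs();
}

/* Start path: rings are enabled right before the TX queues are started. */
static void sketch_start(void)
{
	trace[0] = '\0';
	enable_bdrs();
	start_queues();
}
```

After sketch_stop(), trace holds the three steps in the order
disable, sync, wait, which mirrors the split described above.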

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: set up XDP program under enetc_reconfigure()

Offloading a BPF program to the RX path of the driver suffers from the
same problems as the PTP reconfiguration - improper error checking can
leave the driver in an invalid state, and the link on the PHY is lost.

Reuse the enetc_reconfigure() procedure, but here, we need to run some
code in the middle of the ring reconfiguration procedure - while the
interface is still down. Introduce a callback which makes that possible.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: rename "xdp" and "dev" in enetc_setup_bpf()

Follow the convention from this driver, which is to name "struct
net_device *" as "ndev", and the convention from other drivers, to name
"struct netdev_bpf *" as "bpf".

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: implement ring reconfiguration procedure for PTP RX timestamping

The crude enetc_stop() -> enetc_open() mechanism suffers from 2
problems:

1. improper error checking
2. it involves phylink_stop() -> phylink_start() which loses the link

Right now, the driver is prepared to offer a better alternative: a ring
reconfiguration procedure which takes the RX BD size (normal or
extended) as argument. It allocates new resources (failing if that
fails), stops the traffic, and assigns the new resources to the rings.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: move phylink_start/stop out of enetc_start/stop

We want to introduce a fast interface reconfiguration procedure, which
involves temporarily stopping the rings.

But we want enetc_start() and enetc_stop() to not restart PHY autoneg,
because that can take a few seconds until it completes again.

So we need part of enetc_start() and enetc_stop(), but not all of them.
Move phylink_start() right next to phylink_of_phy_connect(), and
phylink_stop() right next to phylink_disconnect_phy(), both still in
ndo_open() and ndo_stop().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: split ring resource allocation from assignment

We have a few instances in the enetc driver where the ring resources
(BD ring iomem, software BD ring, software TSO headers, basically
everything except RX buffers) need to be reallocated. For example, when
RX timestamping is enabled, the RX BD format changes to an extended one
(twice as large).

Currently, this is done using a simplistic enetc_close() -> enetc_open()
procedure. But this is quite crude, since it also invokes phylink_stop()
-> phylink_start(), the link is lost, and a few seconds need to pass for
autoneg to complete again.

In fact it's bad also due to the improper (yolo) error checking. In case
we fail to allocate new resources, we've already freed the old ones, so
the interface is more or less stuck.

To avoid that, we need a system where reconfiguration is possible in a
way in which resources are allocated upfront. This means that there will
be a higher memory usage temporarily, but the assignment of resources to
rings can be done when both the old and new resources are still available.

Introduce a struct enetc_bdr_resource which holds the resources for a
ring, be it RX or TX. This structure duplicates a lot of fields from
struct enetc_bdr (and access to the same fields in the ring structure
was left duplicated, to not change cache characteristics in the fast
path).

When enetc_alloc_tx_resources() runs, it returns an array of resource
elements (one per TX ring), in addition to the existing priv->tx_res.
To populate priv->tx_res with that array, one must call
enetc_assign_tx_resources(), and this also frees the old resources.
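
A minimal sketch of the allocate-then-assign split, under stated assumptions: struct toy_ring_res, toy_alloc_res(), toy_free_res() and toy_assign_res() are hypothetical stand-ins, and TOY_BASE_BD_SIZE is an illustrative value rather than the hardware's BD size. The only properties taken from the text are that allocation happens before the old resources are released, and that the extended BD format is twice as large.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_BASE_BD_SIZE 16	/* assumed size of a normal BD */

/* Hypothetical per-ring resource element. */
struct toy_ring_res {
	void *bd_base;		/* BD ring memory */
	size_t bd_size;		/* size of one buffer descriptor */
	size_t bd_count;
};

static void toy_free_res(struct toy_ring_res *res, size_t num_rings)
{
	size_t i;

	if (!res)
		return;
	for (i = 0; i < num_rings; i++)
		free(res[i].bd_base);
	free(res);
}

/* Allocate one element per ring. On failure, everything allocated so
 * far is released and the caller's current resources are untouched. */
static struct toy_ring_res *toy_alloc_res(size_t num_rings, size_t bd_count,
					  bool extended)
{
	struct toy_ring_res *res = calloc(num_rings, sizeof(*res));
	size_t i;

	if (!res)
		return NULL;

	for (i = 0; i < num_rings; i++) {
		res[i].bd_size = extended ? 2 * TOY_BASE_BD_SIZE
					  : TOY_BASE_BD_SIZE;
		res[i].bd_count = bd_count;
		res[i].bd_base = calloc(bd_count, res[i].bd_size);
		if (!res[i].bd_base) {
			toy_free_res(res, num_rings); /* free(NULL) is a no-op */
			return NULL;
		}
	}
	return res;
}

/* Install the new array and free the old one. */
static void toy_assign_res(struct toy_ring_res **cur,
			   struct toy_ring_res *new_res, size_t num_rings)
{
	toy_free_res(*cur, num_rings);
	*cur = new_res;
}
```

Because toy_alloc_res() either returns a complete array or NULL, a caller can bail out on NULL while the rings still point at the old, valid resources.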

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: bring "bool extended" to top-level in enetc_open()

Extended RX buffer descriptors are necessary if they carry RX
timestamps, which will be true when PTP timestamping is enabled.

Right now, the rx_ring->ext_en is set from the function that allocates
ring resources (enetc_alloc_rx_resources() -> enetc_alloc_rxbdr()), and
also used later, in enetc_setup_rxbdr(). It is also used in the
enetc_rxbd() and enetc_rxbd_next() fast path helpers.

We want to decouple resource allocation from BD ring setup, but both
procedures depend on BD size (extended or not). Move the "extended"
boolean to enetc_open() and pass it both to the RX allocation procedure
as well as to the RX ring setup procedure. The latter will set
rx_ring->ext_en from now on.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: drop redundant enetc_free_tx_frame() call from enetc_free_txbdr()

The call path in enetc_close() is:

enetc_close()
-> enetc_free_rxtx_rings()
   -> enetc_free_tx_ring()
      -> enetc_free_tx_frame()
-> enetc_free_tx_resources()
   -> enetc_free_txbdr()
      -> enetc_free_tx_frame()

The enetc_free_tx_frame() function is written such that the second call
exits without doing anything, but nonetheless, it is completely
redundant. Delete it. This makes the TX teardown path more similar to
the RX one, where rx_swbd freeing is done in enetc_free_rx_ring(), not
in enetc_free_rxbdr().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: rx_swbd and tx_swbd are never NULL in enetc_free_rxtx_rings()

The call path in enetc_close() is:

enetc_close()
-> enetc_free_rxtx_rings()
   -> enetc_free_rx_ring()
      -> tests whether rx_ring->rx_swbd is NULL
   -> enetc_free_tx_ring()
      -> tests whether tx_ring->tx_swbd is NULL
-> enetc_free_rx_resources()
   -> enetc_free_rxbdr()
      -> sets rxr->rx_swbd to NULL
-> enetc_free_tx_resources()
   -> enetc_free_txbdr()
      -> sets txr->tx_swbd to NULL

From the above, it is clear that due to the function ordering, the
checks for NULL are redundant, since the software buffer descriptor
arrays have not yet been set to NULL. Drop these checks.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: create enetc_dma_free_bdr()

This is a refactoring change which introduces the opposite function of
enetc_dma_alloc_bdr().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: set up RX ring indices from enetc_setup_rxbdr()

There is only one place which needs to set up indices in the RX ring.
Be consistent with what was done in the TX path and do this in
enetc_setup_rxbdr().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: set next_to_clean/next_to_use just from enetc_setup_txbdr()

enetc_alloc_txbdr() deals with allocating resources necessary for a TX
ring to work (the array of software BDs and the array of TSO headers).

The next_to_clean and next_to_use pointers are overwritten with proper
values which are read from hardware here:

enetc_open
-> enetc_alloc_tx_resources
   -> enetc_alloc_txbdr
      -> set to zero
-> enetc_setup_bdrs
   -> enetc_setup_txbdr
      -> read from hardware

So their initialization with zeroes is pointless and confusing.
Delete it.

Consequently, since enetc_setup_txbdr() has no opposite cleanup
function, also delete the resetting of these indices from
enetc_free_tx_ring().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Use read lock for eswitch get callbacks

In commit 367dfa121205 ("net/mlx5: Remove devl_unlock from
mlx5_eswtich_mode_callback_enter") all functions were converted
to use write lock without relation to their actual purpose.

Change the devlink eswitch getters to use read and not write locks.

Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Remove redundant allocation of spec in create indirect fwd group

mlx5_add_flow_rules() supports creating rules without any matches: when a
NULL pointer is passed instead of a spec, it uses a static empty spec.
This makes the allocation of a spec in mlx5_create_indir_fwd_group()
unnecessary.
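
The "NULL means empty spec" convention can be illustrated with a small sketch. The names struct toy_spec and toy_effective_spec() are hypothetical and the struct layout is invented for illustration; this is not the mlx5 implementation, only the general pattern of substituting a static zeroed object for a NULL argument.

```c
#include <stddef.h>

/* Hypothetical match specification; all-zero means "match everything". */
struct toy_spec {
	unsigned char match_criteria[16];
};

/* Return the caller's spec, or a shared static empty one for NULL. */
static const struct toy_spec *toy_effective_spec(const struct toy_spec *spec)
{
	static const struct toy_spec empty_spec = { { 0 } };

	return spec ? spec : &empty_spec;
}
```

With such a fallback in the callee, a caller that wants a match-all rule has no reason to allocate and zero a spec of its own.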

Remove the redundant allocation.

Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Support Geneve and GRE with VF tunnel offload

Today, VF tunnel offload (where the tunnel endpoint is on a VF) is
implemented with an indirect table whose rules match on the VXLAN VNI
and recirculate to the root table. This limits support to VXLAN
tunnels only.

This patch changes the indirect table to use a single match-all rule
that recirculates to the root table. The rule is added whenever a tunnel
decap rule is added and the tunnel endpoint is a VF. This allows Geneve
and GRE to be supported in this configuration.

Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: E-Switch, Fix typo for egress

Fix engress to egress.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Warn when destroying mod hdr hash table that is not empty

To help catch memory leaks, add a warning when the mod hdr hash table is
destroyed while it is not empty.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: TC, Use common function allocating flow mod hdr or encap mod hdr

Use mlx5e_tc_attach_mod_hdr() when allocating encap mod hdr and
remove mlx5e_tc_add_flow_mod_hdr() which is not being used now.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: TC, Add tc prefix to attach/detach hdr functions

Currently there are confusing names for attach/detach functions.

mlx5e_attach_mod_hdr() vs mlx5e_mod_hdr_attach()
mlx5e_detach_mod_hdr() vs mlx5e_mod_hdr_detach()

Add a tc prefix to the functions that are in en_tc.c to separate them
from the functions in mod_hdr.c, which have the mod_hdr prefix.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: TC, Pass flow attr to attach/detach mod hdr functions

In preparation for removing the duplicate functions that handle mod hdr
allocation, and because the modify hdr should be per flow attr and not
per flow, pass the flow attr to the attach and detach mod hdr functions.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Add warning when log WQE size is smaller than log stride size

Add warning macro in the function mlx5e_mpwqe_get_log_num_strides()
when log WQE size is smaller than log stride size. Theoretically this
should not happen.

Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Fail with messages when params are not valid for XSK

Current XSK prerequisites validation implementation
(setup.c/mlx5e_validate_xsk_param()) fails silently when xsk
prerequisites are not fulfilled.
Add error messages to the kernel log to help the user understand what
went wrong when params are not valid for XSK.

Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: E-switch, Remove redundant comment about meta rules

Meta rules are created/destroyed per vport and not in eswitch
init/destroy.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Add hardware extended range support for PTP adjtime and adjphase

Capable hardware can use an extended range for offsetting the clock. An
extended range of [-200000,200000] is used instead of [-32768,32767] for
the delta/phase parameter of the adjtime/adjphase ptp_clock_info callbacks.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Add adjphase function to support hardware-only offset control

The adjtime function supports using hardware to set the clock offset when
the delta was supported by the hardware. When the delta is not supported by
the hardware, the driver handles adjusting the clock. The newly-introduced
adjphase function is similar to the adjtime function, except it guarantees
that a provided clock offset will be used directly by the hardware to
adjust the PTP clock. When the range is not acceptable by the hardware, an
error is returned.
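
The difference between the two callbacks can be sketched with a toy clock. This is an illustration only: struct toy_clock and the toy_* functions are hypothetical, and the two ranges are the ones quoted in the extended-range entry earlier in this log. The sketch shows the contract described above: adjtime falls back to the driver when the delta is outside the hardware range, while adjphase returns an error.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clock model. */
struct toy_clock {
	bool extended_range;	/* hardware capability */
	int64_t hw_offset;	/* total offset applied by hardware */
	int64_t sw_offset;	/* total offset applied by the driver */
};

static bool toy_hw_supports_delta(const struct toy_clock *c, int64_t delta)
{
	if (c->extended_range)
		return delta >= -200000 && delta <= 200000;
	return delta >= -32768 && delta <= 32767;
}

/* adjtime-like: use hardware when possible, else adjust in the driver. */
static int toy_adjtime(struct toy_clock *c, int64_t delta)
{
	if (toy_hw_supports_delta(c, delta))
		c->hw_offset += delta;
	else
		c->sw_offset += delta;
	return 0;
}

/* adjphase-like: hardware only, out-of-range deltas are rejected. */
static int toy_adjphase(struct toy_clock *c, int32_t delta)
{
	if (!toy_hw_supports_delta(c, delta))
		return -ERANGE;
	c->hw_offset += delta;
	return 0;
}
```

For example, a delta of 40000 is rejected by toy_adjphase() on a clock without the extended range, but accepted once extended_range is set.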

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Suppress error logging on UCTX creation

Suppress error logging that can be triggered by userspace upon DEVX UCTX
creation.

The existing uid check that suppresses DEVX error logging does not cover
this case, because the MLX5_CMD_OP_CREATE_UCTX command does not yet have
a uid: it is the command that creates it.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Suppress Send WQEBB room warning for PAGE_SIZE >= 16KB

Send WQEBB size is 64 bytes and the max number of WQEBBs for an SQ is 255.
For 16KB pages and greater, there is always sufficient space for all
WQEBBs of an SQ. Cast mlx5e_get_max_sq_wqebbs(mdev) to u16. This prevents
-Wtautological-constant-out-of-range-compare warnings from occurring when
PAGE_SIZE >= 16KB.
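
The arithmetic behind this claim can be checked with a short sketch. The two constants come from the text above; the helper names are hypothetical and this is not the driver's actual check.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_SEND_WQEBB_SIZE	64u	/* bytes per Send WQEBB */
#define TOY_MAX_SQ_WQEBBS	255u	/* max WQEBBs for an SQ */

/* Bytes needed for the maximum number of WQEBBs: 64 * 255 = 16320. */
static size_t toy_max_sq_wqebb_bytes(void)
{
	return (size_t)TOY_SEND_WQEBB_SIZE * TOY_MAX_SQ_WQEBBS;
}

/* Number of WQEBBs that fit in one page of the given size. */
static size_t toy_wqebbs_per_page(size_t page_size)
{
	return page_size / TOY_SEND_WQEBB_SIZE;
}

/* True when a page can always hold the maximum number of WQEBBs. */
static bool toy_page_always_fits(size_t page_size)
{
	return page_size >= toy_max_sq_wqebb_bytes();
}
```

A 16KB page holds 16384 / 64 = 256 WQEBBs, which is more than the maximum of 255, so a "not enough room" comparison can never be true at that page size.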

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

xdp: document xdp_do_flush() before napi_complete_done()

Document in the XDP_REDIRECT manual section that drivers must call
xdp_do_flush() before napi_complete_done(). The two reasons behind
this can be found following the links below.
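
The required ordering can be sketched with stand-in functions. Everything prefixed with stub_ or toy_ below is hypothetical and only records the order of calls; the real xdp_do_flush() and napi_complete_done() are kernel functions with their own semantics, which this sketch does not reproduce.

```c
#include <stdbool.h>

enum toy_call { TOY_CALL_FLUSH = 1, TOY_CALL_COMPLETE = 2 };

static enum toy_call toy_call_log[8];
static int toy_call_count;

static void toy_record(enum toy_call c)
{
	if (toy_call_count < 8)
		toy_call_log[toy_call_count++] = c;
}

/* Stand-ins for the two kernel calls named above. */
static void stub_xdp_do_flush(void)
{
	toy_record(TOY_CALL_FLUSH);
}

static bool stub_napi_complete_done(int work_done)
{
	(void)work_done;
	toy_record(TOY_CALL_COMPLETE);
	return true;
}

/* Poll-routine skeleton: flush redirected frames before completing. */
static int toy_poll(int budget, int pending, bool redirected)
{
	int work_done = pending < budget ? pending : budget;

	if (redirected)
		stub_xdp_do_flush();

	if (work_done < budget)
		stub_napi_complete_done(work_done);

	return work_done;
}
```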

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sparx5-vcap-improve-locking'

Steen Hegelund says:

====================
sparx5: Improve locking in the VCAP API

This improves the VCAP cache and the VCAP rule list protection against
access from different sources.

The VCAP Admin lock protects the list of rules for the VCAP instance as
well as the cache used for encoding and decoding rules.

This series provides dedicated functions for accessing rule statistics,
decoding rule content, verifying if a rule exists and getting a rule with
the lock held, as well as ensuring the use of the lock when the list of
rules or the cache is accessed.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: microchip: sparx5: Add lock initialization to the KUNIT tests

Ensure that the KUNIT tests lock instance is initialized before the test is
executed.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: microchip: sparx5: Improve VCAP admin locking in the VCAP API

This improves the VCAP cache and the VCAP rule list protection against
access from different sources.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: microchip: sparx5: Add VCAP admin locking in debugFS

This ensures that the admin lock is taken before the debugFS functions
start iterating the VCAP rules.
It also adds a separate function to decode a rule, which expects the lock
to have been taken before it is called.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: microchip: sparx5: Add support to check for existing VCAP rule id

Add a new function that just checks if the VCAP rule id is already used by
an existing rule.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: microchip: sparx5: Add support for rule count by cookie

This adds support for TC clients to get the packet count for a TC filter
identified by its cookie.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: avoid to change cfg for all devices

The rtl8152_cfgselector_probe() should set the USB configuration to the
vendor mode only for the devices which the driver (r8152) supports.
Otherwise, no driver would be used for such devices.

Fixes: ec51fbd1b8a2 ("r8152: add USB device driver for config selection")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: simplify TX timestamp handling

This driver was capturing the TX timestamp values from the TX ring
during the TX completion path, but deferring the actual packet TX
timestamp updating to a workqueue. There does not seem to be much of a
reason for this with the current state of the driver. Simplify this to
just do the TX timestamping as part of the TX completion path, to avoid
the need for the extra timestamp buffer and workqueue.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: Reuse buffer free function

The virtnet_rq_free_unused_buf() helper function for freeing the buffer
already exists. Avoid code duplication by reusing it.

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./linux/kernel/git/netfilter/nf-next

Florian Westphal says:

====================
Netfilter updates for net-next

The following patch set includes netfilter updates for your *net-next* tree.

1. Replace pr_debug use with nf_log infra for debugging in sctp
   conntrack.
2. Remove pr_debug calls, they are either useless or we have better
   options in place.
3. Avoid repeated load of ct->status in some spots.
   Some bit-flags cannot change during the lifetime of
   a connection, so no need to re-fetch those.
4. Avoid unneeded nesting of rcu_read_lock during tuple lookup.
5. Remove the CLUSTERIP target.  Marked as obsolete for years,
   and we still have WARN splats wrt. races of the out-of-band
   /proc interface installed by this target.
6. Add static key to nf_tables to avoid the retpoline mitigation
   if/else if cascade provided the cpu doesn't need the retpoline thunk.
7. Add nf_tables objref calls to the retpoline mitigation workaround.
8. Split parts of nft_ct.c that do not need symbols exported by
   the conntrack modules and place them in nf_tables directly.
   This avoids an indirect call for 'ct status' checks.
9. Add 'destroy' commands to nf_tables.  They are identical
   to the existing 'delete' commands, but do not indicate
   an error if the referenced object (set, chain, rule...)
   did not exist, from Fernando.
====================
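
The difference between 'delete' and 'destroy' described in item 9 can be sketched with a toy object table. The names below (struct toy_table, toy_delete(), toy_destroy()) are hypothetical and unrelated to the nf_tables netlink API; the sketch only models the error behaviour for a missing object.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_OBJS 8

/* Hypothetical table of named objects; NULL marks a free slot. */
struct toy_table {
	const char *names[TOY_MAX_OBJS];
};

static int toy_remove(struct toy_table *t, const char *name,
		      bool ignore_missing)
{
	size_t i;

	for (i = 0; i < TOY_MAX_OBJS; i++) {
		if (t->names[i] && strcmp(t->names[i], name) == 0) {
			t->names[i] = NULL;
			return 0;
		}
	}
	return ignore_missing ? 0 : -ENOENT;
}

/* 'delete': a missing object is an error. */
static int toy_delete(struct toy_table *t, const char *name)
{
	return toy_remove(t, name, false);
}

/* 'destroy': same removal, but a missing object is not an error. */
static int toy_destroy(struct toy_table *t, const char *name)
{
	return toy_remove(t, name, true);
}
```

This makes 'destroy' convenient for idempotent teardown scripts, which can run it unconditionally without checking for the object first.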

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tsnep-xdp-support'

Gerhard Engleder says:

====================
tsnep: XDP support

Implement XDP support for tsnep driver. I tried to follow existing
drivers like igb/igc as far as possible. Some prework was already done
in previous patch series, so in this series only actual XDP stuff is
included.

Thanks for the NetDev 0x14 slides "Add XDP support on a NIC driver".

Some commits contain changes not related to XDP but found during review
of XDP support patches.

v5:
- fix spelling of 'subtract' in commit message (Alexander Duyck)
- call txq_trans_cond_update() only if TX is complete (Alexander Duyck)
- remove const from static functions (Alexander Duyck)
- replace TX spin_lock with __netif_tx_lock (Alexander Duyck)
- use xdp_return_frame_rx_napi() instead of xdp_return_frame_bulk() (Alexander Duyck)
- eliminate __TSNEP_DOWN (Alexander Duyck)
- introduce single function for xdp_rxq and napi init (Alexander Duyck)
- use TX queue of pair instead of expensive processor id modulo for XDP_TX (Alexander Duyck)
- eliminate processor id modulo in tsnep_netdev_xdp_xmit (Alexander Duyck)
- use bitmap for TX type and add fragment type (Alexander Duyck)
- always use XDP_PACKET_HEADROOM and DMA_BIDIRECTIONAL

v4:
- remove process context from spin_lock_bh commit message (Alexander Lobakin)
- move tsnep_adapter::state to prevent 4 byte hole (Alexander Lobakin)
- braces for bitops in combination logical ops (Alexander Lobakin)
- make various pointers const (Alexander Lobakin)
- '!i' instead of 'i == 0' (Alexander Lobakin)
- removed redundant braces (Alexander Lobakin)
- squash variables into same line if same type (Alexander Lobakin)
- use fact that ::skb and ::xdpf use same slot for simplification (Alexander Lobakin)
- use u32 for smp_processor_id() (Alexander Lobakin)
- don't add $(tsnep-y) to $(tsnep-objs) (Alexander Lobakin)
- use rev xmas tree in tsnep_netdev_open() (Alexander Lobakin)
- do not move tsnep_queue::napi (Alexander Lobakin)
- call xdp_init_buff() only once (Alexander Lobakin)
- get nq and tx only once for XDP TX (Alexander Lobakin)
- move XDP BPF program setup to end of patch series (Alexander Lobakin)
- check for XDP state change and prevent redundant down-ups (Alexander Lobakin)
- access tsnep_adapter::xdp_prog only with READ_ONCE in RX path (Alexander Lobakin)
- forward NAPI budget to napi_consume_skb() (Alexander Lobakin)
- fix errno leftover in tsnep_xdp_xmit_back() (Dan Carpenter)
- eliminate tsnep_xdp_is_enabled() by setting RX offset during init

v3:
- use spin_lock_bh for TX (Paolo Abeni)
- add comment for XDP TX descriptor available check (Maciej Fijalkowski)
- return value bool for tsnep_xdp_xmit_frame_ring() (Saeed Mahameed)
- do not print DMA mapping error (Saeed Mahameed)
- use reverse xmas tree variable declaration (Saeed Mahameed)
- move struct xdp_rxq_info to end of struct tsnep_rx (Maciej Fijalkowski)
- check __TSNEP_DOWN flag on close to prevent double free (Saeed Mahameed)
- describe TSNEP_RX_INLINE_METADATA_SIZE in comment (Maciej Fijalkowski)
- subtract TSNEP_RX_INLINE_METADATA_SIZE after DMA sync (Maciej Fijalkowski)
- use enum tsnep_tx_type for tsnep_xdp_tx_map (Saeed Mahameed)
- use nxmit as loop iterator in tsnep_netdev_xdp_xmit (Saeed Mahameed)
- stop netdev in tsnep_netdev_close() which is called during BPF prog setup

v2:
- move tsnep_xdp_xmit_back() to commit where it is used (Paolo Abeni)
- remove inline from tsnep_rx_offset() (Paolo Abeni)
- remove inline from tsnep_rx_offset_xdp() (Paolo Abeni)
- simplify tsnep_xdp_run_prog() call by moving xdp_status update to it (Paolo Abeni)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tsnep: Support XDP BPF program setup

Implement setup of BPF programs for XDP RX path with command
XDP_SETUP_PROG of ndo_bpf(). This is the final step for XDP RX path
support.

There is no need to reinit the RX queues as they are always prepared for
XDP.

Additionally remove $(tsnep-y) from $(tsnep-objs) because it is added
automatically.

Test results with A53 1.2GHz:

XDP_DROP (samples/bpf/xdp1)
proto 17:     883878 pkt/s

XDP_TX (samples/bpf/xdp2)
proto 17:     255693 pkt/s

XDP_REDIRECT (samples/bpf/xdpsock)
sock0@eth2:0 rxdrop xdp-drv
                   pps            pkts           1.00
rx                 855,582        5,404,523
tx                 0              0

XDP_REDIRECT (samples/bpf/xdp_redirect)
eth2->eth1         613,267 rx/s   0 err,drop/s   613,272 xmit/s

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tsnep: Add XDP RX support

If a BPF program is set up, then run it for every received frame and
execute the selected action.
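
A rough sketch of "run the program for every frame and act on the verdict" follows. All toy_* names are hypothetical; the enum mirrors the order of the kernel's XDP actions (aborted, drop, pass, TX, redirect), but the program here is an ordinary C function pointer rather than a BPF program, and the actions are only counted.

```c
#include <stddef.h>

enum toy_xdp_action {
	TOY_XDP_ABORTED,
	TOY_XDP_DROP,
	TOY_XDP_PASS,
	TOY_XDP_TX,
	TOY_XDP_REDIRECT,
};

struct toy_frame {
	const unsigned char *data;
	size_t len;
};

typedef enum toy_xdp_action (*toy_prog)(const struct toy_frame *f);

struct toy_stats {
	unsigned int passed, dropped, sent, redirected;
};

/* Run the program (if any) on each frame and execute its verdict. */
static void toy_rx(const struct toy_frame *frames, size_t n, toy_prog prog,
		   struct toy_stats *st)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Without a program, every frame takes the normal path. */
		enum toy_xdp_action act = prog ? prog(&frames[i])
					       : TOY_XDP_PASS;

		switch (act) {
		case TOY_XDP_PASS:
			st->passed++;
			break;
		case TOY_XDP_TX:
			st->sent++;
			break;
		case TOY_XDP_REDIRECT:
			st->redirected++;
			break;
		case TOY_XDP_ABORTED:
		case TOY_XDP_DROP:
		default:
			st->dropped++;	/* unknown verdicts are dropped */
			break;
		}
	}
}

/* Example program: drop frames shorter than 60 bytes. */
static enum toy_xdp_action toy_drop_short(const struct toy_frame *f)
{
	return f->len < 60 ? TOY_XDP_DROP : TOY_XDP_PASS;
}
```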

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tsnep: Add RX queue info for XDP support

Register xdp_rxq_info with page_pool memory model. This is needed for
XDP buffer handling.

Additionally fix error path by removing call of tsnep_phy_close() after
failed tsnep_phy_open().

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tsnep: Prepare RX buffer for XDP support

Always reserve XDP_PACKET_HEADROOM in front of the RX buffer. Similarly, the
DMA direction is always set to DMA_BIDIRECTIONAL. This eliminates the need
for RX queue reconfiguration during BPF program setup. The RX queue is
always prepared for XDP.

No negative impact of DMA_BIDIRECTIONAL was measured.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tsnep: Subtract TSNEP_RX_INLINE_METADATA_SIZE once

Subtract size of metadata in front of received data only once. This
simplifies the RX code.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tsnep: Add XDP TX support

Implement ndo_xdp_xmit() for XDP TX support. Support for fragmented XDP
frames is included.

Also some braces and logic cleanups are done in normal TX path to keep
both TX paths in sync.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tsnep: Do not print DMA mapping error

Printing in data path shall be avoided. DMA mapping error is already
counted in stats so printing is not necessary.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tsnep: Forward NAPI budget to napi_consume_skb()

NAPI budget must be forwarded to napi_consume_skb(). It is used to
detect non-NAPI context.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tsnep: Replace TX spin_lock with __netif_tx_lock

TX spin_lock can be eliminated, because the normal TX path is already
protected with __netif_tx_lock and this lock can be used for access to
queue outside of normal TX path too.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ftmac100: handle netdev flags IFF_PROMISC and IFF_ALLMULTI

When netdev->flags has IFF_PROMISC or IFF_ALLMULTI, set the
corresponding bits in the MAC Control Register (MACCR).
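
The flag-to-register mapping can be sketched as below. The TOY_IFF_* values match the Linux IFF_PROMISC and IFF_ALLMULTI flag values, but the TOY_MACCR_* names and bit positions are placeholders chosen for illustration, not the ftmac100 register layout.

```c
#include <stdint.h>

#define TOY_IFF_PROMISC		0x100u	/* same value as IFF_PROMISC */
#define TOY_IFF_ALLMULTI	0x200u	/* same value as IFF_ALLMULTI */

/* Placeholder bit positions, NOT the real MACCR layout. */
#define TOY_MACCR_RCV_ALL	(1u << 0)	/* receive all frames */
#define TOY_MACCR_RX_MULTIPKT	(1u << 1)	/* receive all multicast */

/* Derive the receive-filter bits of the control register from flags. */
static uint32_t toy_maccr_from_flags(uint32_t maccr, unsigned int flags)
{
	/* Start from a known state so cleared flags clear their bits. */
	maccr &= ~(TOY_MACCR_RCV_ALL | TOY_MACCR_RX_MULTIPKT);

	if (flags & TOY_IFF_PROMISC)
		maccr |= TOY_MACCR_RCV_ALL;
	if (flags & TOY_IFF_ALLMULTI)
		maccr |= TOY_MACCR_RX_MULTIPKT;

	return maccr;
}
```

Clearing both bits first means the same helper handles turning promiscuous or all-multicast mode off again.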

This change is based on code from the ftgmac100 driver, see
ftgmac100_start_hw() in ftgmac100.c

Signed-off-by: Sergei Antonov <saproj@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Remove extra counter pull before gc

Per-CPU entries are no longer taken into account when deciding
whether to run gc. Remove the extra pull of the per-CPU entries
and directly check the time before performing gc.

Signed-off-by: Tanmay Bhushan <007047221b@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'am65-cpts-PPS'

Siddharth Vadapalli says:

====================
Add PPS support to am65-cpts driver

The CPTS hardware doesn't support PPS signal generation. Using the GenFx
(periodic signal generator) function, it is possible to model a PPS signal
followed by routing it via the time sync router to the CPTS_HWy_TS_PUSH
(hardware time stamp) input, in order to generate timestamps at 1 second
intervals.

This series adds driver support for enabling PPS signal generation.
Additionally, the documentation for the am65-cpts driver is updated with
the bindings for the "ti,pps" property, which tells the cpts driver which
[CPTS_HWy_TS_PUSH, GenFx] pair to use.

Changes from v1:
1. Drop device-tree patches.
2. Address Roger's comments on the:
"net: ethernet: ti: am65-cpts: add pps support" patch.
3. Collect Reviewed-by tag from Rob Herring.

v1:
https://lore.kernel.org/r/20230111114429.1297557-1-s-vadapalli@ti.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: am65-cpts: adjust pps following ptp changes

When the CPTS clock is synced/adjusted by running linuxptp (ptp4l), it causes
PPS jitter, because the GenF running the PPS is not adjusted.

The same PPM adjustment has to be applied to GenF as to the PHC clock to
correct the PPS length and keep them in sync.

Testing:
Master:
  ptp4l -P -2 -H -i eth0 -l 6 -m -q -p /dev/ptp1 -f ptp.cfg &
  testptp -d /dev/ptp1 -P 1
  ppstest /dev/pps0

Slave:
  linuxptp/ptp4l -P -2 -H -i eth0 -l 6 -m -q -p /dev/ptp1 -f ptp1.cfg -s &
    <port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED;>
  testptp -d /dev/ptp1 -P 1
  ppstest /dev/pps0

Master log:
source 0 - assert 620.000000689, sequence: 530
source 0 - assert 621.000000689, sequence: 531
source 0 - assert 622.000000689, sequence: 532
source 0 - assert 623.000000689, sequence: 533
source 0 - assert 624.000000689, sequence: 534
source 0 - assert 625.000000689, sequence: 535
source 0 - assert 626.000000689, sequence: 536
source 0 - assert 627.000000689, sequence: 537
source 0 - assert 628.000000689, sequence: 538
source 0 - assert 629.000000689, sequence: 539
source 0 - assert 630.000000689, sequence: 540
source 0 - assert 631.000000689, sequence: 541
source 0 - assert 632.000000689, sequence: 542
source 0 - assert 633.000000689, sequence: 543
source 0 - assert 634.000000689, sequence: 544
source 0 - assert 635.000000689, sequence: 545

Slave log:
source 0 - assert 620.000000706, sequence: 252
source 0 - assert 621.000000709, sequence: 253
source 0 - assert 622.000000707, sequence: 254
source 0 - assert 623.000000707, sequence: 255
source 0 - assert 624.000000706, sequence: 256
source 0 - assert 625.000000705, sequence: 257
source 0 - assert 626.000000709, sequence: 258
source 0 - assert 627.000000709, sequence: 259
source 0 - assert 628.000000707, sequence: 260
source 0 - assert 629.000000706, sequence: 261
source 0 - assert 630.000000710, sequence: 262
source 0 - assert 631.000000708, sequence: 263
source 0 - assert 632.000000705, sequence: 264
source 0 - assert 633.000000710, sequence: 265
source 0 - assert 634.000000708, sequence: 266
source 0 - assert 635.000000707, sequence: 267

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: am65-cpts: add pps support

CPTS doesn't have HW support for PPS ("pulse per second") signal
generation, but it can be modeled by using Time Sync Router and routing
GenFx (periodic signal generator) output to CPTS_HWy_TS_PUSH (hardware time
stamp) input, and configuring GenFx to generate 1sec pulses.

    +------------------------+
    |          CPTS          |
    |                        |
+--->CPTS_HW4_PUSH      GENFx+---+
|   |                        |   |
|   +------------------------+   |
|                                |
+--------------------------------+

Add corresponding support to am65-cpts driver. The DT property "ti,pps"
has to be used to enable PPS support and configure pair
[CPTS_HWy_TS_PUSH, GenFx].

Once enabled, PPS can be tested using ppstest tool:
# ./ppstest /dev/pps0

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-binding: net: ti: am65x-cpts: add 'ti,pps' property

Add the ti,pps property used to indicate the pair of HWx_TS_PUSH input and
the TS_GENFy output.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>