Ong Boon Leong [Thu, 18 Mar 2021 17:22:04 +0000 (01:22 +0800)]
net: stmmac: add RX frame steering based on VLAN priority in tc flower
We extend tc flower to support configuration of VLAN priority-based RX
frame steering hardware offloading.
To map VLAN <PCP> to Traffic Class <TC>:
$ tc filter add dev <IFNAME> parent ffff: protocol 802.1Q flower \
vlan_prio <PCP> hw_tc <TC>
Note: <TC> < N whereby "tc qdisc ... num_tc N ..."
To delete all tc flower configurations:
$ tc qdisc delete dev <IFNAME> ingress
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ong Boon Leong [Thu, 18 Mar 2021 17:22:03 +0000 (01:22 +0800)]
net: stmmac: restructure tc implementation for RX VLAN Priority steering
The current tc_add_flow() and tc_del_flow() use hardware L3 & L4 filters
as offloading. The number of L3/L4 filters is read from L3L4FNUM field
from MAC_HW_Feature1 register and is used to alloc priv->tc_entries[].
For RX frame steering based on VLAN priority offloading, we use
MAC_RXQ_CTRL2 & MAC_RXQ_CTRL3 registers and all VLAN priority level
can be configured independent from L3 & L4 filters.
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 18 Mar 2021 18:37:22 +0000 (11:37 -0700)]
Merge branch 'octeon-tc-offloads'
Naveen Mamindlapalli says:
====================
Add tc hardware offloads
This patch series adds support for tc hardware offloads.
Patch #1 adds support for offloading flows that matches IP tos and IP
protocol which will be used by tc hw offload support. Also
added ethtool n-tuple filter to code to offload the flows
matching the above fields.
Patch #2 adds tc flower hardware offload support on ingress traffic.
Patch #3 adds TC flower offload stats.
Patch #4 adds tc TC_MATCHALL egress ratelimiting offload.
* tc flower hardware offload in PF driver
The driver parses the flow match fields and actions received from the tc
subsystem and adds/delete MCAM rules for the same. Each flow contains set
of match and action fields. If the action or fields are not supported,
the rule cannot be offloaded to hardware. The tc uses same set of MCAM
rules allocated for ethtool n-tuple filters. So, at a time only one entity
can offload the flows to hardware, they're made mutually exclusive in the
driver.
Following match and actions are supported.
Match: Eth dst_mac, EtherType, 802.1Q {vlan_id,vlan_prio}, vlan EtherType,
IP proto {tcp,udp,sctp,icmp,icmp6}, IPv4 tos, IPv4{dst_ip,src_ip},
L4 proto {dst_port|src_port number}.
Actions: drop, accept, vlan pop, redirect to another port on the device.
The Hardware stats are also supported. Currently only packet counter stats
are updated.
* tc egress rate limiting support
Added TC-MATCHALL classifier offload with police action applied for all
egress traffic on the specified interface.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Thu, 18 Mar 2021 10:02:15 +0000 (15:32 +0530)]
octeontx2-pf: TC_MATCHALL egress ratelimiting offload
Add TC_MATCHALL egress ratelimiting offload support with POLICE
action for entire traffic going out of the interface.
Eg: To ratelimit egress traffic to 100Mbps
$ ethtool -K eth0 hw-tc-offload on
$ tc qdisc add dev eth0 clsact
$ tc filter add dev eth0 egress matchall skip_sw \
action police rate 100Mbit burst 16Kbit
HW supports a max burst size of ~128KB.
Only one ratelimiting filter can be installed at a time.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Naveen Mamindlapalli [Thu, 18 Mar 2021 10:02:14 +0000 (15:32 +0530)]
octeontx2-pf: add tc flower stats handler for hw offloads
Add support to get the stats for tc flower flows that are
offloaded to hardware. To support this feature, added a
new AF mbox handler which returns the MCAM entry stats
for a flow that has hardware stat counter enabled.
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Naveen Mamindlapalli [Thu, 18 Mar 2021 10:02:13 +0000 (15:32 +0530)]
octeontx2-pf: Add tc flower hardware offload on ingress traffic
This patch adds support for tc flower hardware offload on ingress
traffic. Since the tc-flower filter rules use the same set of MCAM
rules as the n-tuple filters, the n-tuple filters and tc flower
rules are mutually exclusive. When one of the feature is enabled
using ethtool, the other feature is disabled in the driver. By default
the driver enables n-tuple filters during initialization.
The following flow keys are supported.
-> Ethernet: dst_mac
-> L2 proto: all protocols
-> VLAN (802.1q): vlan_id/vlan_prio
-> IPv4: dst_ip/src_ip/ip_proto{tcp|udp|sctp|icmp}/ip_tos
-> IPv6: ip_proto{icmpv6}
-> L4(tcp/udp/sctp): dst_port/src_port
The following flow actions are supported.
-> drop
-> accept
-> redirect
-> vlan pop
The flow action supports multiple actions when vlan pop is specified
as the first action. The redirect action supports redirecting to the
PF/VF of same PCI device. Redirecting to other PCI NIX devices is not
supported.
Example #1: Add a tc filter rule to drop UDP traffic with dest port 80
# ethtool -K eth0 hw-tc-offload on
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: flower ip_proto \
udp dst_port 80 action drop
Example #2: Add a tc filter rule to redirect ingress traffic on eth0
with vlan id 3 to eth6 (ex: eth0 vf0) after stripping the vlan hdr.
# ethtool -K eth0 hw-tc-offload on
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 parent ffff: protocol 802.1Q flower \
vlan_id 3 vlan_ethtype ipv4 action vlan pop action mirred \
ingress redirect dev eth6
Example #3: List the ingress filter rules
# tc -s filter show dev eth4 ingress
Example #4: Delete tc flower filter rule with handle 0x1
# tc filter del dev eth0 ingress protocol ip pref 49152 \
handle 1 flower
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Naveen Mamindlapalli [Thu, 18 Mar 2021 10:02:12 +0000 (15:32 +0530)]
octeontx2-pf: Add ip tos and ip proto icmp/icmpv6 flow offload support
Add support for programming the HW MCAM match key with IP tos, IP(v6)
proto icmp/icmpv6, allowing flow offload rules to be installed using
those fields. The NPC HW extracts layer type, which will be used as a
matching criteria for different IP protocols.
The ethtool n-tuple filter logic has been updated to parse the IP tos
and l4proto for HW offloading. l4proto tcp/udp/sctp/ah/esp/icmp are
supported. See example usage below.
Ex: Redirect l4proto icmp to vf 0 queue 0
ethtool -U eth0 flow-type ip4 l4proto 1 action vf 0 queue 0
Ex: Redirect flow with ip tos 8 to vf 0 queue 0
ethtool -U eth0 flow-type ip4 tos 8 vf 0 queue 0
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Tretter [Wed, 17 Mar 2021 16:16:09 +0000 (17:16 +0100)]
net: macb: simplify clk_init with dev_err_probe
On some platforms, e.g., the ZynqMP, devm_clk_get can return
-EPROBE_DEFER if the clock controller, which is implemented in firmware,
has not been probed yet.
As clk_init is only called during probe, use dev_err_probe to simplify
the error message and hide it for -EPROBE_DEFER.
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Mar 2021 21:44:19 +0000 (14:44 -0700)]
Merge branch 'mv88e6393x'
Marek Behún says:
====================
Add support for mv88e6393x family of Marvell
after 2 months I finally had time to send v17 of Amethyst patches.
This series is tested on Marvell CN9130-CRB.
Changes since v16:
- dropped patches adding 5gbase-r, since they are already merged
- rebased onto net-next/master
- driver API renamed set_egress_flood() method into 2 methods for
ucast/mcast floods, so this is fixed
Changes from v15:
- put 10000baseKR_Full back into phylink_validate method for Amethyst,
it seems I misunderstood the meaning behind things and removed it
from v15
- removed erratum 3.7, since the procedure is done anyway in
mv88e6390_serdes_pcs_config
- renumbered errata 3.6 and 3.8 to 4.6 and 4.8, according to newer
version of the errata document
- refactored errata code a little and removed duplicate macro
definitions (for example MV88E6390_SGMII_CONTROL is already called
MV88E6390_SGMII_BMCR)
Changes from v14:
- added my Signed-off-by tags to Pavana's patches, since I am sending
them (as suggested by Andrew)
- added documentation to second patch adding 5gbase-r mode (as requested
by Russell)
- added Reviewed-by tags
- applied Vladimir's suggestions:
- reduced indentation level in mv88e6xxx_set_egress_port and
mv88e6393x_serdes_port_config
- removed 10000baseKR_Full from mv88e6393x_phylink_validate
- removed PHY_INTERFACE_MODE_10GKR from mv88e6xxx_port_set_cmode
Changes from v13:
- added patch that wraps .set_egress_port into mv88e6xxx_set_egress_port,
so that we do not have to set chip->*gress_dest_port members in every
implementation of this method
- for the patch that adds Amethyst support:
- added more information into commit message
- added these methods for mv88e6393x_ops:
.port_sync_link
.port_setup_message_port
.port_max_speed_mode (new implementation needed)
.atu_get_hash
.atu_set_hash
.serdes_pcs_config
.serdes_pcs_an_restart
.serdes_pcs_link_up
- this device can set upstream port per port, so implement
.port_set_upstream_port
instead of
.set_cpu_port
- removed USXGMII cmode (not yet supported, working on it)
- added debug messages into mv88e6393x_port_set_speed_duplex
- added Amethyst errata 4.5 (EEE should be disabled on SERDES ports)
- fixed 5gbase-r serdes configuration and interrupt handling
- refactored mv88e6393x_serdes_setup_errata
- refactored mv88e6393x_port_policy_write
- added patch implementing .port_set_policy for Amethyst
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Wed, 17 Mar 2021 13:46:43 +0000 (14:46 +0100)]
net: dsa: mv88e6xxx: implement .port_set_policy for Amethyst
The 16-bit Port Policy CTL register from older chips is on 6393x changed
to Port Policy MGMT CTL, which can access more data, but indirectly and
via 8-bit registers.
The original 16-bit value is divided into first two 8-bit register in
the Port Policy MGMT CTL.
We can therefore use the previous code to compute the mask and shift,
and then
- if 0 <= shift < 8, we access register 0 in Port Policy MGMT CTL
- if 8 <= shift < 16, we access register 1 in Port Policy MGMT CTL
There are in fact other possible policy settings for Amethyst which
could be added here, but this can be done in the future.
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Pavana Sharma <pavana.sharma@digi.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavana Sharma [Wed, 17 Mar 2021 13:46:42 +0000 (14:46 +0100)]
net: dsa: mv88e6xxx: add support for mv88e6393x family
The Marvell 88E6393X device is a single-chip integration of a 11-port
Ethernet switch with eight integrated Gigabit Ethernet (GbE)
transceivers and three 10-Gigabit interfaces.
This patch adds functionalities specific to mv88e6393x family (88E6393X,
88E6193X and 88E6191X).
The main differences between previous devices and this one are:
- port 0 can be a SERDES port
- all SERDESes are one-lane, eg. no XAUI nor RXAUI
- on the other hand the SERDESes can do USXGMII, 10GBASER and 5GBASER
(on 6191X only one SERDES is capable of more than 1g; USXGMII is not
yet supported with this change)
- Port Policy CTL register is changed to Port Policy MGMT CTL register,
via which several more registers can be accessed indirectly
- egress monitor port is configured differently
- ingress monitor/CPU/mirror ports are configured differently and can be
configured per port (ie. each port can have different ingress monitor
port, for example)
- port speed AltBit works differently than previously
- PHY registers can be also accessed via MDIO address 0x18 and 0x19
(on previous devices they could be accessed only via Global 2 offsets
0x18 and 0x19, which means two indirections; this feature is not yet
leveraged with thiis commit)
Co-developed-by: Ashkan Boldaji <ashkan.boldaji@digi.com>
Signed-off-by: Ashkan Boldaji <ashkan.boldaji@digi.com>
Signed-off-by: Pavana Sharma <pavana.sharma@digi.com>
Co-developed-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Wed, 17 Mar 2021 13:46:41 +0000 (14:46 +0100)]
net: dsa: mv88e6xxx: wrap .set_egress_port method
There are two implementations of the .set_egress_port method, and both
of them, if successful, set chip->*gress_dest_port variable.
To avoid code repetition, wrap this method into
mv88e6xxx_set_egress_port.
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Pavana Sharma <pavana.sharma@digi.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavana Sharma [Wed, 17 Mar 2021 13:46:40 +0000 (14:46 +0100)]
net: dsa: mv88e6xxx: change serdes lane parameter type from u8 type to int
Returning 0 is no more an error case with MV88E6393 family
which has serdes lane numbers 0, 9 or 10.
So with this change .serdes_get_lane will return lane number
or -errno (-ENODEV or -EOPNOTSUPP).
Signed-off-by: Pavana Sharma <pavana.sharma@digi.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingsenjie [Wed, 17 Mar 2021 12:30:30 +0000 (20:30 +0800)]
ethernet/microchip:remove unneeded variable: "ret"
remove unneeded variable: "ret".
Signed-off-by: dingsenjie <dingsenjie@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingsenjie [Wed, 17 Mar 2021 12:29:33 +0000 (20:29 +0800)]
ethernet/broadcom:remove unneeded variable: "ret"
remove unneeded variable: "ret".
Signed-off-by: dingsenjie <dingsenjie@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ong Boon Leong [Wed, 17 Mar 2021 01:01:23 +0000 (09:01 +0800)]
net: stmmac: add per-queue TX & RX coalesce ethtool support
Extending the driver to support per-queue RX and TX coalesce settings in
order to support below commands:
To show per-queue coalesce setting:-
$ ethtool --per-queue <DEVNAME> queue_mask <MASK> --show-coalesce
To set per-queue coalesce setting:-
$ ethtool --per-queue <DEVNAME> queue_mask <MASK> --coalesce \
[rx-usecs N] [rx-frames M] [tx-usecs P] [tx-frames Q]
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Mar 2021 19:34:35 +0000 (12:34 -0700)]
Merge branch 'dsa-doc-fixups'
Vladimir Oltean says:
====================
DSA/switchdev documentation fixups
These are some small fixups after the recently merged documentation
update.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Wed, 17 Mar 2021 17:44:58 +0000 (19:44 +0200)]
Documentation: networking: dsa: mention that the master is brought up automatically
Since commit
9d5ef190e561 ("net: dsa: automatically bring up DSA master
when opening user port"), DSA manages the administrative status of the
host port automatically. Update the configuration steps to reflect this.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Wed, 17 Mar 2021 17:44:57 +0000 (19:44 +0200)]
Documentation: networking: dsa: demote subsections to simple emphasized words
"make htmldocs" complains:
configuration.rst:165: WARNING: duplicate label networking/dsa/configuration:single port, other instance in (...)
configuration.rst:212: WARNING: duplicate label networking/dsa/configuration:bridge, other instance in (...)
configuration.rst:252: WARNING: duplicate label networking/dsa/configuration:gateway, other instance in (...)
And for good reason, because the "single port", "bridge" and "gateway"
use cases are replicated twice, once for normal taggers and twice for
DSA_TAG_PROTO_NONE. So when trying to reference these sections via a
hyperlink such as:
https://www.kernel.org/doc/html/latest/networking/dsa/configuration.html#single-port
it will always reference the first occurrence, and never the second one.
This change makes the "single port", "bridge" and "gateway"
configuration examples consistent with the formatting used in the
"Configuration showcases" subsection.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Wed, 17 Mar 2021 17:44:56 +0000 (19:44 +0200)]
Documentation: networking: dsa: add missing new line in devlink section
"make htmldocs" produces these warnings:
Documentation/networking/dsa/dsa.rst:468: WARNING: Unexpected indentation.
Documentation/networking/dsa/dsa.rst:477: WARNING: Block quote ends without a blank line; unexpected unindent.
Fixes: 8411abbcad8e ("Documentation: networking: dsa: mention integration with devlink")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Wed, 17 Mar 2021 17:44:55 +0000 (19:44 +0200)]
Documentation: networking: switchdev: add missing "and" word
Even though this is clear from the context, it is nice to actually be
grammatically correct.
Fixes: 0f22ad45f47c ("Documentation: networking: switchdev: clarify device driver behavior")
Reported-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Wed, 17 Mar 2021 17:44:54 +0000 (19:44 +0200)]
Documentation: networking: switchdev: separate bulleted items with new line
It looks like "make htmldocs" produces this warning:
Documentation/networking/switchdev.rst:482: WARNING: Unexpected indentation.
Fixes: 0f22ad45f47c ("Documentation: networking: switchdev: clarify device driver behavior")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Mar 2021 19:29:40 +0000 (12:29 -0700)]
Merge branch 'octeontx2-refactor'
Naveen Mamindlapalli says:
====================
refactor code related to npc install flow
This patchset refactors and cleans up the code associated with the
npc install flow API, specifically to eliminate different code paths
while installing MCAM rules by AF and PF. This makes the code easier
to understand and maintain. Also added support for multi channel NIX
promisc entry.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Naveen Mamindlapalli [Wed, 17 Mar 2021 13:35:38 +0000 (19:05 +0530)]
octeontx2-af: Modify the return code for unsupported flow keys
The mbox handler npc_install_flow returns ENOTSUPP for unsupported
flow keys. This patch modifies the return value to AF driver defined
error code for debugging purpose.
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subbaraya Sundeep [Wed, 17 Mar 2021 13:35:37 +0000 (19:05 +0530)]
octeontx2-af: Avoid duplicate unicast rule in mcam_rules list
A mcam rule described by mcam_rule struct has all the info
such as the hardware MCAM entry number, match criteria and
corresponding action etc. All mcam rules are stored in a
linked list mcam->rules. When adding/updating a rule to the
mcam->rules it is checked if a rule already exists for the
mcam entry. If the rule already exists, the same rule is
updated instead of creating new rule. This way only one
mcam_rule exists for the only one default unicast entry
installed by AF. But a PF/VF can get different NIXLF
(or default unicast entry number) after a attach-detach-attach
sequence. When that happens mcam_rules list end up with two
default unicast rules. Fix the problem by deleting the default
unicast rule list node always when disabling mcam rules.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Naveen Mamindlapalli [Wed, 17 Mar 2021 13:35:36 +0000 (19:05 +0530)]
octeontx2-af: Use npc_install_flow API for promisc and broadcast entries
Use npc_install_flow mailbox API for installing the default promisc
and broadcast match entries. Earlier these entries were installed
using low level npc_config_mcam_entry API, which does not store these
rules and is not available when the rules are dumped using debugfs.
Added chan_mask field to npc_install_flow_req to calculate channel
mask when channel count is greater than 1 and configure the channel
mask in entry kw_mask.
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nalla, Pradeep [Wed, 17 Mar 2021 13:35:35 +0000 (19:05 +0530)]
octeontx2-af: Add support for multi channel in NIX promisc entry
This patch adds support for multi channel NIX promisc entry. Packets sent
on all those channels by the host should be received by the interface to
which those channels belong. Channel count, if greater than 1, should be
power of 2 as only one promisc entry is available for the interface. Key
mask is modified such that incoming packets from channel base to channel
count are directed to the same pci function.
Signed-off-by: Nalla, Pradeep <pnalla@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Naveen Mamindlapalli [Wed, 17 Mar 2021 13:35:34 +0000 (19:05 +0530)]
octeontx2-af: refactor function npc_install_flow for default entry
This patch refactors npc_install_flow function to install AF
installed default MCAM entries similar to other MCAM entries
installed by PF/VF. As a result the code would be more readable
and easy to maintain. Modified npc_verify_entry and npc_verify_channel
to properly check MCAM rules installed by AF.
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Mar 2021 19:26:28 +0000 (12:26 -0700)]
Merge branch 'mlxsw-vlan-=vxlan'
mlxsw: Allow 802.1d and .1ad VxLAN bridges to coexist on Spectrum>=2
Ido Schimmel says:
====================
This patchset allows user space to simultaneously configure both 802.1d
and 802.1ad VxLAN bridges on Spectrum-2 and later ASICs. 802.1ad VxLAN
bridges are still forbidden on Spectrum-1.
The reason for the current limitation is that up until now the EtherType
that was pushed to decapsulated VxLAN packets was a property of the
tunnel port, of which there is only one. This meant that a 802.1ad VxLAN
bridge could not be configured if the tunnel port was already configured
to push a 802.1q tag.
This patchset improves the situation by making two changes. First,
decapsulated packets are marked as having their EtherType decided by the
egress port. Second, local ports member in the bridge (e.g., swp1) are
configured to set the correct egress EtherType.
Patchset overview:
Patch #1 adds a register required for the first change
Patches #2-#3 add the register required for the second change and a
corresponding API
Patch #4 prepares the driver for the split in behavior between
Spectrum-1 and later ASICs
Patch #5 performs the two above mentioned changes to allow the driver to
support simultaneous 802.1ad and 802.1d VxLAN bridges on Spectrum-2 and
later ASICs
Patch #6 adds a selftest
Patch #7 removes a selftest that verified the limitation that was lifted
by this patchset
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Wed, 17 Mar 2021 10:35:29 +0000 (12:35 +0200)]
selftests: mlxsw: spectrum-2: Remove q_in_vni_veto test
q_in_vni_veto.sh is not needed anymore because VxLAN with an 802.1ad
bridge and VxLAN with an 802.1d bridge can coexist.
Remove the test.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Wed, 17 Mar 2021 10:35:28 +0000 (12:35 +0200)]
selftests: forwarding: Add test for dual VxLAN bridge
Configure VxLAN with an 802.1ad bridge and VxLAN with an 802.1d bridge
at the same time in same switch, verify that traffic passed as expected.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Wed, 17 Mar 2021 10:35:27 +0000 (12:35 +0200)]
mlxsw: Allow 802.1d and .1ad VxLAN bridges to coexist on Spectrum>=2
Currently only one EtherType can be configured for pushing in tunnels
because EtherType is configured using SPVID.et_vlan for tunnel port.
This behavior is forbidden by comparing mlxsw_sp_nve_config struct for
each new tunnel, the struct contains 'ethertype' field which means that
only one EtherType is legal at any given time. Remove 'ethertype' field to
allow creating VxLAN devices with different bridges.
To allow using several types of VxLAN bridges at the same time, the
EtherType should be determined at the egress port. This behavior is
achieved by setting SPVID to decide which EtherType to push at egress and
for each local_port which is member in 802.1ad bridge, set SPEVET.et_vlan
to ether_type1 (i.e., 0x88A8).
Use switchdev_ops->init() to set different mlxsw_sp_bridge_ops for
different ASICs in order to be able to split the behavior when port joins /
leaves an 802.1ad bridge in different ASICs.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Wed, 17 Mar 2021 10:35:26 +0000 (12:35 +0200)]
mlxsw: Add struct mlxsw_sp_switchdev_ops per ASIC
A subsequent patch will need to implement different set of operations
when a port joins / leaves an 802.1ad bridge, based on the ASIC type.
Prepare for this change by allowing to initialize the bridge module
based on the ASIC type via 'struct mlxsw_sp_switchdev_ops'.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Wed, 17 Mar 2021 10:35:25 +0000 (12:35 +0200)]
mlxsw: spectrum: Add mlxsw_sp_port_egress_ethtype_set()
A subsequent patch will cause decapsulated packets to have their EtherType
determined by the egress port. Add mlxsw_sp_port_egress_ethtype_set() which
will be called when a port joins an 802.1ad bridge, so that it will set an
802.1ad EtherType on decapsulated packets transmitted through it, instead
of the default 802.1q EtherType.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Wed, 17 Mar 2021 10:35:24 +0000 (12:35 +0200)]
mlxsw: reg: Add Switch Port Egress VLAN EtherType Register
SPEVET configures which EtherType to push at egress for packets incoming
through a local port for which 'SPVID.egr_et_set' is set.
The next patches will use SPEVET to configure EtherType 0x88A8 and
0x8100 for local ports member in 802.1ad and 802.1q bridges,
respectively. This allows using dual VxLAN bridges (802.1d and 802.1ad at
the same time).
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Wed, 17 Mar 2021 10:35:23 +0000 (12:35 +0200)]
mlxsw: reg: Add egr_et_set field to SPVID
SPVID.egr_et_set=1 means that when VLAN is pushed at ingress (for untagged
packets or for QinQ push mode) then the EtherType is decided at the egress
port.
The next patches will use this field for VxLAN devices (tunnel port) in
order to allow using dual VxLAN bridges (802.1d and 802.1ad at the same
time).
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Mar 2021 19:24:36 +0000 (12:24 -0700)]
Merge branch 'b53-legacy-tags'
Álvaro Fernández Rojas says:
====================
net: dsa: b53: support legacy tags
Legacy Broadcom tags are needed for older switches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Álvaro Fernández Rojas [Wed, 17 Mar 2021 10:29:27 +0000 (11:29 +0100)]
net: dsa: b53: support legacy tags
These tags are used on BCM5325, BCM5365 and BCM63xx switches.
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Álvaro Fernández Rojas [Wed, 17 Mar 2021 10:29:26 +0000 (11:29 +0100)]
net: dsa: tag_brcm: add support for legacy tags
Add support for legacy Broadcom tags, which are similar to DSA_TAG_PROTO_BRCM.
These tags are used on BCM5325, BCM5365 and BCM63xx switches.
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bhaskar Chowdhury [Wed, 17 Mar 2021 09:00:59 +0000 (14:30 +0530)]
net: ppp: Mundane typo fixes in the file pppoe.c
s/procesing/processing/
s/comparations/comparisons/
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Álvaro Fernández Rojas [Wed, 17 Mar 2021 08:42:01 +0000 (09:42 +0100)]
net: dsa: b53: relax is63xx() condition
BCM63xx switches are present on bcm63xx and bmips devices.
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Voon Weifeng [Wed, 17 Mar 2021 04:09:04 +0000 (12:09 +0800)]
net: stmmac: add timestamp correction to rid CDC sync error
According to Synopsis DesignWare EQoS Databook, the Clock Domain Cross
synchronization error is introduced tue to the clock(GMII Tx/Rx clock)
being different at the capture as compared to the PTP
clock(clk_ptp_ref_i) that is used to generate the time.
The CDC synchronization error is almost equal to 2 times the clock
period of the PTP clock(clk_ptp_ref_i).
On a Intel Tigerlake platform (with Marvell
88E2110 external PHY):
Before applying this patch (with CDC synchronization error):
ptp4l[64.044]: rms 8 max 13 freq +30877 +/- 11 delay 216 +/- 0
ptp4l[65.047]: rms 13 max 20 freq +30869 +/- 17 delay 213 +/- 0
ptp4l[66.050]: rms 12 max 20 freq +30857 +/- 11 delay 213 +/- 0
ptp4l[67.052]: rms 11 max 22 freq +30849 +/- 10 delay 215 +/- 0
ptp4l[68.055]: rms 10 max 16 freq +30853 +/- 13 delay 215 +/- 0
ptp4l[69.057]: rms 7 max 13 freq +30848 +/- 9 delay 216 +/- 0
ptp4l[70.060]: rms 8 max 13 freq +30846 +/- 10 delay 216 +/- 0
ptp4l[71.063]: rms 9 max 15 freq +30836 +/- 8 delay 218 +/- 0
After applying this patch (CDC syncrhonization error is taken care of):
ptp4l[61.516]: rms 773 max 824 freq +31526 +/- 158 delay 200 +/- 0
ptp4l[62.519]: rms 427 max 596 freq +31668 +/- 39 delay 198 +/- 0
ptp4l[63.522]: rms 113 max 206 freq +31482 +/- 57 delay 198 +/- 0
ptp4l[64.525]: rms 40 max 56 freq +31316 +/- 29 delay 200 +/- 0
ptp4l[65.528]: rms 47 max 56 freq +31255 +/- 17 delay 200 +/- 0
ptp4l[66.531]: rms 26 max 36 freq +31246 +/- 9 delay 200 +/- 0
ptp4l[67.534]: rms 12 max 18 freq +31254 +/- 12 delay 202 +/- 0
ptp4l[68.537]: rms 7 max 12 freq +31263 +/- 10 delay 202 +/- 0
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Mar 2021 18:51:05 +0000 (11:51 -0700)]
Merge branch 'tipc-cleanups-and-simplifications'
Jon Maloy says:
====================
tipc: cleanups and simplifications
We do a number of cleanups and simplifications, especially regarding
call signatures in the binding table. This makes the code easier to
understand and serves as preparation for upcoming functional additions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:23 +0000 (22:06 -0400)]
tipc: remove some unnecessary warnings
We move some warning printouts to more strategic locations to avoid
duplicates and yield more detailed information about the reported
problem.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:22 +0000 (22:06 -0400)]
tipc: add host-endian copy of user subscription to struct tipc_subscription
We reduce and localize the usage of the tipc_sub_xx() macros by adding a
corresponding member, with fields set in host-endian format, to struct
tipc_subscription.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:21 +0000 (22:06 -0400)]
tipc: simplify api between binding table and topology server
The function tipc_report_overlap() is called from the binding table
with numerous parameters taken from an instance of struct publication.
A closer look reveals that it always is safe to send along a pointer
to the instance itself, and hence reduce the call signature. We do
that in this commit.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:20 +0000 (22:06 -0400)]
tipc: simplify signature of tipc_find_service()
We reduce the signature of tipc_find_service() and
tipc_create_service(). The reason for doing this might not
be obvious, but we plan to let struct tipc_uaddr contain
information that is relevant for these functions in a later
commit.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:19 +0000 (22:06 -0400)]
tipc: simplify signature of tipc_service_find_range()
We simplify the signatures of the functions tipc_service_create_range()
and tipc_service_find_range().
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:18 +0000 (22:06 -0400)]
tipc: simplify signature of tipc_nametbl_lookup_group()
We reduce the signature of tipc_nametbl_lookup_group() by using a
struct tipc_uaddr pointer. This entails a couple of minor changes in the
functions tipc_send_group_mcast/anycast/unicast/bcast() in socket.c
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:17 +0000 (22:06 -0400)]
tipc: simplify signature of tipc_nametbl_lookup_mcast_nodes()
We follow up the preceding commits by reducing the signature of
the function tipc_nametbl_lookup_mcast_nodes().
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:16 +0000 (22:06 -0400)]
tipc: simplify signature of tipc_namtbl_lookup_mcast_sockets()
We reduce the signature of this function according to the same
principle as the preceding commits.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:15 +0000 (22:06 -0400)]
tipc: refactor tipc_sendmsg() and tipc_lookup_anycast()
We simplify the signature if function tipc_nametbl_lookup_anycast(),
using address structures instead of discrete integers.
This also makes it possible to make some improvements to the functions
__tipc_sendmsg() in socket.c and tipc_msg_lookup_dest() in msg.c.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:14 +0000 (22:06 -0400)]
tipc: rename binding table lookup functions
The binding table provides four different lookup functions, which
purpose is not obvious neither by their names nor by the (lack of)
descriptions.
We now give these functions names that better match their purposes,
and improve the comments that describe what they are doing.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:13 +0000 (22:06 -0400)]
tipc: simplify signature of tipc_nametbl_withdraw() functions
Following the principles of the preceding commits, we reduce
the number of parameters passed along in tipc_sk_withdraw(),
tipc_nametbl_withdraw() and associated functions.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:12 +0000 (22:06 -0400)]
tipc: simplify call signatures for publication creation
We simplify the call signatures for tipc_nametbl_insert_publ() and
tipc_publ_create() so that fewer parameters are passed around.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:11 +0000 (22:06 -0400)]
tipc: simplify signature of tipc_namtbl_publish()
Using the new address structure tipc_uaddr, we simplify the signature
of function tipc_sk_publish() and tipc_namtbl_publish() so that fewer
parameters need to be passed around.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:10 +0000 (22:06 -0400)]
tipc: introduce new unified address type for internal use
We introduce a simplified version of struct sockaddr_tipc, using
anonymous unions and structures. Apart from being nicer to work with,
this struct will come in handy when we in a later commit add another
address type.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:09 +0000 (22:06 -0400)]
tipc: move creation of publication item one level up in call chain
We instantiate struct publication in tipc_nametbl_insert_publ()
instead of as currently in tipc_service_insert_publ(). This has the
advantage that we can pass a pointer to the publication struct to
the next call levels, instead of the numerous individual parameters
we pass on now. It also gives us a location to keep the contents of
the additional fields we will introduce in a later commit.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Mar 2021 02:06:08 +0000 (22:06 -0400)]
tipc: re-organize members of struct publication
In a future commit we will introduce more members to struct publication.
In order to keep this structure comprehensible we now group some of
its current fields into the sub-structures where they really belong,
- A struct tipc_service_range for the functional address the publication
is representing.
- A struct tipc_socket_addr for the socket bound to that service range.
We also rename the stack variable 'publ' to just 'p' in a few places.
This is just as easy to understand in the given context, and keeps the
number of wrapped code lines to a minimum.
There are no functional changes in this commit.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Mar 2021 18:42:31 +0000 (11:42 -0700)]
Merge branch 'ethtool-strings'
Alexander Duyck says:
====================
ethtool: Factor out common code related to writing ethtool strings
This patch set is meant to be a cleanup and refactoring of common code bits
from several drivers. Specificlly a number of drivers engage in a pattern
where they will use some variant on an sprintf or memcpy to write a string
into the ethtool string array and then they will increment their pointer by
ETH_GSTRING_LEN.
Instead of having each driver implement this independently I am refactoring
the code so that we have one central function, ethtool_sprintf that does
all this and takes a double pointer to access the data, a formatted string
to print, and the variable arguments that are associated with the string.
Changes from v1:
Fixed usage of char ** vs unsigned char ** in hisilicon drivers
Changes from RFC:
Renamed ethtool_gsprintf to ethtool_sprintf
Fixed reverse xmas tree issue in patch 2
====================
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 17 Mar 2021 00:31:55 +0000 (17:31 -0700)]
ionic: Update driver to use ethtool_sprintf
Update the ionic driver to make use of ethtool_sprintf. In addition add
separate functions for Tx/Rx stats strings in order to reduce the total
amount of indenting needed in the driver code.
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 17 Mar 2021 00:31:46 +0000 (17:31 -0700)]
bna: Update driver to use ethtool_sprintf
Update the bnad_get_strings to make use of ethtool_sprintf and avoid
unnecessary line wrapping. To do this we invert the logic for the string
set test and instead exit immediately if we are not working with the stats
strings. In addition the function is broken up into subfunctions for each
area so that we can simply call ethtool_sprintf once for each string in a
given subsection.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 17 Mar 2021 00:31:37 +0000 (17:31 -0700)]
vmxnet3: Update driver to use ethtool_sprintf
So this patch actually does 3 things.
First it removes a stray white space at the start of the variable
declaration in vmxnet3_get_strings.
Second it flips the logic for the string test so that we exit immediately
if we are not looking for the stats strings. Doing this we can avoid
unnecessary indentation and line wrapping.
Then finally it updates the code to use ethtool_sprintf rather than a
memcpy and pointer increment to write the ethtool strings.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 17 Mar 2021 00:31:29 +0000 (17:31 -0700)]
virtio_net: Update driver to use ethtool_sprintf
Update the code to replace instances of snprintf and a pointer update with
just calling ethtool_sprintf.
Also replace the char pointer with a u8 pointer to avoid having to recast
the pointer type.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 17 Mar 2021 00:31:20 +0000 (17:31 -0700)]
netvsc: Update driver to use ethtool_sprintf
Replace instances of sprintf or memcpy with a pointer update with
ethtool_sprintf.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 17 Mar 2021 00:31:11 +0000 (17:31 -0700)]
ena: Update driver to use ethtool_sprintf
Replace instances of snprintf or memcpy with a pointer update with
ethtool_sprintf.
Acked-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 17 Mar 2021 00:31:03 +0000 (17:31 -0700)]
hisilicon: Update drivers to use ethtool_sprintf
Update the hisilicon drivers to make use of ethtool_sprintf. The general
idea is to reduce code size and overhead by replacing the repeated pattern
of string printf statements and ETH_STRING_LEN counter increments.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 17 Mar 2021 00:30:53 +0000 (17:30 -0700)]
nfp: Replace nfp_pr_et with ethtool_sprintf
The nfp_pr_et function is nearly identical to ethtool_sprintf except for
the fact that it passes the pointer by value and as a return whereas
ethtool_sprintf passes it as a pointer.
Since they are so close just update nfp to make use of ethtool_sprintf
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 17 Mar 2021 00:30:44 +0000 (17:30 -0700)]
intel: Update drivers to use ethtool_sprintf
Update the Intel drivers to make use of ethtool_sprintf. The general idea
is to reduce code size and overhead by replacing the repeated pattern of
string printf statements and ETH_STRING_LEN counter increments.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 17 Mar 2021 00:30:36 +0000 (17:30 -0700)]
ethtool: Add common function for filling out strings
Add a function to handle the common pattern of printing a string into the
ethtool strings interface and incrementing the string pointer by the
ETH_GSTRING_LEN. Most of the drivers end up doing this and several have
implemented their own versions of this function so it would make sense to
consolidate on one implementation.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Mar 2021 18:22:39 +0000 (11:22 -0700)]
Merge tag 'mlx5-updates-2021-03-16' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-03-16
mlx5 uplink representor netdev persistence.
Before this patchset we used to have separate netdevs for Native NIC mode
and Switchdev mode (uplink representor netdev), meaning that if user
switches modes between Native to Switchdev and vice versa, the driver
would cleanup the current netdev representor and create a new one for the
new mode, such behavior created an administrative nightmare for users,
where users need to be aware of such loss of both data path and control
path configurations, e.g. netdev attributes and arp/route tables,
where the later is more painful.
A simple solution for this is not to replace the netdev in first place
and use a single netdev to serve the uplink/physical port whether it is
in switchdev mode or native mode.
We already have different HW profiles for each netdev mode, in this series
we just replace the HW profile on the fly and we keep the same netdev
attached.
Refactoring: Some refactoring has been made to overcome some technical
difficulties
1) The netdev is created with the maximum amount of tx/rx queues to serve
the two profiles.
2) Some ndos are not supported in some modes, so we added a mode check for
such cases, e.g legacy sriov ndos must be blocked in switchdev mode.
3) Some mlx5 netdev private attributes need to be moved out of profiles
and kept in a persistent place, where the netdev is created
e.g devlink port and other global HW resources
4) The netdev devlink port is now always registered with the switch id
Implementation: the last three patches implement the mechanism now as the
netdev can be shared.
5) Don't recreate the netdev on switchdev mode changes
6) Prevent changing switchdev mode when some netdev operations
are active, mostly when TC rules are being processed.
This is required since the netdev is kept registered while switchdev mode
can be changed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Roi Dayan [Wed, 16 Sep 2020 07:11:47 +0000 (10:11 +0300)]
net/mlx5: E-Switch, Protect changing mode while adding rules
We re-use the native NIC port net device instance for the Uplink
representor, a driver currently cannot unbind TC setup callback
actively, hence protect changing E-Switch mode while adding rules.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 16 Sep 2020 07:11:42 +0000 (10:11 +0300)]
net/mlx5: E-Switch, Change mode lock from mutex to rw semaphore
E-Switch mode change routine will take the write lock to prevent any
consumer to access the E-Switch resources while E-Switch is going
through a mode change.
In the next patch
E-Switch consumers (e.g vport representors) will take read_lock prior to
accessing E-Switch resources to prevent E-Switch mode changing in the
middle of the operation.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 16 Sep 2020 07:11:33 +0000 (10:11 +0300)]
net/mlx5e: Do not reload ethernet ports when changing eswitch mode
When switching modes between legacy and switchdev and back, do not
reload ethernet interfaces. just change the profile from nic profile
to uplink rep profile in switchdev mode.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Thu, 14 Jan 2021 14:05:56 +0000 (16:05 +0200)]
net/mlx5e: Unregister eth-reps devices first
When we clean all the interfaces, i.e. rescan or reload module,
we need to clean eth-reps devices first, before eth devices.
We will re-use the native NIC port net device instance for the Uplink
representor. Changing eswitch mode will skip destroying the eth device
so the net device won't be destroyed and only change the profile.
Creating uplink eth-rep will initialize the representor related resources.
In that sense when we destroy all devices we first need to destroy
eth-rep devices so uplink eth-rep will clean all representor related
resources and only then destroy the eth device which will destroy rest
of the resources and the net device.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 28 Oct 2020 09:21:26 +0000 (11:21 +0200)]
net/mlx5: Move devlink port from mlx5e priv to mlx5e resources
We re-use the native NIC port net device instance for the Uplink
representor, and the devlink port.
When changing profiles we reset the mlx5e priv but we should still
use the devlink port so move it to mlx5e resources.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Tue, 26 Jan 2021 09:51:04 +0000 (11:51 +0200)]
net/mlx5: Move mlx5e hw resources into a sub object
This is to separate between resources attributes and other
attributes we will want to use.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 21 Oct 2020 07:28:50 +0000 (10:28 +0300)]
net/mlx5e: Register nic devlink port with switch id
We will re-use the native NIC port net device instance for the Uplink
representor. Since the netdev will be kept registered while we engage
switchdev mode also the devlink will be kept registered.
Register the nic devlink port with switch id so it will be available
when changing profiles.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Tue, 20 Oct 2020 09:26:51 +0000 (12:26 +0300)]
net/mlx5e: Move devlink port register and unregister calls
We will re-use the native NIC port net device instance for the Uplink
representor. As such we also don't want to unregister/register the
devlink port as part of the profile.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 16 Sep 2020 07:11:26 +0000 (10:11 +0300)]
net/mlx5e: Verify dev is present in some ndos
We will re-use the native NIC port net device instance for the Uplink
representor. While changing profiles private resources are not
available but some ndos are not checking if the netdev is present.
So for those ndos check the netdev is present in the driver before
accessing the private resources.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 16 Sep 2020 07:11:07 +0000 (10:11 +0300)]
net/mlx5e: Use nic mode netdev ndos and ethtool ops for uplink representor
Remove dedicated uplink rep netdev ndos and ethtools ops.
We will re-use the native NIC port net device instance and ethtool ops for
the Uplink representor.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 16 Sep 2020 07:11:01 +0000 (10:11 +0300)]
net/mlx5e: Add offload stats ndos to nic netdev ops
We will re-use the native NIC port net device instance for the Uplink
representor, hence same ndos must be used.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 16 Sep 2020 07:10:56 +0000 (10:10 +0300)]
net/mlx5e: Distinguish nic and esw offload in tc setup block cb
We will re-use the native NIC port net device instance for the Uplink
representor, hence same ndos will be used.
Now we need to distinguish in the TC callback if the mode is legacy or
switchdev and set the proper flag.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 16 Sep 2020 07:10:48 +0000 (10:10 +0300)]
net/mlx5e: Allow legacy vf ndos only if in legacy mode
We will re-use the native NIC port net device instance for the Uplink
representor. Several VF ndo ops are not relevant in switchdev mode.
Disallow them when eswitch mode is not legacy as a preparation.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Saeed Mahameed [Mon, 23 Mar 2020 06:22:20 +0000 (23:22 -0700)]
net/mlx5e: Same max num channels for both nic and uplink profiles
In downstream patches NIC netdev can change profile dynamically from
NIC mode to uplink mode and vise-versa. It is required that both profiles
must advertise the same max amount of tx/rx queues.
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Roi Dayan [Wed, 16 Sep 2020 07:11:20 +0000 (10:11 +0300)]
net: Change dev parameter to const in netif_device_present()
Not all ndos check the present bit before calling the ndo and the driver
may want to check it. Sometimes the dev parameter passed as const so we
pass it to netif_device_present() as const.
Since netif_device_present() doesn't modify dev parameter anyway, declare
it as const.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Flavio Leitner [Tue, 16 Mar 2021 20:14:27 +0000 (17:14 -0300)]
openvswitch: Warn over-mtu packets only if iface is UP.
It is not unusual to have the bridge port down. Sometimes
it has the old MTU, which is fine since it's not being used.
However, the kernel spams the log with a warning message
when a packet is going to be sent over such port. Fix that
by warning only if the interface is UP.
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Mar 2021 22:52:15 +0000 (15:52 -0700)]
Revert "net: socket: use BIT() for MSG_*"
This reverts commit
0bb3262c0248d44aea3be31076f44beb82a7b120.
Breaks things on mips64/qemu
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Mar 2021 22:49:52 +0000 (15:49 -0700)]
Merge branch 'ocelot-mrp'
Horatiu Vultur says:
====================
net: ocelot: Extend MRP
This patch series extends the current support of MRP in Ocelot driver.
Currently the forwarding of the frames happened in SW because all frames
were trapped to CPU. With this patch the MRP frames will be forward in HW.
v1 -> v2:
- create a patch series instead of single patch
- rename ocelot_mrp_find_port to ocelot_mrp_find_partner_port
- rename PGID_MRP to PGID_BLACKHOLE
- use GFP_KERNEL instead of GFP_ATOMIC
- fix other whitespace issues
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Tue, 16 Mar 2021 20:10:19 +0000 (21:10 +0100)]
net: ocelot: Remove ocelot_xfh_get_cpuq
Now when extracting frames from CPU the cpuq is not used anymore so
remove it.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Tue, 16 Mar 2021 20:10:18 +0000 (21:10 +0100)]
net: ocelot: Extend MRP
This patch extends MRP support for Ocelot. It allows to have multiple
rings and when the node has the MRC role it forwards MRP Test frames in
HW. For MRM there is no change.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Tue, 16 Mar 2021 20:10:17 +0000 (21:10 +0100)]
net: ocelot: Add PGID_BLACKHOLE
Add a new PGID that is used not to forward frames anywhere. It is used
by MRP to make sure that MRP Test frames will not reach CPU port.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Mar 2021 22:34:15 +0000 (15:34 -0700)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
40GbE Intel Wired LAN Driver Updates 2021-03-16
This series contains updates to i40e, ixgbe, and ice drivers.
Magnus Karlsson says:
Optimize run_xdp_zc() for the XDP program verdict being XDP_REDIRECT
in the xsk zero-copy path. This path is only used when having AF_XDP
zero-copy on and in that case most packets will be directed to user
space. This provides around 100k extra packets in throughput on my
server when running l2fwd in xdpsock.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Mar 2021 22:32:23 +0000 (15:32 -0700)]
Merge branch 'mlxsw-Add-support-for-egress-and-policy-based-sampling'
Ido Schimmel says:
====================
mlxsw: Add support for egress and policy-based sampling
So far mlxsw only supported ingress sampling using matchall classifier.
This series adds support for egress sampling and policy-based sampling
using flower classifier on Spectrum-2 and newer ASICs. As such, it is
now possible to issue these commands:
# tc filter add dev swp1 egress pref 1 proto all matchall action sample rate 100 group 1
# tc filter add dev swp2 ingress pref 1 proto ip flower dst_ip 198.51.100.1 action sample rate 100 group 2
When performing egress sampling (using either matchall or flower) the
ASIC is able to report the end-to-end latency which is passed to the
psample module.
Series overview:
Patches #1-#3 are preparations without any functional changes
Patch #4 generalizes the idea of sampling triggers and creates a hash
table to track active sampling triggers in preparation for egress and
policy-based triggers. The motivation is explained in the changelog
Patch #5 flips mlxsw to start using this hash table instead of storing
ingress sampling triggers as an attribute of the sampled port
Patch #6 finally adds support for egress sampling using matchall
classifier
Patches #7-#8 add support for policy-based sampling using flower
classifier
Patches #9 extends the mlxsw sampling selftest to cover the new triggers
Patch #10 makes sure that egress sampling configuration only fails on
Spectrum-1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Tue, 16 Mar 2021 15:03:03 +0000 (17:03 +0200)]
selftests: mlxsw: Test egress sampling limitation on Spectrum-1 only
Make sure egress sampling configuration only fails on Spectrum-1, given
that mlxsw now supports it on Spectrum-{2,3}.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Tue, 16 Mar 2021 15:03:02 +0000 (17:03 +0200)]
selftests: mlxsw: Add tc sample tests for new triggers
Test that packets are sampled when tc-sample is used with matchall
egress binding and flower classifier. Verify that when performing
sampling on egress the end-to-end latency is reported as metadata.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Tue, 16 Mar 2021 15:03:01 +0000 (17:03 +0200)]
mlxsw: spectrum_acl: Offload FLOW_ACTION_SAMPLE
Implement support for action sample when used with a flower classifier
by implementing the required sampler_add() / sampler_del() callbacks and
registering an Rx listener for the sampled packets.
The sampler_add() callback returns an error for Spectrum-1 as the
functionality is not supported. In Spectrum-{2,3} the callback creates a
mirroring agent towards the CPU. The agent's identifier is used by the
policy engine code to mirror towards the CPU with probability.
The Rx listener for the sampled packet is registered with the 'policy
engine' mirroring reason and passes trapped packets to the psample
module after looking up their parameters (e.g., sampling group).
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Tue, 16 Mar 2021 15:03:00 +0000 (17:03 +0200)]
mlxsw: core_acl_flex_actions: Add mirror sampler action
Add core functionality required to support mirror sampler action in the
policy engine. The switch driver (e.g., 'mlxsw_spectrum') is required to
implement the sampler_add() / sampler_del() callbacks that perform the
necessary configuration before the sampler action can be installed. The
next patch will implement it for Spectrum-{2,3}, while Spectrum-1 will
return an error, given it is not supported.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Tue, 16 Mar 2021 15:02:59 +0000 (17:02 +0200)]
mlxsw: spectrum_matchall: Add support for egress sampling
Allow user space to install a matchall classifier with sample action on
egress. This is only supported on Spectrum-2 onwards, so Spectrum-1 will
continue to return an error.
Programming the hardware to sample on egress is identical to ingress
sampling with the sole change of using a different sampling trigger.
Upon receiving a sampled packet, the sampling trigger (ingress vs.
egress) will be encoded in the mirroring reason in the Completion Queue
Element (CQE). The mirroring reason is used to lookup the sampling
parameters (e.g., psample group) which are passed to the psample module.
Note that locally generated packets that are sampled are simply
consumed. This is done for several reasons.
First, such packets do not have an ingress netdev given that their Rx
local port is the CPU port. This breaks several basic assumptions.
Second, sampling using the same interface (tc), but with flower
classifier will not result in locally generated packets being sampled
given that such packets are not subject to the policy engine.
Third, realistically, this is not a big deal given that the vast
majority of the packets being transmitted through the port are not
locally generated packets.
Fourth, if such packets do need to be sampled, they can be sampled with
a 'skip_hw' filter and reported to the same sampling group as the data
path packets. The software sampling rate can also be adjusted to fit the
rate of the locally generated packets which is much lower than the rate
of the data path traffic.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Tue, 16 Mar 2021 15:02:58 +0000 (17:02 +0200)]
mlxsw: spectrum: Start using sampling triggers hash table
Start using the previously introduced sampling triggers hash table to
store sampling parameters instead of storing them as attributes of the
sampled port.
This makes it easier to introduce new sampling triggers.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>