platform/kernel/linux-rpi.git
2 years ago wifi: mac80211: pass the link id in start/stop ap
Shaul Triebitz [Thu, 2 Jun 2022 12:08:16 +0000 (15:08 +0300)]
wifi: mac80211: pass the link id in start/stop ap

Pass the link_id to the drivers in the start_ap and stop_ap
mac80211 callbacks.

Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: use link in start/stop ap
Shaul Triebitz [Wed, 1 Jun 2022 09:55:14 +0000 (12:55 +0300)]
wifi: mac80211: use link in start/stop ap

Use link and link_conf according to the link_id
provided by cfg in start_ap/stop_ap and change_beacon.
Also use them in the functions called by them.
Note that for a non-MLD device, the link_id is 0,
and link[0] and link_conf[0] are equal to deflink and
bss_conf respectively (what was there before).

Also, call vif_info_change for BSS related changes (SSID), and
link_info_change for LINK related changes (instead of the
legacy bss_info_change).
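
The link_id lookup described above can be sketched roughly as follows.
This is a minimal illustration, not the mac80211 code: the struct
layouts, the MAX_NUM_LINKS bound and the helper names are hypothetical
stand-ins. It only shows how link 0 can alias the built-in bss_conf so
that non-MLD callers keep working.

```c
#include <stddef.h>

#define MAX_NUM_LINKS 15 /* hypothetical bound for this sketch */

/* Hypothetical, heavily simplified stand-in for the per-link config. */
struct link_conf {
	int beacon_int;
};

/* Hypothetical stand-in for the interface structure. */
struct vif {
	struct link_conf bss_conf;                  /* legacy built-in config */
	struct link_conf *link_conf[MAX_NUM_LINKS]; /* per-link pointers */
};

/* Non-MLD setup: only link 0 exists and it aliases the built-in bss_conf. */
static void vif_init_non_mld(struct vif *vif)
{
	for (size_t i = 0; i < MAX_NUM_LINKS; i++)
		vif->link_conf[i] = NULL;
	vif->link_conf[0] = &vif->bss_conf;
}

/* Return the config for link_id, or NULL if it is out of range or absent. */
static struct link_conf *vif_get_link_conf(struct vif *vif,
					   unsigned int link_id)
{
	if (link_id >= MAX_NUM_LINKS)
		return NULL;
	return vif->link_conf[link_id];
}
```

With this layout, a caller that always passes link_id 0 sees exactly
the configuration it saw before the per-link pointers existed.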

Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: implement add/del interface link callbacks
Johannes Berg [Fri, 10 Jun 2022 09:21:21 +0000 (11:21 +0200)]
wifi: mac80211: implement add/del interface link callbacks

When a link is added or removed via nl80211, these are
called. Implement them so we don't have to check in all
the different per-link commands whether we've already
created the necessary data structures.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: cfg80211: add optional link add/remove callbacks
Johannes Berg [Fri, 10 Jun 2022 09:07:55 +0000 (11:07 +0200)]
wifi: cfg80211: add optional link add/remove callbacks

Add some optional callbacks for link add/remove so that
drivers can react here. Initially, I thought it would be
sufficient to just create the link in start_ap etc., but
it turns out that's not so simple, since there are quite
a few callbacks that can be called: if they're erroneously
called without start_ap, things might crash.

Thus it might be easier for drivers to allocate all the
necessary data structures immediately, to not have to
worry about it in each callback, since cfg80211 checks
that the link ID is valid (has been added).
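
A rough sketch of the idea, under stated assumptions: the ops table,
the struct dev layout and all function names below are hypothetical and
are not the cfg80211 API. It shows optional callbacks (NULL is allowed),
a core that tracks which link IDs were added, and per-link commands that
are rejected before they reach the driver when the link is unknown.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_LINKS 15 /* hypothetical bound for this sketch */

struct dev;

/* Hypothetical ops table: both callbacks are optional and may be NULL. */
struct link_ops {
	int (*add_link)(struct dev *dev, unsigned int link_id);
	void (*del_link)(struct dev *dev, unsigned int link_id);
};

struct dev {
	const struct link_ops *ops;
	unsigned int valid_links; /* bitmap of link IDs that were added */
	int driver_allocs;        /* toy counter for per-link driver state */
};

static int link_is_valid(const struct dev *dev, unsigned int link_id)
{
	return link_id < MAX_LINKS && (dev->valid_links & (1u << link_id));
}

/* Add a link: let the driver allocate first, then mark the link valid. */
static int core_add_link(struct dev *dev, unsigned int link_id)
{
	int ret = 0;

	if (link_id >= MAX_LINKS || link_is_valid(dev, link_id))
		return -EINVAL;
	if (dev->ops && dev->ops->add_link)
		ret = dev->ops->add_link(dev, link_id);
	if (ret)
		return ret;
	dev->valid_links |= 1u << link_id;
	return 0;
}

/* Remove a link: mark it invalid, then let the driver free its state. */
static void core_del_link(struct dev *dev, unsigned int link_id)
{
	if (!link_is_valid(dev, link_id))
		return;
	dev->valid_links &= ~(1u << link_id);
	if (dev->ops && dev->ops->del_link)
		dev->ops->del_link(dev, link_id);
}

/* Any per-link command: unknown links never reach the driver. */
static int core_link_cmd(struct dev *dev, unsigned int link_id)
{
	if (!link_is_valid(dev, link_id))
		return -EINVAL;
	return 0; /* a real command would call into the driver here */
}

/* Example driver callbacks that just count allocations. */
static int drv_add_link(struct dev *dev, unsigned int link_id)
{
	(void)link_id;
	dev->driver_allocs++;
	return 0;
}

static void drv_del_link(struct dev *dev, unsigned int link_id)
{
	(void)link_id;
	dev->driver_allocs--;
}

static const struct link_ops drv_ops = {
	.add_link = drv_add_link,
	.del_link = drv_del_link,
};
```

Because the core validates the link ID, a driver that allocates in
add_link does not need a "was this link set up?" check in every other
per-link callback.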

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: cfg80211: sort trace.h
Johannes Berg [Fri, 10 Jun 2022 08:50:18 +0000 (10:50 +0200)]
wifi: cfg80211: sort trace.h

We wanted to have this sorted by direction (to/from driver),
but didn't maintain that well. Sort the file now.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: add sta link addition/removal
Johannes Berg [Tue, 31 May 2022 21:20:08 +0000 (23:20 +0200)]
wifi: mac80211: add sta link addition/removal

Add the necessary infrastructure, including a new driver
method, to add/remove links to/from a station. To do this,
refactor the link alloc/free a bit, splitting that so we
can do it without linking them, to handle failures better.

Note that a station entry must be created representing either
an MLD or a non-MLD STA; it cannot change between the two.
When representing an MLD, the 'deflink' is used for the
first link, which might be removed later, in which case
the memory isn't reused.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: add MLO link ID to TX frame metadata
Johannes Berg [Thu, 9 Jun 2022 20:05:07 +0000 (22:05 +0200)]
wifi: mac80211: add MLO link ID to TX frame metadata

Take a few bits out of the control.flags to add the link ID
to TX frame metadata, so drivers don't need to look it up
by the address themselves. Implement that lookup where it's
needed, for internal frame TX, and set it to "unspecified"
for data transmissions.
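
As an illustration of carving a few bits out of a flags word, here is a
minimal sketch. The bit positions, the macro names and the value chosen
for "unspecified" are assumptions made for this example and are not
taken from the mac80211 headers.

```c
#include <stdint.h>

/* Hypothetical layout: the top four bits of a 32-bit flags word. */
#define TX_CTRL_LINK_SHIFT 28
#define TX_CTRL_LINK_MASK  (UINT32_C(0xf) << TX_CTRL_LINK_SHIFT)

/* Hypothetical marker meaning "no particular link was selected". */
#define LINK_UNSPECIFIED   UINT32_C(0xf)

/* Store link_id in the reserved bits without touching the other flags. */
static uint32_t tx_flags_set_link(uint32_t flags, uint32_t link_id)
{
	flags &= ~TX_CTRL_LINK_MASK;
	flags |= (link_id << TX_CTRL_LINK_SHIFT) & TX_CTRL_LINK_MASK;
	return flags;
}

/* Read the link ID back out of the flags word. */
static uint32_t tx_flags_get_link(uint32_t flags)
{
	return (flags & TX_CTRL_LINK_MASK) >> TX_CTRL_LINK_SHIFT;
}
```

A driver-side consumer would call tx_flags_get_link() and treat
LINK_UNSPECIFIED as "pick a link yourself", which matches the behaviour
described for data transmissions.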

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: remove band from TX info in MLO
Johannes Berg [Wed, 1 Jun 2022 13:59:31 +0000 (15:59 +0200)]
wifi: mac80211: remove band from TX info in MLO

If the interface is an MLD, then we don't know which band
the frame will be transmitted on, and we don't know how to
look up the band. Set the band information to zero in that
case; the driver cannot rely on it anyway.

No longer inline ieee80211_tx_skb_tid() since it's even
bigger now.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: add vif link addition/removal
Johannes Berg [Tue, 31 May 2022 21:20:08 +0000 (23:20 +0200)]
wifi: mac80211: add vif link addition/removal

Add the necessary infrastructure, including a new driver
method, to add/remove links to/from an interface.

Also add the missing link address to bss_conf (which we
use as link_conf too) and fill it in. In station mode it is
just random for now; in AP mode we get the address from
cfg80211, since the link must be created with an address
first.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: nl80211: support MLO in auth/assoc
Johannes Berg [Tue, 31 May 2022 17:48:33 +0000 (19:48 +0200)]
wifi: nl80211: support MLO in auth/assoc

For authentication, we need the BSS, the link_id and the AP
MLD address to create the link and station. For now, the
driver assigns a link address and sends the frame; the MLD
address needs to be the address of the interface.

For association, pass the list of BSSes that were selected
for the MLO connection, along with extra per-STA profile
elements, the AP MLD address and the link ID on which the
association request should be sent.

Note that for now we don't have a proper way to pass the link
address(es), so the driver/mac80211 will select one. Depending
on how that selection works, association without the auth data
still being around (a mac80211 implementation detail) won't
necessarily work, so this will need to be extended in the
future to sort out the link addressing.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: ignore IEEE80211_CONF_CHANGE_SMPS in chanctx mode
Johannes Berg [Wed, 8 Jun 2022 12:18:17 +0000 (14:18 +0200)]
wifi: mac80211: ignore IEEE80211_CONF_CHANGE_SMPS in chanctx mode

When channel contexts are used, IEEE80211_CONF_CHANGE_SMPS
doesn't make sense and doesn't apply (which is documented).
Mask it in this case.
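
The masking itself is a one-liner. A minimal sketch follows, with
hypothetical flag values and a hypothetical helper name (the real
constants live in the mac80211 headers and are not reproduced here):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, chosen only for this example. */
#define CONF_CHANGE_SMPS  (UINT32_C(1) << 1)
#define CONF_CHANGE_POWER (UINT32_C(1) << 5)

/* Drop the SMPS change bit when channel contexts are in use. */
static uint32_t filter_conf_changes(uint32_t changed, bool use_chanctx)
{
	if (use_chanctx)
		changed &= ~CONF_CHANGE_SMPS;
	return changed;
}
```

All other change bits pass through untouched, so drivers using channel
contexts simply never see the SMPS notification.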

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211_hwsim: split bss_info_changed to vif/link info_changed
Shaul Triebitz [Mon, 6 Jun 2022 12:51:45 +0000 (15:51 +0300)]
wifi: mac80211_hwsim: split bss_info_changed to vif/link info_changed

Replace the bss_info_changed callback with vif_cfg_changed
and link_info_changed callbacks (for vif changes and link
changes).

Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: cfg80211: simplify cfg80211_mlme_auth() prototype
Johannes Berg [Wed, 1 Jun 2022 20:42:28 +0000 (22:42 +0200)]
wifi: cfg80211: simplify cfg80211_mlme_auth() prototype

This function has far too many parameters now; move out
the BSS lookup and pass the request struct instead.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: ieee80211: add definitions for multi-link element
Johannes Berg [Tue, 31 May 2022 12:03:38 +0000 (14:03 +0200)]
wifi: ieee80211: add definitions for multi-link element

Add the definitions necessary to build and parse some of the
multi-link element, the per-STA profile isn't fully included.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: nl80211: refactor BSS lookup in nl80211_associate()
Johannes Berg [Tue, 31 May 2022 16:31:19 +0000 (18:31 +0200)]
wifi: nl80211: refactor BSS lookup in nl80211_associate()

For MLO we'll need to do this multiple times, so refactor
this. For now keep the disconnect_bssid, but we'll need to
figure out how to handle that with MLD.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: cfg80211: mlme: get BSS entry outside cfg80211_mlme_assoc()
Johannes Berg [Tue, 31 May 2022 16:00:00 +0000 (18:00 +0200)]
wifi: cfg80211: mlme: get BSS entry outside cfg80211_mlme_assoc()

Today it makes more sense to pass the necessary parameters to
look up the BSS entry to cfg80211_mlme_assoc(), but with MLO
we will need to look up multiple, and that gets awkward. Pull
the lookup code into the callers so we can change it better.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: tx: simplify chanctx_conf handling
Johannes Berg [Wed, 1 Jun 2022 12:25:44 +0000 (14:25 +0200)]
wifi: mac80211: tx: simplify chanctx_conf handling

In ieee80211_build_hdr() we do the same thing for all
interface types except for AP_VLAN, but we can simplify
the code by pulling the common thing in front of the
switch and overriding it for AP_VLAN. This will also
simplify the code for MLD here later.
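
The shape of that refactoring can be sketched as below. The types and
names are hypothetical simplifications, and the assumption that an
AP_VLAN interface borrows the channel context of its owning AP is made
for the example only.

```c
#include <stddef.h>

enum iftype { IFTYPE_STATION, IFTYPE_AP, IFTYPE_AP_VLAN, IFTYPE_ADHOC };

struct chanctx_conf {
	int band;
};

/* Hypothetical, simplified interface data. */
struct iface {
	enum iftype type;
	struct chanctx_conf *chanctx_conf; /* own context, may be NULL */
	struct iface *ap;                  /* for AP_VLAN: the owning AP */
};

static struct chanctx_conf *select_chanctx(const struct iface *sdata)
{
	/* Common case first: use the interface's own channel context. */
	struct chanctx_conf *conf = sdata->chanctx_conf;

	switch (sdata->type) {
	case IFTYPE_AP_VLAN:
		/* Override: a VLAN interface uses the context of its AP. */
		conf = sdata->ap ? sdata->ap->chanctx_conf : NULL;
		break;
	default:
		break;
	}
	return conf;
}
```

Only the exceptional case appears in the switch, so adding a further
per-link lookup later would touch a single assignment.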

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: status: look up band only where needed
Johannes Berg [Wed, 1 Jun 2022 12:16:00 +0000 (14:16 +0200)]
wifi: mac80211: status: look up band only where needed

For MLD, we might eventually not really know the band on status,
but some code assumes it's there. Move the sband lookup deep to
the code that actually needs it, to make it clear where exactly
it's needed and for what purposes.

For rate control, at least initially we won't support it in MLO,
so that won't be an issue.

For TX monitoring, we may have to elide the rate and/or rely on
ieee80211_tx_status_ext() for rate information.

This also simplifies the function prototypes.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: sort trace.h file
Johannes Berg [Tue, 31 May 2022 21:06:19 +0000 (23:06 +0200)]
wifi: mac80211: sort trace.h file

This used to be sorted by driver methods, APIs and internal
functions, but got added to in the wrong sections. Fix that
by ordering the file properly again.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: correct link config data in tracing
Johannes Berg [Mon, 30 May 2022 22:02:33 +0000 (00:02 +0200)]
wifi: mac80211: correct link config data in tracing

We need to no longer use bss_conf here, but the per-link data.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: make ieee80211_he_cap_ie_to_sta_he_cap() MLO-aware
Johannes Berg [Mon, 30 May 2022 21:52:41 +0000 (23:52 +0200)]
wifi: mac80211: make ieee80211_he_cap_ie_to_sta_he_cap() MLO-aware

Add the link_id parameter and adjust the code accordingly.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: make some SMPS code MLD-aware
Johannes Berg [Mon, 30 May 2022 21:45:04 +0000 (23:45 +0200)]
wifi: mac80211: make some SMPS code MLD-aware

Start making some SMPS related code MLD-aware. This isn't
really done yet, but again cuts down our 'deflink' reliance.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: HT: make ieee80211_ht_cap_ie_to_sta_ht_cap() MLO-aware
Johannes Berg [Mon, 30 May 2022 21:34:04 +0000 (23:34 +0200)]
wifi: mac80211: HT: make ieee80211_ht_cap_ie_to_sta_ht_cap() MLO-aware

Update ieee80211_ht_cap_ie_to_sta_ht_cap() to handle per-link
data.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: add link_id to eht.c code for MLO
Johannes Berg [Mon, 30 May 2022 21:22:19 +0000 (23:22 +0200)]
wifi: mac80211: add link_id to eht.c code for MLO

Update the code in eht.c and add the link_id parameter where
necessary.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: add link_id to vht.c code for MLO
Johannes Berg [Mon, 30 May 2022 21:22:19 +0000 (23:22 +0200)]
wifi: mac80211: add link_id to vht.c code for MLO

Update the code in vht.c and add the link_id parameter where
necessary.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: refactor some link setup code
Johannes Berg [Mon, 30 May 2022 21:01:50 +0000 (23:01 +0200)]
wifi: mac80211: refactor some link setup code

We don't need to set up lists and work structs every time
we switch the interface type, factor that out into a new
ieee80211_link_init() function and use it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: validate some driver features for MLO
Johannes Berg [Mon, 30 May 2022 20:52:14 +0000 (22:52 +0200)]
wifi: mac80211: validate some driver features for MLO

If MLO is enabled by the driver then validate a set of
capabilities that mac80211 will initially not support
in MLO. This might change if features are implemented.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: use IEEE80211_MLD_MAX_NUM_LINKS
Johannes Berg [Mon, 30 May 2022 20:36:30 +0000 (22:36 +0200)]
wifi: mac80211: use IEEE80211_MLD_MAX_NUM_LINKS

Remove MAX_STA_LINKS and use IEEE80211_MLD_MAX_NUM_LINKS
instead to unify between the station and other data structures.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: refactor some sta_info link handling
Johannes Berg [Mon, 30 May 2022 19:31:37 +0000 (21:31 +0200)]
wifi: mac80211: refactor some sta_info link handling

Refactor the code a bit to initialize a link belonging
to a station, and (later) free all allocated links.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: remove sta_info_tx_streams()
Johannes Berg [Mon, 30 May 2022 19:28:31 +0000 (21:28 +0200)]
wifi: mac80211: remove sta_info_tx_streams()

The function is unused since commit 52b4810bed83 ("mac80211: Remove
support for changing AP SMPS mode") so we can just remove it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: make channel context code MLO-aware
Johannes Berg [Mon, 30 May 2022 16:35:23 +0000 (18:35 +0200)]
wifi: mac80211: make channel context code MLO-aware

Make the channel context code MLO aware, along with some
functions that it uses, so that the chan.c file is now
MLD-clean and no longer uses deflink/bss_conf/etc.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: pass link ID where already present
Johannes Berg [Mon, 30 May 2022 12:18:09 +0000 (14:18 +0200)]
wifi: mac80211: pass link ID where already present

In a few cases we already have the link ID in the APIs,
pass it already even if it cannot be non-zero yet.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: add per-link configuration pointer
Johannes Berg [Mon, 30 May 2022 11:09:28 +0000 (13:09 +0200)]
wifi: mac80211: add per-link configuration pointer

Add pointers so we can start using link_id throughout the
code, even if for now only link ID 0 is valid, pointing
to the "built-in" bss_conf, which is used by drivers that
are not aware of MLD.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: split bss_info_changed method
Johannes Berg [Tue, 24 May 2022 08:55:56 +0000 (10:55 +0200)]
wifi: mac80211: split bss_info_changed method

Split the bss_info_changed method to vif_cfg_changed and
link_info_changed, with the latter getting a link ID.
Also change the 'changed' parameter to u64 already, we
know we need that.
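
One way to picture the split is to divide a single 64-bit change mask
into a vif-level part and a link-level part. The flag names, their
values and the choice of which bits count as vif-level are assumptions
for this sketch, not the mac80211 definitions.

```c
#include <stdint.h>

/* Hypothetical change flags, chosen only for this example. */
#define CHANGED_ASSOC      (UINT64_C(1) << 0)
#define CHANGED_SSID       (UINT64_C(1) << 1)
#define CHANGED_BEACON_INT (UINT64_C(1) << 2)
#define CHANGED_TXPOWER    (UINT64_C(1) << 3)

/* Assumption: association state and SSID belong to the whole interface. */
#define VIF_CFG_CHANGE_FLAGS (CHANGED_ASSOC | CHANGED_SSID)

struct split_changes {
	uint64_t vif;  /* bits for a vif_cfg_changed-style callback */
	uint64_t link; /* bits for a link_info_changed-style callback */
};

/* Partition a legacy combined mask into its vif and link parts. */
static struct split_changes split_bss_changes(uint64_t changed)
{
	struct split_changes out = {
		.vif = changed & VIF_CFG_CHANGE_FLAGS,
		.link = changed & ~VIF_CFG_CHANGE_FLAGS,
	};
	return out;
}
```

A caller would then invoke the vif-level callback only if out.vif is
non-zero, and the link-level callback (with a link ID) only if out.link
is non-zero.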

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: reorg some iface data structs for MLD
Johannes Berg [Mon, 16 May 2022 13:00:15 +0000 (15:00 +0200)]
wifi: mac80211: reorg some iface data structs for MLD

Start reorganizing interface related data structures toward
MLD. The most complex part here is for the keys, since we
have to split the various kinds of GTKs off to the link but
still need to use (for WEP) the other keys as a fallback
even for multicast frames.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: move interface config to new struct
Johannes Berg [Tue, 10 May 2022 15:05:04 +0000 (17:05 +0200)]
wifi: mac80211: move interface config to new struct

We'll use bss_conf for per-link configuration later, so
move all the non-link-specific data out into a new
struct ieee80211_vif_cfg used in the vif.

Some adjustments were done with the following spatch:

    @@
    expression sdata;
    struct ieee80211_vif *vifp;
    identifier var = { assoc, ibss_joined, aid, arp_addr_list, arp_addr_cnt, ssid, ssid_len, s1g, ibss_creator };
    @@
    (
    -sdata->vif.bss_conf.var
    +sdata->vif.cfg.var
    |
    -vifp->bss_conf.var
    +vifp->cfg.var
    )

    @bss_conf@
    struct ieee80211_bss_conf *bss_conf;
    identifier var = { assoc, ibss_joined, aid, arp_addr_list, arp_addr_cnt, ssid, ssid_len, s1g, ibss_creator };
    @@
    -bss_conf->var
    +vif_cfg->var

(though more manual fixups were needed, e.g. replacing
"vif_cfg->" by "vif->cfg." in many files.)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: move some future per-link data to bss_conf
Johannes Berg [Tue, 10 May 2022 11:26:44 +0000 (13:26 +0200)]
wifi: mac80211: move some future per-link data to bss_conf

To add MLD, reuse the bss_conf structure later for per-link
information, so move some things into it that are per link.

Most transformations were done with the following spatch:

    @@
    expression sdata;
    identifier var = { chanctx_conf, mu_mimo_owner, csa_active, color_change_active, color_change_color };
    @@
    -sdata->vif.var
    +sdata->vif.bss_conf.var

    @@
    struct ieee80211_vif *vif;
    identifier var = { chanctx_conf, mu_mimo_owner, csa_active, color_change_active, color_change_color };
    @@
    -vif->var
    +vif->bss_conf.var

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: cfg80211: do some rework towards MLO link APIs
Johannes Berg [Thu, 14 Apr 2022 14:50:57 +0000 (16:50 +0200)]
wifi: cfg80211: do some rework towards MLO link APIs

In order to support multi-link operation with multiple links,
start adding some APIs. The notable addition here is to have
the link ID in a new nl80211 attribute, that will be used to
differentiate the links in many nl80211 operations.

So far, this patch adds the netlink NL80211_ATTR_MLO_LINK_ID
attribute (as well as the NL80211_ATTR_MLO_LINKS attribute)
and plugs it through the system in some places, checking the
validity etc. along with other infrastructure needed for it.

For now, I've decided to include only the over-the-air link
ID in the API. I know we discussed that we eventually need to
have other ways of identifying a link, but for local
AP mode and auth/assoc commands as well as set_key etc. we'll
use the OTA ID.

Also included in this patch is some refactoring of the data
structures in struct wireless_dev, splitting for the first
time the data into type dependent pieces, to make reasoning
about these things easier.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago wifi: mac80211: reject WEP or pairwise keys with key ID > 3
Johannes Berg [Thu, 19 May 2022 15:57:53 +0000 (17:57 +0200)]
wifi: mac80211: reject WEP or pairwise keys with key ID > 3

We don't really care too much right now since our data
structures are set up to not have a problem with this,
but clearly it's wrong to accept WEP and pairwise keys
with key ID > 3.

However, with MLD we need to split into per-link (GTK,
IGTK, BIGTK) and per interface/MLD (including WEP) keys
so make sure this is not a problem.
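
The check amounts to a small validation function. The sketch below uses
a hypothetical cipher enum and helper name. It only encodes the rule
stated above (WEP and pairwise keys must have key ID 0..3) and leaves
other group-key indices alone.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical cipher classification for this sketch. */
enum cipher { CIPHER_WEP, CIPHER_CCMP, CIPHER_BIP };

#define NUM_DEFAULT_KEYS 4 /* key IDs 0..3 */

/* Reject WEP keys and pairwise keys whose key ID is above 3. */
static int validate_key_idx(enum cipher cipher, bool pairwise, int idx)
{
	if (idx < 0)
		return -EINVAL;
	if ((cipher == CIPHER_WEP || pairwise) && idx >= NUM_DEFAULT_KEYS)
		return -EINVAL;
	return 0;
}
```

Group management keys with higher indices still pass, which is what
allows them to be stored per link later on.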

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2 years ago Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git
Kalle Valo [Wed, 15 Jun 2022 12:57:20 +0000 (15:57 +0300)]
Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git

ath.git patches for v5.20. Major changes:

ath10k

* 802.3 frame format support

2 years ago net: sparx5: Allow mdb entries to both CPU and ports
Casper Andersson [Tue, 14 Jun 2022 09:25:32 +0000 (11:25 +0200)]
net: sparx5: Allow mdb entries to both CPU and ports

Allow mdb entries to be forwarded to CPU and be switched at the same
time. Only remove the entry when neither a port nor the CPU is part of
the group anymore.
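
The removal rule can be sketched with a port bitmask plus a CPU flag.
The struct and helper names are hypothetical and do not come from the
sparx5 driver. The point is only that an entry is dropped when both the
port mask is empty and the CPU has left.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified multicast database entry. */
struct mdb_entry {
	uint16_t port_mask; /* front-panel ports that joined the group */
	bool cpu_copy;      /* true if the CPU (host) also joined */
};

/* An entry is unused only when no port and no CPU membership remain. */
static bool mdb_entry_unused(const struct mdb_entry *e)
{
	return e->port_mask == 0 && !e->cpu_copy;
}

/* Drop one port; returns true if the entry should now be removed. */
static bool mdb_del_port(struct mdb_entry *e, unsigned int port)
{
	e->port_mask &= (uint16_t)~(1u << port);
	return mdb_entry_unused(e);
}

/* Drop the CPU membership; returns true if the entry should be removed. */
static bool mdb_del_cpu(struct mdb_entry *e)
{
	e->cpu_copy = false;
	return mdb_entry_unused(e);
}
```

Removing the last port while the CPU is still a member keeps the entry
alive, and vice versa.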

Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Acked-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago bcm63xx_enet: switch to napi_build_skb() to reuse skbuff_heads
Sieng Piaw Liew [Wed, 15 Jun 2022 06:09:22 +0000 (14:09 +0800)]
bcm63xx_enet: switch to napi_build_skb() to reuse skbuff_heads

napi_build_skb() reuses NAPI skbuff_head cache in order to save some
cycles on freeing/allocating skbuff_heads on every new Rx or completed
Tx.
Use napi_consume_skb() to feed the cache with skbuff_heads of completed
Tx so it's never empty.

Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago net: don't check skb_count twice
Sieng Piaw Liew [Wed, 15 Jun 2022 03:24:26 +0000 (11:24 +0800)]
net: don't check skb_count twice

NAPI cache skb_count is being checked twice unconditionally. Change to
checking the second time only when the first check finds the cache empty.
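
The control-flow change can be illustrated with a toy object cache. This
is not the kernel's NAPI cache: the struct, the backing store and the
refill helper are hypothetical. It only shows the second emptiness
check nested inside the branch that attempted a refill.

```c
#include <stddef.h>

#define CACHE_SLOTS 4
#define BULK        2

/* Toy backing store standing in for a real allocator. */
static int backing_store[16];

struct obj_cache {
	int *slots[CACHE_SLOTS];
	unsigned int count;   /* number of valid entries in slots[] */
	unsigned int backing; /* objects the backing store can still supply */
};

/* Refill up to BULK objects; returns how many were obtained (may be 0). */
static unsigned int bulk_refill(struct obj_cache *c)
{
	unsigned int n = 0;

	while (n < BULK && c->backing > 0)
		c->slots[n++] = &backing_store[--c->backing];
	return n;
}

static int *cache_get(struct obj_cache *c)
{
	if (c->count == 0) {
		c->count = bulk_refill(c);
		/* Re-check only here: count can be 0 only after a failed refill. */
		if (c->count == 0)
			return NULL;
	}
	return c->slots[--c->count];
}
```

On the common path, where the cache is non-empty, only one comparison
is executed.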

Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago net: bridge: allow add/remove permanent mdb entries on disabled ports
Casper Andersson [Tue, 14 Jun 2022 06:32:23 +0000 (08:32 +0200)]
net: bridge: allow add/remove permanent mdb entries on disabled ports

Adding mdb entries on disabled ports allows you to do setup before
accepting any traffic, avoiding any time where the port is not in the
multicast group.

Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago i40e: add xdp frags support to ndo_xdp_xmit
Lorenzo Bianconi [Mon, 13 Jun 2022 16:51:50 +0000 (09:51 -0700)]
i40e: add xdp frags support to ndo_xdp_xmit

Add the capability to map non-linear xdp frames in XDP_TX and ndo_xdp_xmit
callback.

Tested-by: Sarkar Tirthendu <tirthendu.sarkar@intel.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago net: phy: marvell-88x2222: set proper phydev->port
Ivan Bornyakov [Sun, 12 Jun 2022 18:19:34 +0000 (21:19 +0300)]
net: phy: marvell-88x2222: set proper phydev->port

phydev->port was not set and always reported as PORT_TP.
Set phydev->port according to inserted SFP module.

Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago dt-bindings: net: xilinx: document xilinx emaclite driver binding
Radhey Shyam Pandey [Thu, 9 Jun 2022 16:53:35 +0000 (22:23 +0530)]
dt-bindings: net: xilinx: document xilinx emaclite driver binding

Add basic description for the xilinx emaclite driver DT bindings.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago Merge branch 'ipa-simplify-completion-stats'
David S. Miller [Wed, 15 Jun 2022 08:07:58 +0000 (09:07 +0100)]
Merge branch 'ipa-simplify-completion-stats'

Alex Elder says:

====================
net: ipa: simplify completion statistics

The first patch in this series makes the name used for variables
representing a TRE ring be consistent everywhere.  The second
renames two structure fields to better represent their purpose.

The last four rework a little code that manages some transaction and
byte transfer statistics maintained mainly for TX endpoints.  For
the most part this series is refactoring.  The last one also
includes the first step toward no longer assuming an event ring is
dedicated to a single channel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago net: ipa: rework gsi_channel_tx_update()
Alex Elder [Mon, 13 Jun 2022 17:17:59 +0000 (12:17 -0500)]
net: ipa: rework gsi_channel_tx_update()

Rename gsi_channel_tx_update() to be gsi_trans_tx_completed(), and
pass it just the transaction pointer, deriving the channel from the
transaction.  Update the comments above the function to provide a
more concise description of how statistics for TX endpoints are
maintained and used.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago net: ipa: stop counting total RX bytes and transactions
Alex Elder [Mon, 13 Jun 2022 17:17:58 +0000 (12:17 -0500)]
net: ipa: stop counting total RX bytes and transactions

In gsi_evt_ring_rx_update(), we update each transaction so its len
field reflects the actual number of bytes received.  In the process,
the total number of transactions and bytes processed on the channel
are summed, and added to a running total for the channel.

But we don't actually use those running totals for RX endpoints.
They're maintained for TX channels to support CoDel when they are
associated with a "real" network device.

So stop maintaining these totals for RX endpoints, and update the
comment where the fields are defined to make it clear they're only
valid for TX channels.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago net: ipa: simplify TX completion statistics
Alex Elder [Mon, 13 Jun 2022 17:17:57 +0000 (12:17 -0500)]
net: ipa: simplify TX completion statistics

When a TX request is issued, its channel's accumulated byte and
transaction counts are recorded.  This currently does *not* take
into account the transaction being committed.

Later, when the transaction completes, the number of bytes and
transactions that have completed since the transaction was committed
are reported to the network stack.  The transaction and its byte
count are accounted for at that time.

Instead, record the transaction and its bytes in the counts recorded
at commit time.  This avoids the need to do so when the transaction
completes, and provides a (small) simplification of that code.
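
The accounting scheme described above can be sketched as follows. The
struct layouts and function names are hypothetical simplifications, not
the IPA driver's definitions. The snapshot taken at commit time already
includes the transaction being committed, so completion only needs a
subtraction.

```c
#include <stdint.h>

/* Hypothetical per-channel running totals. */
struct channel {
	uint64_t byte_count;        /* bytes committed so far */
	uint64_t trans_count;       /* transactions committed so far */
	uint64_t compl_byte_count;  /* bytes already reported as completed */
	uint64_t compl_trans_count; /* transactions already reported */
};

/* Hypothetical transaction carrying its commit-time snapshot. */
struct trans {
	uint32_t len;
	uint64_t byte_count;  /* channel total at commit, including this one */
	uint64_t trans_count; /* channel total at commit, including this one */
};

struct report {
	uint64_t bytes;
	uint64_t trans;
};

/* Commit: account for this transaction first, then snapshot the totals. */
static void tx_committed(struct channel *ch, struct trans *t)
{
	ch->byte_count += t->len;
	ch->trans_count++;
	t->byte_count = ch->byte_count;
	t->trans_count = ch->trans_count;
}

/* Completion: report everything committed up to and including t. */
static struct report tx_completed(struct channel *ch, const struct trans *t)
{
	struct report r = {
		.bytes = t->byte_count - ch->compl_byte_count,
		.trans = t->trans_count - ch->compl_trans_count,
	};

	ch->compl_byte_count = t->byte_count;
	ch->compl_trans_count = t->trans_count;
	return r;
}
```

If several transactions complete before one completion is processed,
the single subtraction still covers all of them, since the snapshot is
cumulative.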

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago net: ipa: introduce gsi_trans_tx_committed()
Alex Elder [Mon, 13 Jun 2022 17:17:56 +0000 (12:17 -0500)]
net: ipa: introduce gsi_trans_tx_committed()

Create a new function that encapsulates recording information needed
for TX channel statistics when a transaction is committed.

Record the accumulated length in the transaction before the call
(for both RX and TX), so it can be used when updating TX statistics.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago net: ipa: rename two transaction fields
Alex Elder [Mon, 13 Jun 2022 17:17:55 +0000 (12:17 -0500)]
net: ipa: rename two transaction fields

There are two fields in a GSI transaction that keep track of TRE
counts.  The first represents the number of TREs reserved for the
transaction in the TRE ring; that's currently named "tre_count".
The second is the number of TREs that are actually *used* by the
transaction at the time it is committed.

Rename the "tre_count" field to be "rsvd_count", to make its meaning
a little more specific.  The "_count" is present in the name mainly
to avoid interpreting it as a reserved (not-to-be-used) field.  This
name also distinguishes it from the "tre_count" field associated
with a channel.

Rename the "used" field to be "used_count", to match the convention
used for reserved TREs.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago net: ipa: use "tre_ring" for all TRE ring local variables
Alex Elder [Mon, 13 Jun 2022 17:17:54 +0000 (12:17 -0500)]
net: ipa: use "tre_ring" for all TRE ring local variables

All local variables that represent event rings are named "ring".

All but two functions that represent a channel's TRE ring with a
local variable use the name "tre_ring".  For consistency, use that
name in the two functions that don't fit the pattern.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago Merge branch 'support-mt7531-on-bpi-r2-pro'
Jakub Kicinski [Wed, 15 Jun 2022 05:35:18 +0000 (22:35 -0700)]
Merge branch 'support-mt7531-on-bpi-r2-pro'

Frank Wunderlich says:

====================
Support mt7531 on BPI-R2 Pro

This series adds support for the mt7531 switch on the Bananapi R2 Pro board.

This board uses port5 of the switch to connect to the gmac0 of the
rk3568 SoC.

Currently the CPU port is hardcoded in the mt7530 driver to port 6.

Compared to v1, the reset patch was dropped as it was not needed, and the
CPU-port changes are completely rewritten based on suggestions/code from
Vladimir Oltean (many thanks for this).
In the DTS patch I only dropped the status property, which was not
needed/ignored by the driver.

Due to the changes I also ran a regression test on mt7623 bpi-r2
(mt7623 soc + mt7530) and bpi-r64 (mt7622 soc + mt7531) with
cpu-port 6. Tests were done directly (ipv4 config on dsa user port)
and with a vlan-aware bridge including a vlan that was tagged outgoing
on the dsa user port.
====================

Link: https://lore.kernel.org/r/20220610170541.8643-1-linux@fw-web.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago arm64: dts: rockchip: Add mt7531 dsa node to BPI-R2-Pro board
Frank Wunderlich [Fri, 10 Jun 2022 17:05:41 +0000 (19:05 +0200)]
arm64: dts: rockchip: Add mt7531 dsa node to BPI-R2-Pro board

Add Device Tree node for mt7531 switch connected to gmac0.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago dt-bindings: net: dsa: make reset optional and add rgmii-mode to mt7531
Frank Wunderlich [Fri, 10 Jun 2022 17:05:40 +0000 (19:05 +0200)]
dt-bindings: net: dsa: make reset optional and add rgmii-mode to mt7531

A board may have no independent reset line, so reset cannot be used
inside the switch driver.

E.g. on the Bananapi-R2 Pro, the switch and gmac are connected to the
same reset line.

A reset should be acquired by only one device/driver. This prevents the
reset from being bound to the switch driver if it is already used for the
gmac. If the reset is only used by the switch driver, it resets the switch
*and* the gmac after the mdio bus comes up, which takes the mdio bus down
again. It takes some time until everything is up again; the switch driver
tries to read from mdio, fails and defers the probe. On the next try the
reset does the same again.

Make reset optional for such boards.

Allow port 5 as cpu-port and phy-mode rgmii for mt7531.

- MT7530 supports RGMII on port 5 and RGMII/TRGMII on port 6.
- MT7531 supports RGMII and SGMII (dual-sgmii) on port 5 and
  SGMII on port 6.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago net: dsa: mt7530: get cpu-port via dp->cpu_dp instead of constant
Frank Wunderlich [Fri, 10 Jun 2022 17:05:39 +0000 (19:05 +0200)]
net: dsa: mt7530: get cpu-port via dp->cpu_dp instead of constant

Replace the last occurrences of the hardcoded cpu-port with the cpu_dp
member of the dsa_port struct.

Now the constant can be dropped.

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago net: dsa: mt7530: rework mt753[01]_setup
Frank Wunderlich [Fri, 10 Jun 2022 17:05:38 +0000 (19:05 +0200)]
net: dsa: mt7530: rework mt753[01]_setup

Enumerate available cpu-ports instead of using hardcoded constant.

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: mt7530: rework mt7530_hw_vlan_{add,del}
Frank Wunderlich [Fri, 10 Jun 2022 17:05:37 +0000 (19:05 +0200)]
net: dsa: mt7530: rework mt7530_hw_vlan_{add,del}

Rework vlan_add/vlan_del functions in preparation for dynamic cpu port.

Currently BIT(MT7530_CPU_PORT) is added to new_members, even though
mt7530_port_vlan_add() will be called on the CPU port too.

Let DSA core decide when to call port_vlan_add for the CPU port, rather
than doing it implicitly.

This allows autonomous forwarding in a certain VLAN without adding br0 to
that VLAN, which avoids flooding the CPU with those packets if software
knows it doesn't need to process them.
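
The membership change described above can be sketched in plain C. This is
only an illustration under assumptions: vlan_add_member() and the port
numbers are hypothetical and are not the driver's code. The point is that
the member mask gains only the bit of the port being added, and the CPU
port bit appears only when the caller adds the CPU port explicitly.

```c
#include <stdint.h>

/*
 * Hypothetical helper: add one port to a VLAN member bitmask.
 * Unlike the old behaviour described in the commit message, no CPU
 * port bit is OR-ed in implicitly.
 */
static uint8_t vlan_add_member(uint8_t members, unsigned int port)
{
    return (uint8_t)(members | (1u << port));
}
```

With this shape, a caller standing in for the DSA core decides whether to
call the helper a second time for the CPU port.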

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: convert binding for mediatek switches
Frank Wunderlich [Fri, 10 Jun 2022 17:05:36 +0000 (19:05 +0200)]
dt-bindings: net: dsa: convert binding for mediatek switches

Convert txt binding to yaml binding for Mediatek switches.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'mlxsw-remove-xm-support'
Jakub Kicinski [Wed, 15 Jun 2022 04:51:07 +0000 (21:51 -0700)]
Merge branch 'mlxsw-remove-xm-support'

Ido Schimmel says:

====================
mlxsw: Remove XM support

The XM was supposed to be an external device connected to the
Spectrum-{2,3} ASICs using dedicated Ethernet ports. Its purpose was to
increase the number of routes that can be offloaded to hardware. This was
achieved by having the ASIC act as a cache that refers cache misses to the
XM where the FIB is stored and LPM lookup is performed.

Testing was done over an emulator and dedicated setups in the lab, but
the product was discontinued before shipping to customers.

Therefore, in order to remove dead code and reduce complexity of the
code base, revert the three patchsets that added XM support.
====================

Link: https://lore.kernel.org/r/20220613132116.2021055-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomlxsw: Revert "Prepare for XM implementation - LPM trees"
Petr Machata [Mon, 13 Jun 2022 13:21:16 +0000 (16:21 +0300)]
mlxsw: Revert "Prepare for XM implementation - LPM trees"

This reverts commit 923ba95ea22d ("Merge branch
'mlxsw-spectrum-prepare-for-xm-implementation-lpm-trees'").

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomlxsw: Revert "Prepare for XM implementation - prefix insertion and removal"
Petr Machata [Mon, 13 Jun 2022 13:21:15 +0000 (16:21 +0300)]
mlxsw: Revert "Prepare for XM implementation - prefix insertion and removal"

This reverts commit e7086213f7b4 ("Merge branch
'mlxsw-spectrum-prepare-for-xm-implementation-prefix-insertion-and-removal'").

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomlxsw: Revert "Introduce initial XM router support"
Petr Machata [Mon, 13 Jun 2022 13:21:14 +0000 (16:21 +0300)]
mlxsw: Revert "Introduce initial XM router support"

This reverts commit 75c2a8fe8e39 ("Merge branch
'mlxsw-introduce-initial-xm-router-support'").

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox...
Jakub Kicinski [Wed, 15 Jun 2022 02:09:39 +0000 (19:09 -0700)]
Merge branch 'mlx5-next' of git://git./linux/kernel/git/mellanox/linux

Saeed Mahameed says:

====================
mlx5-next: updates 2022-06-14

1) Updated HW bits and definitions for upcoming features
 1.1) vport debug counters
 1.2) flow meter
 1.3) Execute ASO action for flow entry
 1.4) enhanced CQE compression

2) Add ICM header-modify-pattern RDMA API

Leon Says
=========

SW steering manipulates packet's header using "modifying header" actions.
Many of these actions do the same operation, but use different data each time.
Currently we create and keep every one of these actions, which use expensive
and limited resources.

Now we introduce a new mechanism - pattern and argument, which splits
a modifying action into two parts:
1. action pattern: contains the operations to be applied on packet's header,
mainly set/add/copy of fields in the packet
2. action data/argument: contains the data to be used by each operation
in the pattern.

This way we reuse same patterns with different arguments to create new
modifying actions, and since many actions share the same operations, we end
up creating a small number of patterns that we keep in a dedicated cache.
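
As a rough illustration of the pattern/argument split, the sketch below
keeps the operations (the pattern) separate from the values (the
arguments). All names here (pattern_step, apply_pattern, OP_SET, OP_ADD)
are hypothetical and chosen for the example; this is not the mlx5 or SW
steering API, and the header is modelled as a plain array of fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation kinds a pattern step can carry. */
enum pattern_op { OP_SET, OP_ADD };

/* One step of a pattern: which operation, on which header field. */
struct pattern_step {
    enum pattern_op op;
    size_t field;
};

/*
 * Apply a pattern to a header, taking the data for step i from args[i].
 * The same pattern can be reused with any number of argument sets.
 */
static void apply_pattern(const struct pattern_step *pat, size_t n,
                          const uint32_t *args, uint32_t *hdr)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (pat[i].op == OP_SET)
            hdr[pat[i].field] = args[i];
        else
            hdr[pat[i].field] += args[i];
    }
}
```

Two modifying actions that differ only in their data then share a single
cached pattern and carry only their own argument arrays.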

These modify header patterns are implemented as new type of ICM memory,
so the following kernel patch series add the support for this new ICM type.
==========

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Add bits and fields to support enhanced CQE compression
  net/mlx5: Remove not used MLX5_CAP_BITS_RW_MASK
  net/mlx5: group fdb cleanup to single function
  net/mlx5: Add support EXECUTE_ASO action for flow entry
  net/mlx5: Add HW definitions of vport debug counters
  net/mlx5: Add IFC bits and enums for flow meter
  RDMA/mlx5: Support handling of modify-header pattern ICM area
  net/mlx5: Manage ICM of type modify-header pattern
  net/mlx5: Introduce header-modify-pattern ICM properties
====================

Link: https://lore.kernel.org/r/20220614184028.51548-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodocs: tls: document the TLS_TX_ZEROCOPY_RO
Jakub Kicinski [Fri, 10 Jun 2022 18:02:12 +0000 (11:02 -0700)]
docs: tls: document the TLS_TX_ZEROCOPY_RO

Add missing documentation for the TLS_TX_ZEROCOPY_RO opt-in.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Link: https://lore.kernel.org/r/20220610180212.110590-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoethtool: Fix and simplify ethtool_convert_link_mode_to_legacy_u32()
Marco Bonelli [Thu, 9 Jun 2022 13:49:01 +0000 (15:49 +0200)]
ethtool: Fix and simplify ethtool_convert_link_mode_to_legacy_u32()

Fix the implementation of ethtool_convert_link_mode_to_legacy_u32(), which
is supposed to return false if src has bits higher than 31 set. The current
implementation uses the complement of bitmap_fill(ext, 32) to test high
bits of src, which is wrong as bitmap_fill() fills _with long granularity_,
and sizeof(long) can be > 4. No users of this function currently check the
return value, so the bug was dormant.

Also remove the check for __ETHTOOL_LINK_MODE_MASK_NBITS > 32, as the enum
ethtool_link_mode_bit_indices contains far beyond 32 values. Using
find_next_bit() to test the src bitmask works regardless of this anyway.
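
A user-space sketch of the intended check, assuming a bitmap stored as an
array of unsigned long. NBITS, test_bit_ul() and convert_to_legacy_u32()
are stand-ins invented for this example, not the kernel helpers. The loop
starts at bit 32 and works per bit, so it does not depend on the size of
long.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define NBITS 96 /* hypothetical stand-in for the link mode mask width */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define NLONGS ((NBITS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Test a single bit in a bitmap made of unsigned long words. */
static bool test_bit_ul(const unsigned long *map, unsigned int bit)
{
    return (map[bit / BITS_PER_LONG] >> (bit % BITS_PER_LONG)) & 1UL;
}

/*
 * Copy the low 32 bits of src into *legacy.
 * Return false if any bit at position 32 or above is set in src.
 */
static bool convert_to_legacy_u32(uint32_t *legacy, const unsigned long *src)
{
    bool fits = true;
    unsigned int bit;

    for (bit = 32; bit < NBITS; bit++) {
        if (test_bit_ul(src, bit)) {
            fits = false;
            break;
        }
    }

    *legacy = (uint32_t)(src[0] & 0xffffffffUL);
    return fits;
}
```

A mask built by filling 32 bits with long granularity would cover bits
32..63 on a 64-bit machine as well, which is why a per-bit scan is used
in this sketch.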

Signed-off-by: Marco Bonelli <marco@mebeim.net>
Link: https://lore.kernel.org/r/20220609134900.11201-1-marco@mebeim.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: phy: fixed_phy: set phy_mask before calling mdiobus_register()
Rasmus Villemoes [Mon, 6 Jun 2022 20:02:08 +0000 (22:02 +0200)]
net: phy: fixed_phy: set phy_mask before calling mdiobus_register()

There's no point probing for phys on this artificial bus, so we can
save a little bit of boot time by telling mdiobus_register() not to do
that.

This doesn't have any functional change, since, at this point,
fixed_mdio_read() returns 0xffff for all addresses/registers, so

  mdiobus_scan() -> get_phy_device() -> get_phy_c22_id()

will return -ENODEV, which is just ignored.
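
The effect of phy_mask can be pictured with a small user-space model.
This is a sketch under assumptions: bus_scan_probe_count() is a made-up
helper, and the bus is assumed to have 32 addresses with one mask bit per
address, where a set bit means "do not probe".

```c
#include <stdint.h>

#define PHY_ADDR_COUNT 32 /* assumed number of addresses on the bus */

/* Count how many addresses a scan would probe for a given phy_mask. */
static int bus_scan_probe_count(uint32_t phy_mask)
{
    int addr, probes = 0;

    for (addr = 0; addr < PHY_ADDR_COUNT; addr++) {
        if (phy_mask & (UINT32_C(1) << addr))
            continue; /* masked: skip the probe entirely */
        probes++;
    }

    return probes;
}
```

With every mask bit set, the scan loop does no probing at all, which is
where the small boot time saving comes from.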

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lore.kernel.org/r/20220606200208.1665417-1-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/mlx5: Add bits and fields to support enhanced CQE compression
Ofer Levi [Wed, 8 Jun 2022 20:04:52 +0000 (13:04 -0700)]
net/mlx5: Add bits and fields to support enhanced CQE compression

Expose ifc bits and add needed structure fields and methods to
support enhanced CQE compression feature.
The enhanced CQE compression feature improves CPU utilization with
better packet latency from NIC to host.

Signed-off-by: Ofer Levi <oferle@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Remove not used MLX5_CAP_BITS_RW_MASK
Shay Drory [Wed, 8 Jun 2022 20:04:51 +0000 (13:04 -0700)]
net/mlx5: Remove not used MLX5_CAP_BITS_RW_MASK

Remove not used MLX5_CAP_BITS_RW_MASK.
While at it, remove CAP_MASK, MLX5_CAP_OFF_CMDIF_CSUM
and MLX5_DEV_CAP_FLAG_*, since MLX5_CAP_BITS_RW_MASK
was their only user.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: group fdb cleanup to single function
Shay Drory [Wed, 8 Jun 2022 20:04:50 +0000 (13:04 -0700)]
net/mlx5: group fdb cleanup to single function

Currently, the allocation of fdb software objects is done in a single
function, as opposed to their cleanup.
Group the cleanup of fdb software objects into a single function.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Add support EXECUTE_ASO action for flow entry
Jianbo Liu [Wed, 8 Jun 2022 20:04:49 +0000 (13:04 -0700)]
net/mlx5: Add support EXECUTE_ASO action for flow entry

Attach flow meter to FTE with object id and index.
Use metadata register C5 to store the packet color meter result.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Add HW definitions of vport debug counters
Saeed Mahameed [Wed, 8 Jun 2022 20:04:48 +0000 (13:04 -0700)]
net/mlx5: Add HW definitions of vport debug counters

total_q_under_processor_handle - number of queues in error state due to an
async error or errored command.

send_queue_priority_update_flow - number of QP/SQ priority/SL update
events.

cq_overrun - number of times CQ entered an error state due to an
overflow.

async_eq_overrun - number of times an EQ mapped to async events was
overrun.

comp_eq_overrun - number of times an EQ mapped to completion events was
overrun.

quota_exceeded_command - number of commands issued and failed due to quota
exceeded.

invalid_command - number of commands issued and failed due to any reason
other than quota exceeded.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Add IFC bits and enums for flow meter
Jianbo Liu [Wed, 8 Jun 2022 20:04:47 +0000 (13:04 -0700)]
net/mlx5: Add IFC bits and enums for flow meter

Add/extend structure layouts and defines for flow meter.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agoRDMA/mlx5: Support handling of modify-header pattern ICM area
Yevgeny Kliteynik [Tue, 7 Jun 2022 12:47:45 +0000 (15:47 +0300)]
RDMA/mlx5: Support handling of modify-header pattern ICM area

Add support for allocating, deallocating and registering an MR of the new
type of ICM area. Support exists only for devices that support sw_owner_v2.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Manage ICM of type modify-header pattern
Yevgeny Kliteynik [Tue, 7 Jun 2022 12:47:44 +0000 (15:47 +0300)]
net/mlx5: Manage ICM of type modify-header pattern

Added support for managing new type of ICM for devices that
support sw_owner_v2.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Introduce header-modify-pattern ICM properties
Yevgeny Kliteynik [Tue, 7 Jun 2022 12:47:43 +0000 (15:47 +0300)]
net/mlx5: Introduce header-modify-pattern ICM properties

Added new fields for device memory capabilities, in order to
support creation of ICM memory for modify header patterns.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet: make __sys_accept4_file() static
Yajun Deng [Fri, 10 Jun 2022 09:10:17 +0000 (17:10 +0800)]
net: make __sys_accept4_file() static

__sys_accept4_file() isn't used outside of the file, make it static.

At the same time, move the file_flags and nofile parameters into
__sys_accept4_file().

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: sk_forced_mem_schedule() optimization
Eric Dumazet [Sat, 11 Jun 2022 03:30:16 +0000 (20:30 -0700)]
tcp: sk_forced_mem_schedule() optimization

sk_memory_allocated_add() has three callers, and returns
to them @memory_allocated.

sk_forced_mem_schedule() is one of them, and ignores
the returned value.

Change sk_memory_allocated_add() to return void.

Change sock_reserve_memory() and __sk_mem_raise_allocated()
to call sk_memory_allocated().

This removes one cache line miss [1] for RPC workloads,
as first skbs in TCP write queue and receive queue go through
sk_forced_mem_schedule().

[1] Cache line holding tcp_memory_allocated.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: smsc95xx: add support for Microchip EVB-LAN8670-USB
Parthiban Veerasooran [Mon, 13 Jun 2022 09:12:07 +0000 (14:42 +0530)]
net: smsc95xx: add support for Microchip EVB-LAN8670-USB

This patch adds support for Microchip's EVB-LAN8670-USB 10BASE-T1S
ethernet device to the existing smsc95xx driver by adding the new
USB VID/PID pairs.

Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonfp: support 48-bit DMA addressing for NFP3800
Yinjun Zhang [Mon, 13 Jun 2022 09:58:31 +0000 (11:58 +0200)]
nfp: support 48-bit DMA addressing for NFP3800

48-bit DMA addressing is supported in NFP3800 HW and implemented
in NFDK firmware, so enable this feature in driver now. Note that
with this change, NFD3 firmware, which doesn't implement 48-bit
DMA, cannot be used for NFP3800 any more.

The RX free list descriptor, used by both NFD3 and NFDK, is also modified
to support 48-bit DMA. That's OK because the top bits are always set to 0
when a 40-bit address is assigned.
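
Why a 40-bit address is harmless in a 48-bit field can be shown with a
small sketch. The layout below (a 32-bit low part and a 16-bit high part)
and the names fl_desc_sketch, fl_desc_set() and fl_desc_get() are
assumptions made for illustration, not the driver's descriptor
definition.

```c
#include <stdint.h>

/* Hypothetical free list descriptor holding a 48-bit DMA address. */
struct fl_desc_sketch {
    uint32_t addr_lo; /* address bits 0..31 */
    uint16_t addr_hi; /* address bits 32..47 */
};

/* Split a DMA address into the two descriptor fields. */
static struct fl_desc_sketch fl_desc_set(uint64_t dma_addr)
{
    struct fl_desc_sketch desc;

    desc.addr_lo = (uint32_t)dma_addr;
    desc.addr_hi = (uint16_t)(dma_addr >> 32);
    return desc;
}

/* Rebuild the DMA address from the two descriptor fields. */
static uint64_t fl_desc_get(struct fl_desc_sketch desc)
{
    return ((uint64_t)desc.addr_hi << 32) | desc.addr_lo;
}
```

For an address that fits in 40 bits, the upper 8 bits of addr_hi come out
as zero, so a consumer that only understands 40-bit addresses sees the
same value as before.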

Based on initial work of Jakub Kicinski <jakub.kicinski@netronome.com>.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'ipa-refactoring'
David S. Miller [Mon, 13 Jun 2022 11:01:58 +0000 (12:01 +0100)]
Merge branch 'ipa-refactoring'

Alex Elder says:

====================
net: ipa: simple refactoring

This series contains some minor code improvements.

The first patch verifies that the configuration is compatible with a
recently-defined limit.  The second and third rename two fields so
they better reflect their use in the code.  The next gets rid of an
empty function by reworking its only caller.

The last two begin to remove the assumption that an event ring is
associated with a single channel.  Eventually we'll support having
multiple channels share an event ring but some more needs to be done
before that can happen.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ipa: derive channel from transaction
Alex Elder [Fri, 10 Jun 2022 15:46:15 +0000 (10:46 -0500)]
net: ipa: derive channel from transaction

In gsi_channel_tx_queued(), we report when a transaction gets passed
to hardware.  Change that function so it takes a transaction rather
than a channel as its argument, and derive the channel from the
transaction.  Rename the function accordingly.

Delete the header comments above the function definition; the ones
above the declaration in "gsi_private.h" should suffice.  In
addition, the comments above gsi_channel_tx_update() do a fine job
of explaining what's going on.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ipa: determine channel from event
Alex Elder [Fri, 10 Jun 2022 15:46:14 +0000 (10:46 -0500)]
net: ipa: determine channel from event

Each event in an event ring describes the TRE whose completion
caused the event.  Currently, every event ring is dedicated to a
single channel, so the channel is easily derived from the event
ring.

An event ring can actually be shared by more than one channel
though, and to distinguish events for one channel from another, the
event structure contains a field indicating which channel the event
is associated with.

In gsi_event_trans(), use the channel ID in an event to determine
which channel the event is for.  This makes the channel pointer now
passed to that function irrelevant; pass the GSI pointer to that
function instead.

And although it shouldn't happen, warn if an event arrives that
records a channel ID that's not in use, or if the event does not
have a transaction associated with it.
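
The lookup described above might look roughly like the sketch below. The
types gsi_sketch, channel_sketch and event_sketch, the channel count and
the helper event_channel() are hypothetical and exist only for this
example. Returning NULL stands in for the "warn and ignore" case.

```c
#include <stddef.h>
#include <stdint.h>

#define CHANNEL_COUNT_SKETCH 4 /* hypothetical number of channels */

struct channel_sketch {
    int in_use;
};

struct gsi_sketch {
    struct channel_sketch channel[CHANNEL_COUNT_SKETCH];
};

/* Only the field needed here: the channel ID recorded in the event. */
struct event_sketch {
    uint8_t chid;
};

/*
 * Find the channel an event belongs to using the ID stored in the
 * event itself. Returns NULL for an out-of-range or unused channel ID.
 */
static struct channel_sketch *event_channel(struct gsi_sketch *gsi,
                                            const struct event_sketch *ev)
{
    if (ev->chid >= CHANNEL_COUNT_SKETCH)
        return NULL;
    if (!gsi->channel[ev->chid].in_use)
        return NULL;

    return &gsi->channel[ev->chid];
}
```

Because the channel comes from the event rather than from the event
ring, the same code keeps working once several channels share one ring.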

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ipa: simplify endpoint transaction completion
Alex Elder [Fri, 10 Jun 2022 15:46:13 +0000 (10:46 -0500)]
net: ipa: simplify endpoint transaction completion

When a GSI transaction completes, ipa_endpoint_trans_complete() is
eventually called.  That handles TX and RX completions separately,
but ipa_endpoint_tx_complete() is a no-op.

Instead, have ipa_endpoint_trans_complete() return immediately for a
TX transaction, and incorporate code from ipa_endpoint_rx_complete()
to handle RX transactions.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ipa: rename endpoint->trans_tre_max
Alex Elder [Fri, 10 Jun 2022 15:46:12 +0000 (10:46 -0500)]
net: ipa: rename endpoint->trans_tre_max

The trans_tre_max field of the IPA endpoint structure is only used
to limit the number of fragments allowed for an SKB being prepared
for transmission.  Recognizing that, rename the field skb_frag_max,
and reduce its value by 1.
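
One way to read the "reduce its value by 1" step, stated as an assumption
rather than taken from the patch: an SKB needs one TRE for its linear
data plus one per fragment, so the fragment limit is one less than the
TRE limit. The helper names below are hypothetical.

```c
#include <stdbool.h>

/* Assumed relation: one TRE is reserved for the SKB's linear data. */
static unsigned int skb_frag_max_from_tre_max(unsigned int trans_tre_max)
{
    return trans_tre_max - 1;
}

/* True if the SKB has more fragments than a transaction can carry. */
static bool skb_too_fragmented(unsigned int nr_frags,
                               unsigned int skb_frag_max)
{
    return nr_frags > skb_frag_max;
}
```

Under that assumption, the comparison on the transmit path needs no
"1 +" adjustment, since the limit already accounts for the linear part.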

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ipa: rename channel->tlv_count
Alex Elder [Fri, 10 Jun 2022 15:46:11 +0000 (10:46 -0500)]
net: ipa: rename channel->tlv_count

Each GSI channel has a TLV FIFO of a certain size, specified in the
configuration data for an AP channel.  That size dictates the
maximum number of TREs that are allowed in a single transaction.

The only way that value is used after initialization is as a limit
on the number of TREs in a transaction; calling it "tlv_count"
isn't helpful, and in fact gsi_channel_trans_tre_max() exists to
sort of abstract it.

Instead, rename the channel->tlv_count field trans_tre_max, and get
rid of the helper function.  Update a couple of comments as well.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ipa: verify command channel TLV count
Alex Elder [Fri, 10 Jun 2022 15:46:10 +0000 (10:46 -0500)]
net: ipa: verify command channel TLV count

In commit 8797972afff3d ("net: ipa: remove command info pool"), the
maximum number of IPA commands that would be sent in a single
transaction was defined.  That number can't exceed the size of the
TLV FIFO on the command channel, and we can check that at runtime.

To add this check, pass a new flag to gsi_channel_data_valid() to
indicate the channel being checked is being used for IPA commands.
Knowing that we can also verify the channel direction is correct.

Use a new local variable that refers to the command-specific portion
of the data being checked.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonfp: flower: support to offload pedit of IPv6 flowinfo fields
Yinjun Zhang [Thu, 9 Jun 2022 08:01:36 +0000 (10:01 +0200)]
nfp: flower: support to offload pedit of IPv6 flowinfo fields

Previously the traffic class field was ignored, although the firmware
already supports pedit of the flowinfo fields, including traffic
class and flow label. Add it back now.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Link: https://lore.kernel.org/r/20220609080136.151830-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoethernet: Remove vf rate limit check for drivers
Bin Chen [Thu, 9 Jun 2022 08:47:17 +0000 (10:47 +0200)]
ethernet: Remove vf rate limit check for drivers

The commit a14857c27a50 ("rtnetlink: verify rate parameters for calls to
ndo_set_vf_rate") has been merged to master, so we can remove the
now-duplicate checks in drivers.

Signed-off-by: Bin Chen <bin.chen@corigine.com>
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220609084717.155154-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Sat, 11 Jun 2022 05:07:31 +0000 (22:07 -0700)]
Merge branch '10GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
10GbE Intel Wired LAN Driver Updates 2022-06-09

Maximilian Heyne adds reporting of VF statistics on ixgbe via iproute2
interface.

Kai-Heng Feng removes duplicate defines from igb.

Jiaqing Zhao fixes typos in e1000, ixgb, and ixgbe drivers.

Julia Lawall fixes typos for fm10k, ixgbe, and ice drivers.

* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  drivers/net/ethernet/intel: fix typos in comments
  ixgbe: Fix typos in comments
  ixgb: Fix typos in comments
  e1000: Fix typos in comments
  igb: Remove duplicate defines
  drivers, ixgbe: export vf statistics
====================

Link: https://lore.kernel.org/r/20220609171257.2727150-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'net-reduce-tcp_memory_allocated-inflation'
Jakub Kicinski [Fri, 10 Jun 2022 23:21:39 +0000 (16:21 -0700)]
Merge branch 'net-reduce-tcp_memory_allocated-inflation'

Eric Dumazet says:

====================
net: reduce tcp_memory_allocated inflation

Hosts with a lot of sockets tend to hit so-called TCP memory pressure,
leading to very bad TCP performance and/or OOM.

The problem is that some TCP sockets can hold up to 2MB of 'forward
allocations' in their per-socket cache (sk->sk_forward_alloc),
and there is no mechanism to make them relinquish their share
under mem pressure.
Their share is reclaimed only under some potentially rare events,
one socket at a time.

In this series, I implemented a per-cpu cache instead of a per-socket one.

Each CPU has a +1/-1 MB (256 pages on x86) forward alloc cache, in order
to not dirty tcp_memory_allocated shared cache line too often.

We keep sk->sk_forward_alloc values as small as possible, to meet
memcg page granularity constraint.

Note that memcg already has a per-cpu cache, although MEMCG_CHARGE_BATCH
is defined to 32 pages, which seems a bit small.

Note that while this cover letter mentions TCP, this work is generic
and supports TCP, UDP, DECNET, SCTP.
====================

Link: https://lore.kernel.org/r/20220609063412.2205738-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: unexport __sk_mem_{raise|reduce}_allocated
Eric Dumazet [Thu, 9 Jun 2022 06:34:12 +0000 (23:34 -0700)]
net: unexport __sk_mem_{raise|reduce}_allocated

These two helpers are only used from core networking.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: keep sk->sk_forward_alloc as small as possible
Eric Dumazet [Thu, 9 Jun 2022 06:34:11 +0000 (23:34 -0700)]
net: keep sk->sk_forward_alloc as small as possible

Currently, tcp_memory_allocated can hit tcp_mem[] limits quite fast.

Each TCP socket can forward allocate up to 2 MB of memory, even after
the flow becomes less active.

10,000 sockets can have reserved 20 GB of memory,
and we have no shrinker in place to reclaim that.

Instead of trying to reclaim the extra allocations in some places,
just keep sk->sk_forward_alloc values as small as possible.

This should not impact performance too much now we have per-cpu
reserves: Changes to tcp_memory_allocated should not be too frequent.

For sockets not using SO_RESERVE_MEM:
 - idle sockets (no packets in tx/rx queues) have zero forward alloc.
 - non idle sockets have a forward alloc smaller than one page.

Note:

 - Removal of SK_RECLAIM_CHUNK and SK_RECLAIM_THRESHOLD
   is left to MPTCP maintainers as a follow up.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fix sk_wmem_schedule() and sk_rmem_schedule() errors
Eric Dumazet [Thu, 9 Jun 2022 06:34:10 +0000 (23:34 -0700)]
net: fix sk_wmem_schedule() and sk_rmem_schedule() errors

If sk->sk_forward_alloc is 150000, and we need to schedule 150001 bytes,
we want to allocate 1 byte more (rounded up to one page),
instead of 150001 :/
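
The arithmetic can be written out as a short sketch. The helper name
bytes_to_schedule() and the 4096-byte page size are assumptions for this
example. Only the shortfall beyond what is already reserved is requested,
rounded up to whole pages.

```c
#define PAGE_SIZE_SKETCH 4096 /* assumed page size */

/*
 * Bytes to request from the protocol-wide accounting when 'size' bytes
 * are needed and 'forward_alloc' bytes are already reserved.
 */
static int bytes_to_schedule(int forward_alloc, int size)
{
    int delta = size - forward_alloc;

    if (delta <= 0)
        return 0; /* the existing reserve already covers the request */

    /* Round the shortfall up to a whole number of pages. */
    return ((delta + PAGE_SIZE_SKETCH - 1) / PAGE_SIZE_SKETCH) *
           PAGE_SIZE_SKETCH;
}
```

For the numbers in the message, the shortfall is 1 byte, so one page
(4096 bytes) is requested. Requesting the full 150001 bytes would round
up to 37 pages (151552 bytes).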

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: implement per-cpu reserves for memory_allocated
Eric Dumazet [Thu, 9 Jun 2022 06:34:09 +0000 (23:34 -0700)]
net: implement per-cpu reserves for memory_allocated

We plan to keep sk->sk_forward_alloc as small as possible
in future patches.

This means we are going to call sk_memory_allocated_add()
and sk_memory_allocated_sub() more often.

Implement a per-cpu cache of +1/-1 MB, to reduce the number
of changes to sk->sk_prot->memory_allocated, which
would otherwise cause false sharing.
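
A single-threaded user-space model of the idea, counting in pages. This
is a sketch under assumptions: the array per_cpu_reserve[], the explicit
cpu argument and the helper names are made up for the example, and real
per-cpu code would need preemption and atomicity handling that is left
out here. Each CPU accumulates changes locally and touches the shared
counter only once its local reserve reaches +/- 1 MB (256 pages of 4 KB).

```c
#define NR_CPUS_SKETCH 4
#define PCPU_RESERVE_PAGES 256 /* 1 MB in 4 KB pages */

static long shared_memory_allocated;          /* the contended counter */
static int per_cpu_reserve[NR_CPUS_SKETCH];   /* local, uncontended */

/* Move a CPU's local reserve into the shared counter. */
static void reserve_flush(int cpu)
{
    shared_memory_allocated += per_cpu_reserve[cpu];
    per_cpu_reserve[cpu] = 0;
}

static void mem_allocated_add(int cpu, int pages)
{
    per_cpu_reserve[cpu] += pages;
    if (per_cpu_reserve[cpu] >= PCPU_RESERVE_PAGES)
        reserve_flush(cpu);
}

static void mem_allocated_sub(int cpu, int pages)
{
    per_cpu_reserve[cpu] -= pages;
    if (per_cpu_reserve[cpu] <= -PCPU_RESERVE_PAGES)
        reserve_flush(cpu);
}
```

The shared counter can therefore lag the true total by up to 1 MB per
CPU in either direction, which is the price paid for fewer writes to the
shared cache line.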

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: add per_cpu_fw_alloc field to struct proto
Eric Dumazet [Thu, 9 Jun 2022 06:34:08 +0000 (23:34 -0700)]
net: add per_cpu_fw_alloc field to struct proto

Each protocol having a ->memory_allocated pointer gets a corresponding
per-cpu reserve, that following patches will use.

Instead of having reserved bytes per socket,
we want to have per-cpu reserves.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: remove SK_MEM_QUANTUM and SK_MEM_QUANTUM_SHIFT
Eric Dumazet [Thu, 9 Jun 2022 06:34:07 +0000 (23:34 -0700)]
net: remove SK_MEM_QUANTUM and SK_MEM_QUANTUM_SHIFT

Due to memcg interface, SK_MEM_QUANTUM is effectively PAGE_SIZE.

This might change in the future, but it seems better to avoid the
confusion.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "net: set SK_MEM_QUANTUM to 4096"
Eric Dumazet [Thu, 9 Jun 2022 06:34:06 +0000 (23:34 -0700)]
Revert "net: set SK_MEM_QUANTUM to 4096"

This reverts commit bd68a2a854ad5a85f0c8d0a9c8048ca3f6391efb.

This change broke memcg on arches with PAGE_SIZE != 4096.

Later, commit 2bb2f5fb21b04 ("net: add new socket option SO_RESERVE_MEM")
also assumed PAGE_SIZE == SK_MEM_QUANTUM.

Following patches in the series will greatly reduce the over-allocation
problem.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>