platform/kernel/linux-rpi.git
3 years agonet/mlx5: Bridge, dynamic entry ageing
Vlad Buslov [Tue, 30 Mar 2021 17:08:57 +0000 (20:08 +0300)]
net/mlx5: Bridge, dynamic entry ageing

Dynamic FDB entries require capability to age out unused entries. Such
entries are either aged out by kernel software bridge implementation or by
hardware switch that offloaded them (and notified the kernel to mark them
as SWITCHDEV_FDB_ADD_TO_BRIDGE). Leaving ageing to kernel bridge would
result it deleting offloaded dynamic FDB entries every ageing_time period
due to packets being processed by hardware and, consecutively, 'used'
timestamp for FDB entry not being updated. However, since hardware doesn't
support ageing, software solution inside the driver is required.

In order to emulate hardware ageing in driver, extend bridge FDB ingress
flows with counter and create delayed br_offloads->update_work task on
bridge offloads workqueue. Run the task every second, update 'used'
timestamp in software bridge dynamic entry by sending
SWITCHDEV_FDB_ADD_TO_BRIDGE for the entry, if it flow hardware counter
lastuse field was changed since last update. If lastuse wasn't changed for
ageing_time period, then delete the FDB entry and notify kernel bridge by
sending SWITCHDEV_FDB_DEL_TO_BRIDGE notification.

Register blocking switchdev notifier callback and handle attribute set
SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME event to allow user to dynamically
configure bridge FDB entry ageing timeout. Save the value per-bridge in
struct mlx5_esw_bridge. Silently ignore
SWITCHDEV_ATTR_ID_PORT_{PRE_}BRIDGE_FLAGS switchdev event since mlx5 bridge
implementation relies on software bridge for implementing necessary
behavior for all of these flags.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Bridge, handle FDB events
Vlad Buslov [Mon, 7 Jun 2021 14:42:49 +0000 (17:42 +0300)]
net/mlx5: Bridge, handle FDB events

Hardware supported by mlx5 driver doesn't provide learning and requires the
driver to emulate all switch-like behavior in software. As such, all
packets by default go through miss path, appear on representor and get to
software bridge, if it is the upper device of the representor. This causes
bridge to process packet in software, learn the MAC address to FDB and send
SWITCHDEV_FDB_ADD_TO_DEVICE event to all subscribers.

In order to offload FDB entries in mlx5, register switchdev notifier
callback and implement support for both 'added_by_user' and dynamic FDB
entry SWITCHDEV_FDB_ADD_TO_DEVICE events asynchronously using new
mlx5_esw_bridge_offloads->wq ordered workqueue. In workqueue callback
offload the ingress rule (matching FDB entry MAC as packet source MAC) and
egress table rule (matching FDB entry MAC as destination MAC). For ingress
table rule also match source vport to ensure that only traffic coming from
expected bridge port is matched by offloaded rule. Save all the relevant
FDB entry data in struct mlx5_esw_bridge_fdb_entry instance and insert the
instance in new mlx5_esw_bridge->fdb_list list (for traversing all entries
by software ageing implementation in following patch) and in new
mlx5_esw_bridge->fdb_ht hash table for fast retrieval. Notify the bridge
that FDB entry has been offloaded by sending SWITCHDEV_FDB_OFFLOADED
notification.

Delete FDB entry on reception of SWITCHDEV_FDB_DEL_TO_DEVICE event.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Bridge, add offload infrastructure
Vlad Buslov [Fri, 2 Apr 2021 12:57:02 +0000 (15:57 +0300)]
net/mlx5: Bridge, add offload infrastructure

Create new files bridge.{c|h} in en/rep directory that implement bridge
interaction with representor netdevices and handle required
events/notifications, bridge.{c|h} in esw directory that implement all
necessary eswitch offloading infrastructure and works on vport/eswitch
level. Provide new kconfig MLX5_BRIDGE which is automatically selected when
both kernel bridge and mlx5 eswitch configs are enabled.

Provide basic infrastructure for bridge offloads:

- struct mlx5_esw_bridge_offloads - per-eswitch bridge offload structure
that encapsulates generic bridge-offloads data (notifier blocks, ingress
flow table/group, etc.) that is created/deleted on enable/disable eswitch
offloads.

- struct mlx5_esw_bridge - per-bridge structure that encapsulates
per-bridge data (reference counter, FDB, egress flow table/group, etc.)
that is created when first eswitch represetor is attached to new bridge and
deleted when last representor is removed from the bridge as a result of
NETDEV_CHANGEUPPER event.

The bridge tables are created with new priority FDB_BR_OFFLOAD in FDB
namespace. The new priority is between tc-miss and slow path priorities.
Priority consist of two levels: the ingress table that is global per
eswitch and matches incoming packets by src_mac/vid and redirects them to
next level (egress table) that is chosen according to ingress port bridge
membership and matches on dst_mac/vid in order to redirect packet to vport
according to the following diagram:

                +
                |
      +---------v----------+
      |                    |
      |   FDB_TC_OFFLOAD   |
      |                    |
      +---------+----------+
                |
                |
      +---------v----------+
      |                    |
      |   FDB_FT_OFFLOAD   |
      |                    |
      +---------+----------+
                |
                |
      +---------v----------+
      |                    |
      |    FDB_TC_MISS     |
      |                    |
      +---------+----------+
                |
+--------------------------------------+
|               |                      |
|        +------+                      |
|        |                             |
| +------v--------+   FDB_BR_OFFLOAD   |
| | INGRESS_TABLE |                    |
| +------+---+----+                    |
|        |   |      match              |
|        |   +---------+               |
|        |             |               |    +-------+
|        |     +-------v-------+ match |    |       |
|        |     | EGRESS_TABLE  +------------> vport |
|        |     +-------+-------+       |    |       |
|        |             |               |    +-------+
|        |    miss     |               |
|        +------+------+               |
|               |                      |
+--------------------------------------+
                |
                |
      +---------v----------+
      |                    |
      |   FDB_SLOW_PATH    |
      |                    |
      +---------+----------+
                |
                v

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5e: Refactor mlx5e_eswitch_{*}rep() helpers
Vlad Buslov [Wed, 21 Apr 2021 18:52:43 +0000 (21:52 +0300)]
net/mlx5e: Refactor mlx5e_eswitch_{*}rep() helpers

Change the helper to functions to accept constant pointer to struct
net_device. This is necessary for following patches in series that pass
mlx5e_eswitch_rep() as a callback to kernel bridge infrastructure code.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Create TC-miss priority and table
Vlad Buslov [Thu, 4 Mar 2021 11:09:53 +0000 (13:09 +0200)]
net/mlx5: Create TC-miss priority and table

In order to adhere to kernel software datapath model bridge offloads must
come after TC and NF FDBs. Following patches in this series add new FDB
priority for bridge after FDB_FT_OFFLOAD. However, since netfilter offload
is implemented with unmanaged tables, its miss path is not automatically
connected to next priority and requires the code to manually connect with
slow table. To keep bridge offloads encapsulated and not mix it with
eswitch offloads, create a new FDB_TC_MISS priority between FDB_FT_OFFLOAD
and FDB_SLOW_PATH:

          +
          |
+---------v----------+
|                    |
|   FDB_TC_OFFLOAD   |
|                    |
+---------+----------+
          |
          |
          |
+---------v----------+
|                    |
|   FDB_FT_OFFLOAD   |
|                    |
+---------+----------+
          |
          |
          |
+---------v----------+
|                    |
|    FDB_TC_MISS     |
|                    |
+---------+----------+
          |
          |
          |
+---------v----------+
|                    |
|   FDB_SLOW_PATH    |
|                    |
+---------+----------+
          |
          v

Initialize the new priority with single default empty managed table and use
the table as TC/NF miss patch instead of slow table. This approach allows
bridge offloads to be created as new FDB namespace priority between
FDB_TC_MISS and FDB_SLOW_PATH without exposing its internal tables to any
other modules since miss path of managed TC-miss table is automatically
wired to next priority.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: DR, Support EMD tag in modify header for STEv1
Yevgeny Kliteynik [Sun, 14 Mar 2021 01:08:28 +0000 (03:08 +0200)]
net/mlx5: DR, Support EMD tag in modify header for STEv1

Add support for EMD tag in modify header set/copy actions
on device that supports STEv1.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: DR, Added support for INSERT_HEADER reformat type
Yevgeny Kliteynik [Thu, 28 Jan 2021 20:12:07 +0000 (22:12 +0200)]
net/mlx5: DR, Added support for INSERT_HEADER reformat type

Add support for INSERT_HEADER packet reformat context type

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Added new parameters to reformat context
Yevgeny Kliteynik [Tue, 9 Mar 2021 01:30:44 +0000 (03:30 +0200)]
net/mlx5: Added new parameters to reformat context

Adding new reformat context type (INSERT_HEADER) requires adding two new
parameters to reformat context - reformat_param_0 and reformat_param_1.
As defined by HW spec, these parameters have different meaning for
different reformat context type.

The first parameter (reformat_param_0) is not new to HW spec, but it
wasn't used by any of the supported reformats. The second parameter
(reformat_param_1) is new to the HW spec - it was added to allow
supporting INSERT_HEADER.

For NSERT_HEADER, reformat_param_0 indicates the header used to
reference the location of the inserted header, and reformat_param_1
indicates the offset of the inserted header from the reference point
defined by reformat_param_0.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: DR, Allow encap action for RX for supporting devices
Yevgeny Kliteynik [Wed, 10 Mar 2021 02:38:06 +0000 (04:38 +0200)]
net/mlx5: DR, Allow encap action for RX for supporting devices

Encap actions on RX flow were not supported on older devices.
However, this is no longer the case in devices that support STEv1.
This patch adds support for encap l3/l2 on RX flow for supported
devices: update actions state machine by adding the newely supported
transitions and add the required support in STEv0/1 files.
The new transitions that are supported are:
 - from decap/modify-header/pop-vlan to encap
 - from encap to termination table

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: DR, Split reformat state to Encap and Decap
Yevgeny Kliteynik [Wed, 10 Mar 2021 02:12:28 +0000 (04:12 +0200)]
net/mlx5: DR, Split reformat state to Encap and Decap

Split single reformat state into two separate states for encap and decap.
This will allow adding actions to the specific domain, such as encap on RX.

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: mlx5_ifc support for header insert/remove
Yevgeny Kliteynik [Tue, 9 Mar 2021 01:29:16 +0000 (03:29 +0200)]
net/mlx5: mlx5_ifc support for header insert/remove

Add support for HCA caps 2 that contains capabilities for the new
insert/remove header actions.

Added the required definitions for supporting the new reformat type:
added packet reformat parameters, reformat anchors and definitions
to allow copy/set into the inserted EMD (Embedded MetaData) tag.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agoMerge branch 'ipa-mem-1'
David S. Miller [Wed, 9 Jun 2021 22:59:34 +0000 (15:59 -0700)]
Merge branch 'ipa-mem-1'

Alex Elder says:

====================
net: ipa: memory region rework, part 1

This is the first portion of a very long series of patches that has
been split in two.  Once these patches are accepted, I'll post the
remaining patches.

The combined series reworks the way memory regions are defined in
the configuration data, and in the process solidifies code that
ensures configurations are valid.

In this portion (part 1), most of the focus is on improving
validation of code.  This validation is now done unconditionally
(something I promised Leon Romanovsky I would work on).  Validation
will occur earlier than before, catching configuration problems as
early as possible and permitting the rest of the driver to avoid
needing to do some error checking.  There will now be checks to
ensure all defined regions are supported by the hardware, that
required regions are all defined, and that there are no duplicate
regions.

The second portion (part 2) is mainly a set of small but pervasive
changes whose result is to have the memory region array not be
indexed by region ID.  I'll provide further explanation when I post
that series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: use bitmap to check for missing regions
Alex Elder [Wed, 9 Jun 2021 22:35:03 +0000 (17:35 -0500)]
net: ipa: use bitmap to check for missing regions

In ipa_mem_valid(), wait until regions have been marked in the memory
region bitmap, and check all that are not found there to ensure they
are not required.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: flag duplicate memory regions
Alex Elder [Wed, 9 Jun 2021 22:35:02 +0000 (17:35 -0500)]
net: ipa: flag duplicate memory regions

Add a test in ipa_mem_valid() to ensure no memory region is defined
more than once, using a bitmap to record each defined memory region.
Skip over undefined regions when checking (we can have any number of
those).

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: validate memory regions based on version
Alex Elder [Wed, 9 Jun 2021 22:35:01 +0000 (17:35 -0500)]
net: ipa: validate memory regions based on version

Introduce ipa_mem_id_valid(), and use it to check defined memory
regions to ensure they are valid for a given version of IPA.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: introduce ipa_mem_id_optional()
Alex Elder [Wed, 9 Jun 2021 22:35:00 +0000 (17:35 -0500)]
net: ipa: introduce ipa_mem_id_optional()

Introduce a new function that indicates whether a given memory
region is required for a given version of IPA hardware.  Use it to
verify that all required regions are present during initialization.

Reorder the definitions of the memory region IDs to be based on
the version in which they're first defined.  Use "+" rather than
"and above" where defining the IPA versions in which memory IDs are
used, and indicate which regions are optional (many are not).

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: pass memory configuration data to ipa_mem_valid()
Alex Elder [Wed, 9 Jun 2021 22:34:59 +0000 (17:34 -0500)]
net: ipa: pass memory configuration data to ipa_mem_valid()

Pass the memory configuration data array to ipa_mem_valid() for
validation, and use that rather than assuming it's already been
recorded in the IPA structure.  Move the memory data array size
check into ipa_mem_valid().

Call ipa_mem_valid() early in ipa_mem_init(), and only proceed with
assigning the memory array pointer and size if it is found to be
valid.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: validate memory regions at init time
Alex Elder [Wed, 9 Jun 2021 22:34:58 +0000 (17:34 -0500)]
net: ipa: validate memory regions at init time

Move the memory region validation check so it happens earlier when
initializing the driver, at init time rather than config time (i.e.,
before access to hardware is required).

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: separate region range check from other validation
Alex Elder [Wed, 9 Jun 2021 22:34:57 +0000 (17:34 -0500)]
net: ipa: separate region range check from other validation

The only thing done by ipa_mem_valid_one() that requires hardware
access is the check for whether all regions fit within the size of
IPA local memory specified by an IPA register.

Introduce ipa_mem_size_valid() to implement this verification and
stop doing so in ipa_mem_valid_one().  Call the new function from
ipa_mem_config() (which is also the caller of ipa_mem_valid()).

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: separate memory validation from initialization
Alex Elder [Wed, 9 Jun 2021 22:34:56 +0000 (17:34 -0500)]
net: ipa: separate memory validation from initialization

Currently, memory regions are validated in the loop that initializes
them.  Instead, validate them separately.

Rename ipa_mem_valid() to be ipa_mem_valid_one().  Define a *new*
function named ipa_mem_valid() that performs validation of the array
of memory regions provided.  This function calls ipa_mem_valid_one()
for each region in turn.

Skip validation for any "empty" region descriptors, which have zero
size and are not preceded by any canary values.  Issue a warning for
such descriptors if the offset is non-zero.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: validate memory regions unconditionally
Alex Elder [Wed, 9 Jun 2021 22:34:55 +0000 (17:34 -0500)]
net: ipa: validate memory regions unconditionally

Do memory region descriptor validation unconditionally, rather than
having it depend on IPA_VALIDATION being defined.

Pass the address of a memory region descriptor rather than a memory
ID to ipa_mem_valid().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: store memory region id in descriptor
Alex Elder [Wed, 9 Jun 2021 22:34:54 +0000 (17:34 -0500)]
net: ipa: store memory region id in descriptor

Store the memory region ID in the memory descriptor structure.  This
is a move toward *not* indexing the array by the ID, but for now we
must still specify those index values.  Define an explicitly
undefined region ID, value 0, so uninitialized entries in the array
won't use an otherwise valid ID.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: define IPA_MEM_END_MARKER
Alex Elder [Wed, 9 Jun 2021 22:34:53 +0000 (17:34 -0500)]
net: ipa: define IPA_MEM_END_MARKER

Define a new pseudo memory region identifer that specifies the
offset at the end of IPA resident memory.  Use it instead of
IPA_MEM_UC_EVENT_RING in places where the size of that region was
defined to be 0.

The size of the IPA_MEM_END_MARKER pseudo region must be zero.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: sja1105: Fix assigned yet unused return code rc
Colin Ian King [Wed, 9 Jun 2021 17:43:53 +0000 (18:43 +0100)]
net: dsa: sja1105: Fix assigned yet unused return code rc

The return code variable rc is being set to return error values in two
places in sja1105_mdiobus_base_tx_register and yet it is not being
returned, the function always returns 0 instead. Fix this by replacing
the return 0 with the return code rc.

Addresses-Coverity: ("Unused value")
Fixes: 5a8f09748ee7 ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agostmmac: prefetch right address
Matteo Croce [Wed, 9 Jun 2021 17:23:03 +0000 (19:23 +0200)]
stmmac: prefetch right address

To support XDP, a headroom is prepended to the packet data.
Consider this offset when doing a prefetch.

Fixes: da5ec7f22a0f ("net: stmmac: refactor stmmac_init_rx_buffers for stmmac_reinit_rx_buffers")
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: thermal: Fix null dereference of NULL temperature parameter
Colin Ian King [Wed, 9 Jun 2021 17:56:57 +0000 (18:56 +0100)]
mlxsw: thermal: Fix null dereference of NULL temperature parameter

The call to mlxsw_thermal_module_temp_and_thresholds_get passes a NULL
pointer for the temperature and this can be dereferenced in this function
if the mlxsw_reg_query call fails.  The simplist fix is to pass the
address of dummy temperature variable instead of a NULL pointer.

Addresses-Coverity: ("Explicit null dereferenced")
Fixes: 72a64c2fe9d8 ("mlxsw: thermal: Read module temperature thresholds using MTMP register")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: rmnet: Always subtract MAP header
Kristian Evensen [Wed, 9 Jun 2021 14:32:49 +0000 (16:32 +0200)]
net: ethernet: rmnet: Always subtract MAP header

Commit e1d9a90a9bfd ("net: ethernet: rmnet: Support for ingress MAPv5
checksum offload") broke ingress handling for devices where
RMNET_FLAGS_INGRESS_MAP_CKSUMV5 or RMNET_FLAGS_INGRESS_MAP_CKSUMV4 are
not set. Unless either of these flags are set, the MAP header is not
removed. This commit restores the original logic by ensuring that the
MAP header is removed for all MAP packets.

Fixes: e1d9a90a9bfd ("net: ethernet: rmnet: Support for ingress MAPv5 checksum offload")
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: make symbol 'ena_alloc_map_page' static
Wei Yongjun [Wed, 9 Jun 2021 14:25:06 +0000 (14:25 +0000)]
net: ena: make symbol 'ena_alloc_map_page' static

The sparse tool complains as follows:

drivers/net/ethernet/amazon/ena/ena_netdev.c:978:13: warning:
 symbol 'ena_alloc_map_page' was not declared. Should it be static?

This symbol is not used outside of ena_netdev.c, so marks it static.

Fixes: 947c54c395cb ("net: ena: Use dev_alloc() in RX buffer allocation")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: realtek: net: Fix less than zero comparison of a u16
Colin Ian King [Wed, 9 Jun 2021 17:17:48 +0000 (18:17 +0100)]
net: phy: realtek: net: Fix less than zero comparison of a u16

The comparisons of the u16 values priv->phycr1 and priv->phycr2 to less
than zero always false because they are unsigned. Fix this by using an
int for the assignment and less than zero check.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 0a4355c2b7f8 ("net: phy: realtek: add dt property to disable CLKOUT clock")
Fixes: d90db36a9e74 ("net: phy: realtek: add dt property to enable ALDPS mode")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: Fix missing { } around two statements in an if statement
Colin Ian King [Wed, 9 Jun 2021 17:05:12 +0000 (18:05 +0100)]
net: stmmac: Fix missing { } around two statements in an if statement

There are missing { } around a block of code on an if statement. Fix this
by adding them in.

Addresses-Coverity: ("Nesting level does not match indentation")
Fixes: 46682cb86a37 ("net: stmmac: enable Intel mGbE 2.5Gbps link speed")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: ti: cpsw-phy-sel: Use devm_platform_ioremap_resource_byname()
Yang Yingliang [Wed, 9 Jun 2021 13:51:38 +0000 (21:51 +0800)]
net: ethernet: ti: cpsw-phy-sel: Use devm_platform_ioremap_resource_byname()

Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'mvpp2-prefetch'
David S. Miller [Wed, 9 Jun 2021 22:26:50 +0000 (15:26 -0700)]
Merge branch 'mvpp2-prefetch'

Matteo Croce says:

====================
mvpp2: prefetch data early

These two patches prefetch some data from RAM so to reduce stall
and speedup the packet processing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomvpp2: prefetch page
Matteo Croce [Wed, 9 Jun 2021 13:47:14 +0000 (15:47 +0200)]
mvpp2: prefetch page

Most of the time during the RX is caused by the compound_head() call
done at the end of the RX loop:

       â”‚     build_skb():
       [...]
       â”‚     static inline struct page *compound_head(struct page *page)
       â”‚     {
       â”‚     unsigned long head = READ_ONCE(page->compound_head);
 65.23 â”‚       ldr  x2, [x1, #8]

Prefetch the page struct as soon as possible, to speedup the RX path
noticeabily by a ~3-4% packet rate in a drop test.

       â”‚     build_skb():
       [...]
       â”‚     static inline struct page *compound_head(struct page *page)
       â”‚     {
       â”‚     unsigned long head = READ_ONCE(page->compound_head);
 17.92 â”‚       ldr  x2, [x1, #8]

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomvpp2: prefetch right address
Matteo Croce [Wed, 9 Jun 2021 13:47:13 +0000 (15:47 +0200)]
mvpp2: prefetch right address

In the RX buffer, the received data starts after a headroom used to
align the IP header and to allow prepending headers efficiently.
The prefetch() should take this into account, and prefetch from
the very start of the received data.

We can see that ether_addr_equal_64bits(), which is the first function
to access the data, drops from the top of the perf top output.

prefetch(data):

Overhead  Shared Object     Symbol
  11.64%  [kernel]          [k] eth_type_trans

prefetch(data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM):

Overhead  Shared Object     Symbol
  13.42%  [kernel]          [k] build_skb
  10.35%  [mvpp2]           [k] mvpp2_rx
   9.35%  [kernel]          [k] __netif_receive_skb_core
   8.24%  [kernel]          [k] kmem_cache_free
   7.97%  [kernel]          [k] dev_gro_receive
   7.68%  [kernel]          [k] page_pool_put_page
   7.32%  [kernel]          [k] kmem_cache_alloc
   7.09%  [mvpp2]           [k] mvpp2_bm_pool_put
   3.36%  [kernel]          [k] eth_type_trans

Also, move the eth_type_trans() call a bit down, to give the RAM more
time to prefetch the data.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: ti: am65-cpts: Use devm_platform_ioremap_resource_byname()
Yang Yingliang [Wed, 9 Jun 2021 13:45:37 +0000 (21:45 +0800)]
net: ethernet: ti: am65-cpts: Use devm_platform_ioremap_resource_byname()

Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: Use devm_platform_ioremap_resource_byname()
Yang Yingliang [Wed, 9 Jun 2021 13:36:55 +0000 (21:36 +0800)]
net: stmmac: Use devm_platform_ioremap_resource_byname()

Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: sgi: ioc3-eth: check return value after calling platform_get_resource()
Yang Yingliang [Wed, 9 Jun 2021 13:25:15 +0000 (21:25 +0800)]
net: sgi: ioc3-eth: check return value after calling platform_get_resource()

It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoRevert "nvme-tcp-offload: ULP Series"
Shai Malin [Wed, 9 Jun 2021 10:49:18 +0000 (13:49 +0300)]
Revert "nvme-tcp-offload: ULP Series"

This reverts commits:
762411542050dbe27c7c96f13c57f93da5d9b89a
     nvme: NVME_TCP_OFFLOAD should not default to m
5ff5622ea1f16d535f1be4e478e712ef48fe183b:
     Merge branch 'NVMeTCP-Offload-ULP'

As requested on the mailing-list: https://lore.kernel.org/netdev/SJ0PR18MB3882C20793EA35A3E8DAE300CC379@SJ0PR18MB3882.namprd18.prod.outlook.com/
This patch will revert the nvme-tcp-offload ULP from net-next.

The nvme-tcp-offload ULP series will continue to be considered only on
linux-nvme@lists.infradead.org.

Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: usb: asix: ax88772: Fix less than zero comparison of a u16
Colin Ian King [Wed, 9 Jun 2021 10:24:48 +0000 (11:24 +0100)]
net: usb: asix: ax88772: Fix less than zero comparison of a u16

The comparison of the u16 priv->phy_addr < 0 is always false because
phy_addr is unsigned. Fix this by assigning the return from the call
to function asix_read_phy_addr to int ret and using this for the
less than zero error check comparison.

Fixes: 7e88b11a862a ("net: usb: asix: refactor asix_read_phy_addr() and handle errors on return")
Addresses-Coverity: ("Unsigned compared against 0")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: usb: asix: Fix less than zero comparison of a u16
Colin Ian King [Wed, 9 Jun 2021 10:24:47 +0000 (11:24 +0100)]
net: usb: asix: Fix less than zero comparison of a u16

The comparison of the u16 priv->phy_addr < 0 is always false because
phy_addr is unsigned. Fix this by assigning the return from the call
to function asix_read_phy_addr to int ret and using this for the
less than zero error check comparison.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: e532a096be0e ("net: usb: asix: ax88772: add phylib support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Wed, 9 Jun 2021 21:50:35 +0000 (14:50 -0700)]
Merge git://git./linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Add nfgenmsg field to nfnetlink's struct nfnl_info and use it.

2) Remove nft_ctx_init_from_elemattr() and nft_ctx_init_from_setattr()
   helper functions.

3) Add the nf_ct_pernet() helper function to fetch the conntrack
   pernetns data area.

4) Expose TCP and UDP flowtable offload timeouts through sysctl,
   from Oz Shlomo.

5) Add nfnetlink_hook subsystem to fetch the netfilter hook
   pipeline configuration, from Florian Westphal. This also includes
   a new field to annotate the hook type as metadata.

6) Fix unsafe memory access to non-linear skbuff in the new SCTP
   chunk support for nft_exthdr, from Phil Sutter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonetdevsim: delete unnecessary debugfs checking
Dan Carpenter [Wed, 9 Jun 2021 09:56:45 +0000 (12:56 +0300)]
netdevsim: delete unnecessary debugfs checking

In normal situations where the driver doesn't dereference
"nsim_node->ddir" or "nsim_node->rate_parent" itself then we are not
supposed to check the return from debugfs functions.  In the case of
debugfs_create_dir() the check was wrong as well because it doesn't
return NULL, it returns error pointers.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Fix error message in devlink_rate_set_ops_supported()
Dan Carpenter [Wed, 9 Jun 2021 09:54:31 +0000 (12:54 +0300)]
devlink: Fix error message in devlink_rate_set_ops_supported()

The WARN_ON() macro takes a condition, it doesn't take a message.  Use
WARN() instead.

Fixes: 1897db2ec310 ("devlink: Allow setting tx rate for devlink rate leaf objects")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: check the correct variable in qca8k_set_mac_eee()
Dan Carpenter [Wed, 9 Jun 2021 09:53:03 +0000 (12:53 +0300)]
net: dsa: qca8k: check the correct variable in qca8k_set_mac_eee()

This code check "reg" but "ret" was intended so the error handling will
never trigger.

Fixes: 7c9896e37807 ("net: dsa: qca8k: check return value of read functions correctly")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: fix an endian bug in qca8k_get_ethtool_stats()
Dan Carpenter [Wed, 9 Jun 2021 09:52:12 +0000 (12:52 +0300)]
net: dsa: qca8k: fix an endian bug in qca8k_get_ethtool_stats()

The "hi" variable is a u64 but the qca8k_read() writes to the top 32
bits of it.  That will work on little endian systems but it's a bit
subtle.  It's cleaner to make declare "hi" as a u32.  We will still need
to cast it when we shift it later on in the function but that's fine.

Fixes: 7c9896e37807 ("net: dsa: qca8k: check return value of read functions correctly")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'lapbther-cleanups'
David S. Miller [Wed, 9 Jun 2021 21:02:58 +0000 (14:02 -0700)]
Merge branch 'lapbther-cleanups'

Peng Li says:

====================
net: lapbether: clean up some code style issues

This patchset clean up some code style issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapbether: fix the code style issue about line length
Peng Li [Wed, 9 Jun 2021 09:39:55 +0000 (17:39 +0800)]
net: lapbether: fix the code style issue about line length

According to the chackpatch.pl,
line length of 123 exceeds 100 columns, so fix it.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapbether: fix the alignment issue
Peng Li [Wed, 9 Jun 2021 09:39:54 +0000 (17:39 +0800)]
net: lapbether: fix the alignment issue

Alignment should match open parenthesis.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapbether: replace comparison to NULL with "lapbeth_get_x25_dev"
Peng Li [Wed, 9 Jun 2021 09:39:53 +0000 (17:39 +0800)]
net: lapbether: replace comparison to NULL with "lapbeth_get_x25_dev"

According to the chackpatch.pl, comparison to NULL could
be written "lapbeth_get_x25_dev".

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapbether: fix the comments style issue
Peng Li [Wed, 9 Jun 2021 09:39:52 +0000 (17:39 +0800)]
net: lapbether: fix the comments style issue

Networking block comments don't use an empty /* line,
use /* Comment...

Block comments use * on subsequent lines.
Block comments use a trailing */ on a separate line.

This patch fixes the comments style issues.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapbether: remove unnecessary out of memory message
Peng Li [Wed, 9 Jun 2021 09:39:51 +0000 (17:39 +0800)]
net: lapbether: remove unnecessary out of memory message

This patch removes unnecessary out of memory message,
to fix the following checkpatch.pl warning:
"WARNING: Possible unnecessary 'out of memory' message"

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapbether: remove trailing whitespaces
Peng Li [Wed, 9 Jun 2021 09:39:50 +0000 (17:39 +0800)]
net: lapbether: remove trailing whitespaces

This patch removes trailing whitespaces.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapbether: move out assignment in if condition
Peng Li [Wed, 9 Jun 2021 09:39:49 +0000 (17:39 +0800)]
net: lapbether: move out assignment in if condition

Should not use assignment in if condition.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapbether: add blank line after declarations
Peng Li [Wed, 9 Jun 2021 09:39:48 +0000 (17:39 +0800)]
net: lapbether: add blank line after declarations

This patch fixes the checkpatch error about missing a blank line
after declarations.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapbether: remove redundant blank line
Peng Li [Wed, 9 Jun 2021 09:39:47 +0000 (17:39 +0800)]
net: lapbether: remove redundant blank line

This patch the redundant blank line.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: use list_move_tail instead of list_del/list_add_tail in hclge_main.c
Baokun Li [Wed, 9 Jun 2021 07:20:56 +0000 (15:20 +0800)]
net: hns3: use list_move_tail instead of list_del/list_add_tail in hclge_main.c

Using list_move_tail() instead of list_del() + list_add_tail() in hclge_main.c.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: use list_move_tail instead of list_del/list_add_tail in hclgevf_main.c
Baokun Li [Wed, 9 Jun 2021 07:17:20 +0000 (15:17 +0800)]
net: hns3: use list_move_tail instead of list_del/list_add_tail in hclgevf_main.c

Using list_move_tail() instead of list_del() + list_add_tail() in hclgevf_main.c.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonfp: use list_move instead of list_del/list_add in nfp_cppcore.c
Baokun Li [Wed, 9 Jun 2021 07:09:21 +0000 (15:09 +0800)]
nfp: use list_move instead of list_del/list_add in nfp_cppcore.c

Using list_move() instead of list_del() + list_add() in nfp_cppcore.c.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/x25: fix a mistake in grammar
gushengxian [Wed, 9 Jun 2021 03:03:17 +0000 (20:03 -0700)]
net/x25: fix a mistake in grammar

Fix a mistake in grammar.

Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: ravb: Use devm_platform_get_and_ioremap_resource()
Yang Yingliang [Wed, 9 Jun 2021 01:24:44 +0000 (09:24 +0800)]
net: ethernet: ravb: Use devm_platform_get_and_ioremap_resource()

Use devm_platform_get_and_ioremap_resource() to simplify
code.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: b53: Do not force CPU to be always tagged
Florian Fainelli [Tue, 8 Jun 2021 21:22:04 +0000 (14:22 -0700)]
net: dsa: b53: Do not force CPU to be always tagged

Commit ca8931948344 ("net: dsa: b53: Keep CPU port as tagged in all
VLANs") forced the CPU port to be always tagged in any VLAN membership.
This was necessary back then because we did not support Broadcom tags
for all configurations so the only way to differentiate tagged and
untagged traffic while DSA_TAG_PROTO_NONE was used was to force the CPU
port into being always tagged.

With most configurations enabling Broadcom tags, especially after
8fab459e69ab ("net: dsa: b53: Enable Broadcom tags for 531x5/539x
families") we do not need to apply this unconditional force tagging of
the CPU port in all VLANs.

A helper function is introduced to faciliate the encapsulation of the
specific condition requiring the CPU port to be tagged in all VLANs and
the dsa_switch_ops::untag_bridge_pvid boolean is moved to when
dsa_switch_ops::setup is called when we have already determined the
tagging protocol we will be using.

Reported-by: Matthew Hagan <mnhagan88@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Matthew Hagan <mnhagan88@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonetfilter: nf_tables: move base hook annotation to init helper
Florian Westphal [Tue, 8 Jun 2021 21:06:07 +0000 (23:06 +0200)]
netfilter: nf_tables: move base hook annotation to init helper

coverity scanner says:
2187  if (nft_is_base_chain(chain)) {
vvv   CID 1505166:  Memory - corruptions  (UNINIT)
vvv   Using uninitialized value "basechain".
2188  basechain->ops.hook_ops_type = NF_HOOK_OP_NF_TABLES;

... I don't see how nft_is_base_chain() can evaluate to true
while basechain pointer is garbage.

However, it seems better to place the NF_HOOK_OP_NF_TABLES annotation
in nft_basechain_hook_init() instead.

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1505166 ("Memory - corruptions")
Fixes: 65b8b7bfc5284f ("netfilter: annotate nf_tables base hook ops")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: nfnetlink_hook: add depends-on nftables
Florian Westphal [Tue, 8 Jun 2021 20:53:22 +0000 (22:53 +0200)]
netfilter: nfnetlink_hook: add depends-on nftables

nfnetlink_hook.c: In function 'nfnl_hook_put_nft_chain_info':
nfnetlink_hook.c:76:7: error: implicit declaration of 'nft_is_active'

This macro is only defined when NF_TABLES is enabled.
While its possible to also add an ifdef-guard, the infrastructure
is currently not useful without nf_tables.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 252956528caa ("netfilter: add new hook nfnl subsystem")
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: nfnetlink_hook: fix array index out-of-bounds error
Colin Ian King [Tue, 8 Jun 2021 15:34:08 +0000 (16:34 +0100)]
netfilter: nfnetlink_hook: fix array index out-of-bounds error

Currently the array net->nf.hooks_ipv6 is accessed by index hook
before hook is sanity checked. Fix this by moving the sanity check
to before the array access.

Addresses-Coverity: ("Out-of-bounds access")
Fixes: e2cf17d3774c ("netfilter: add new hook nfnl subsystem")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonet: appletalk: fix some mistakes in grammar
gushengxian [Wed, 9 Jun 2021 01:52:57 +0000 (18:52 -0700)]
net: appletalk: fix some mistakes in grammar

Fix some mistakes in grammar.

Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonetfilter: nft_exthdr: Fix for unsafe packet data read
Phil Sutter [Tue, 8 Jun 2021 09:40:57 +0000 (11:40 +0200)]
netfilter: nft_exthdr: Fix for unsafe packet data read

While iterating through an SCTP packet's chunks, skb_header_pointer() is
called for the minimum expected chunk header size. If (that part of) the
skbuff is non-linear, the following memcpy() may read data past
temporary buffer '_sch'. Use skb_copy_bits() instead which does the
right thing in this situation.

Fixes: 133dc203d77df ("netfilter: nft_exthdr: Support SCTP chunks")
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonet: stmmac: explicitly deassert GMAC_AHB_RESET
Matthew Hagan [Tue, 8 Jun 2021 18:59:06 +0000 (19:59 +0100)]
net: stmmac: explicitly deassert GMAC_AHB_RESET

We are currently assuming that GMAC_AHB_RESET will already be deasserted
by the bootloader. However if this has not been done, probing of the GMAC
will fail. To remedy this we must ensure GMAC_AHB_RESET has been deasserted
prior to probing.

v2 changes:
 - remove NULL condition check for stmmac_ahb_rst in stmmac_main.c
 - unwrap dev_err() message in stmmac_main.c
 - add PTR_ERR() around plat->stmmac_ahb_rst in stmmac_platform.c

v3 changes:
 - add error pointer to dev_err() output
 - add reset_control_assert(stmmac_ahb_rst) in stmmac_dvr_remove
 - revert PTR_ERR() around plat->stmmac_ahb_rst since this is performed
   on the returned value of ret by the calling function

Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosh_eth: Use devm_platform_get_and_ioremap_resource()
Yang Yingliang [Tue, 8 Jun 2021 13:57:18 +0000 (21:57 +0800)]
sh_eth: Use devm_platform_get_and_ioremap_resource()

Use devm_platform_get_and_ioremap_resource() to simplify
code.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: nixge: simplify code with devm platform functions
Yang Yingliang [Tue, 8 Jun 2021 13:56:22 +0000 (21:56 +0800)]
net: nixge: simplify code with devm platform functions

Use devm_platform_get_and_ioremap_resource() and
devm_platform_ioremap_resource_byname to simplify
code.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: felix: set TX flow control according to the phylink_mac_link_up resolution
Vladimir Oltean [Tue, 8 Jun 2021 11:16:51 +0000 (14:16 +0300)]
net: dsa: felix: set TX flow control according to the phylink_mac_link_up resolution

Instead of relying on the static initialization done by ocelot_init_port()
which enables flow control unconditionally, set SYS_PAUSE_CFG_PAUSE_ENA
according to the parameters negotiated by the PHY.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: x25: Use list_for_each_entry() to simplify code in x25_forward.c
Wang Hai [Tue, 8 Jun 2021 13:30:07 +0000 (13:30 +0000)]
net: x25: Use list_for_each_entry() to simplify code in x25_forward.c

Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoethernet/qlogic: Use list_for_each_entry() to simplify code in qlcnic_hw.c
Wang Hai [Tue, 8 Jun 2021 13:29:08 +0000 (13:29 +0000)]
ethernet/qlogic: Use list_for_each_entry() to simplify code in qlcnic_hw.c

Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: fix NPD with phylink_set_pcs if there is no MDIO bus
Vladimir Oltean [Tue, 8 Jun 2021 13:10:37 +0000 (16:10 +0300)]
net: stmmac: fix NPD with phylink_set_pcs if there is no MDIO bus

priv->plat->mdio_bus_data is optional, some platforms may not set it,
however we proceed to look straight at priv->plat->mdio_bus_data->has_xpcs.

Since the xpcs is instantiated based on the has_xpcs property, we can
avoid looking at the priv->plat->mdio_bus_data structure altogether and
just check for the presence of the xpcs pointer.

Fixes: 11059740e616 ("net: pcs: xpcs: convert to phylink_pcs_ops")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapb: Use list_for_each_entry() to simplify code in lapb_iface.c
Wang Hai [Tue, 8 Jun 2021 08:13:01 +0000 (08:13 +0000)]
net: lapb: Use list_for_each_entry() to simplify code in lapb_iface.c

Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: x25: Use list_for_each_entry() to simplify code in x25_link.c
Wang Hai [Tue, 8 Jun 2021 08:05:05 +0000 (08:05 +0000)]
net: x25: Use list_for_each_entry() to simplify code in x25_link.c

Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qede: Use list_for_each_entry() to simplify code
Wang Hai [Tue, 8 Jun 2021 07:57:37 +0000 (07:57 +0000)]
net: qede: Use list_for_each_entry() to simplify code

Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'hns3-RAS'
David S. Miller [Tue, 8 Jun 2021 21:43:31 +0000 (14:43 -0700)]
Merge branch 'hns3-RAS'

Guangbin Huang says:

====================
net: hns3: add RAS compatibility adaptation solution

This patchset adds RAS compatibility adaptation solution for new devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: add error handling compatibility during initialization
Jiaran Zhang [Tue, 8 Jun 2021 13:08:31 +0000 (21:08 +0800)]
net: hns3: add error handling compatibility during initialization

During initialization, the driver logs and clears the hw errors that
already occurred. For device supports imp-handle ras capability, it
needs handle different error status, otherwise it may cause wrong reset.

So fix it by adding a new processing branch.

Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: update error recovery module and type
Jiaran Zhang [Tue, 8 Jun 2021 13:08:30 +0000 (21:08 +0800)]
net: hns3: update error recovery module and type

Update error recovery module and type for RoCE.

The enumeration values of module names and error types are not sorted
in sequence. If use the current printing mode, they cannot be correctly
printed.

Use the index mode, If mod_id and type_id match the enumerated value,
display the corresponding information.

Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: add support for imp-handle ras capability
Jiaran Zhang [Tue, 8 Jun 2021 13:08:29 +0000 (21:08 +0800)]
net: hns3: add support for imp-handle ras capability

IMP(Intelligent Management Processor) firmware add a new feature to
handle and consolidate RAS information for new devices, NIC driver
only needs to query the reported RAS information. NIC driver adds
support for this feature.

Driver queries device capability to check whether IMP support this
feature, If yes, execute the new RAS processing branch.

In order to add a method to check whether PF supports imp-handle RAS
feature, add dumping this info in debugfs.

Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: add the RAS compatibility adaptation solution
Jiaran Zhang [Tue, 8 Jun 2021 13:08:28 +0000 (21:08 +0800)]
net: hns3: add the RAS compatibility adaptation solution

To adapt to hardware modification and ensure that the driver is
compatible with the original error handling content, we need to add the
RAS compatibility adaptation solution.

Add a processing branch to the driver during error handling. In the new
processing branch, NIC fault information is integrated by the IMP. An
interaction command is added between the driver and IMP to query
and clear the fault source and interrupt source. The IMP integrates
error information and reports the highest reset level to the driver.

Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: add support for handling all errors through MSI-X
Yufeng Mo [Tue, 8 Jun 2021 13:08:27 +0000 (21:08 +0800)]
net: hns3: add support for handling all errors through MSI-X

Currently, hardware errors can be reported through AER or MSI-X mode.
However, the AER mode is intended to handle only bus errors, but not
hardware errors. On the other hand, virtual machines cannot handle
AER errors. When an AER error is reported, virtual machines will be
suspended. So add support for handling all these hardware errors
through MSI-X mode which depends on a newer version of firmware,
and reserve the handler of the AER mode for compatibility.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ena-updates'
David S. Miller [Tue, 8 Jun 2021 21:41:10 +0000 (14:41 -0700)]
Merge branch 'ena-updates'

Shay Agroskin says:

====================
se build_skb and reorganize some code in ENA

this patchset introduces several changes:

- Use build_skb() on RX side.
  This allows to ensure that the headers are in the linear part

- Batch some code into functions and remove some of the code to make it more
  readable and less error prone

- Fix RST format and outdated description in ENA documentation

- Improve cache alignment in the code
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: re-organize code to improve readability
Shay Agroskin [Tue, 8 Jun 2021 16:01:18 +0000 (19:01 +0300)]
net: ena: re-organize code to improve readability

Restructure some ethtool to a switch-case blocks to make it more uniform
with other similar functions.
Also restructure variable declaration to create reversed x-mas tree.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: Use dev_alloc() in RX buffer allocation
Shay Agroskin [Tue, 8 Jun 2021 16:01:17 +0000 (19:01 +0300)]
net: ena: Use dev_alloc() in RX buffer allocation

Use dev_alloc() when allocating RX buffers instead of specifying the
allocation flags explicitly. This result in same behaviour with less
code.

Also move the page allocation and its DMA mapping into a function. This
creates a logical block, which may help understanding the code.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: aggregate doorbell common operations into a function
Shay Agroskin [Tue, 8 Jun 2021 16:01:16 +0000 (19:01 +0300)]
net: ena: aggregate doorbell common operations into a function

The ena_ring_tx_doorbell() is introduced to call the doorbell and
increase the driver's corresponding stat.

Signed-off-by: Ido Segev <idose@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: fix RST format in ENA documentation file
Shay Agroskin [Tue, 8 Jun 2021 16:01:15 +0000 (19:01 +0300)]
net: ena: fix RST format in ENA documentation file

The documentation file used to be written in markdown format but was
converted to reStructuredText (rst).

The converted file doesn't keep up with rst format requirements which
results in hard-to-read text.

This patch fixes the formatting of the file. The patch also
* Highlights and emphasizes some lines to improve readability
* Rephrases some hard-to-understand text
* Updates outdated function descriptions.
* Removes TSO description which falsely claims the driver supports it

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: Remove module param and change message severity
Shay Agroskin [Tue, 8 Jun 2021 16:01:14 +0000 (19:01 +0300)]
net: ena: Remove module param and change message severity

Remove the module param 'debug' which allows to specify the message
level of the driver. This value can be specified using ethtool command.
Also reduce the message level of LLQ support to be a warning since it is
not an indication of an error.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: add jiffies of last napi call to stats
Shay Agroskin [Tue, 8 Jun 2021 16:01:13 +0000 (19:01 +0300)]
net: ena: add jiffies of last napi call to stats

There are instances when we want to know when the last napi was
called for debugging.

On stuck / heavy loaded CPUs, the ena napi handler might not be
called for a long period of time. This stat can help us to
determine how much time passed since the last execution of napi.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: use build_skb() in RX path
Shay Agroskin [Tue, 8 Jun 2021 16:01:12 +0000 (19:01 +0300)]
net: ena: use build_skb() in RX path

This patch converts the RX path to use build_skb() for packets larger
than copybreak (set to 256 by default). This function makes the first
descriptor's page to be the linear part of the sk_buff struct buffer.

Also remove the SKB description from the README since most of it no
longer relevant and the parts that are left don't add information.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: Improve error logging in driver
Shay Agroskin [Tue, 8 Jun 2021 16:01:11 +0000 (19:01 +0300)]
net: ena: Improve error logging in driver

Add prints to improve logging of driver's errors.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: Remove unused code
Shay Agroskin [Tue, 8 Jun 2021 16:01:10 +0000 (19:01 +0300)]
net: ena: Remove unused code

The ENA_DEFAULT_MIN_RX_BUFF_ALLOC_SIZE macro,
ena_xdp_queues_present() function and SUSPEND_RESUME enums aren't used
in the driver, and so not needed.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: optimize data access in fast-path code
Shay Agroskin [Tue, 8 Jun 2021 16:01:09 +0000 (19:01 +0300)]
net: ena: optimize data access in fast-path code

This tweaks several small places to improve the data access in fast
path:

* Remove duplicates of first_interrupt flag and surround it with
  WRITE/READ_ONCE macros:

  The flag is used to detect HW disorders in its
  interrupt communication with the driver. The flag is set when an
  interrupt is received and used in the health check function
  (ena_timer_service()) to help it find irregularities.

* Reorder some fields in ena_napi struct to take better advantage of
  cache access pattern.

* Move XDP TX queue number to a variable to save its calculation for
  every packet.

* Use likely in a condition to improve branch prediction

The 'first_interrupt' and 'interrupt_masked' flags were moved to reside
in the same cache line as the first fields of 'napi' struct. This
placement ensures that all memory accessed during upper-half handler
reside in the same cacheline (napi_schedule_irqoff() only accesses
'state' and 'poll_list' fields which are at the beginning of napi
struct).

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'mlxsw-various-updates'
David S. Miller [Tue, 8 Jun 2021 21:39:07 +0000 (14:39 -0700)]
Merge branch 'mlxsw-various-updates'

Ido Schimmel says:

====================
mlxsw: Various updates

This patchset contains various updates for mlxsw. The most significant
change is the long overdue removal of the abort mechanism in the first
two patches.

Patches #1-#2 remove the route abort mechanism. This change is long
overdue and explained in detail in the commit message.

Patch #3 sets ports down in a few selftests that forgot to do so.
Discovered using a BPF tool (WIP) that monitors ASIC resources.

Patch #4 fixes an issue introduced by commit 557c4d2f780c ("selftests:
devlink_lib: add check for devlink device existence").

Patches #5-#8 modify the driver to read transceiver module's temperature
thresholds using MTMP register (when supported) instead of directly from
the module's EEPROM using MCIA register. This is both more reliable and
more efficient as now the module's temperature and thresholds are read
using one transaction instead of three.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: thermal: Read module temperature thresholds using MTMP register
Mykola Kostenok [Tue, 8 Jun 2021 12:44:14 +0000 (15:44 +0300)]
mlxsw: thermal: Read module temperature thresholds using MTMP register

mlxsw_thermal_module_trips_update() is used to update the trip points of
the module's thermal zone. Currently, this is done by querying the
thresholds from the module's EEPROM via MCIA register. This data does
not pass validation and in some cases can be unreliable. For example,
due to some problem with transceiver module.

Previous patch made it possible to read module's temperature and
thresholds via MTMP register. Therefore, extend
mlxsw_thermal_module_trips_update() to use the thresholds queried from
MTMP, if valid.

This is both more reliable and more efficient than current method, as
temperature and thresholds are queried in one transaction instead of
three. This is significant when working over a slow bus such as I2C.

Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: thermal: Add function for reading module temperature and thresholds
Mykola Kostenok [Tue, 8 Jun 2021 12:44:13 +0000 (15:44 +0300)]
mlxsw: thermal: Add function for reading module temperature and thresholds

Provide new function mlxsw_thermal_module_temp_and_thresholds_get() for
reading temperature and temperature thresholds by a single operation.
The motivation is to reduce the number of transactions with the device
which is important when operating over a slow bus such as I2C.

Currently, the sole caller of the function is only using it to read the
module's temperature. The next patch will also use it to query the
module's temperature thresholds.

Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: core_env: Read module temperature thresholds using MTMP register
Mykola Kostenok [Tue, 8 Jun 2021 12:44:12 +0000 (15:44 +0300)]
mlxsw: core_env: Read module temperature thresholds using MTMP register

Currently, module temperature thresholds are obtained from Management
Cable Info Access (MCIA) register by specifying the thresholds offsets
within module EEPROM layout. This data does not pass validation and in
some cases can be unreliable. For example, due to some problem with the
module.

Add support for a new feature provided by Management Temperature (MTMP)
register for sanitization of temperature thresholds values.

Extend mlxsw_env_module_temp_thresholds_get() to get temperature
thresholds through MTMP field 'max_operational_temperature' - if it is
not zero, feature is supported. Otherwise fallback to old method and get
the thresholds through MCIA.

Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: reg: Extend MTMP register with new threshold field
Mykola Kostenok [Tue, 8 Jun 2021 12:44:11 +0000 (15:44 +0300)]
mlxsw: reg: Extend MTMP register with new threshold field

Extend Management Temperature (MTMP) register with new field specifying
the maximum temperature threshold.

Extend mlxsw_reg_mtmp_unpack() function with two extra arguments,
providing high and maximum temperature thresholds. For modules, these
thresholds correspond to critical and emergency thresholds that are read
from the module's EEPROM.

Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests: devlink_lib: Fix bouncing of netdevsim DEVLINK_DEV
Petr Machata [Tue, 8 Jun 2021 12:44:10 +0000 (15:44 +0300)]
selftests: devlink_lib: Fix bouncing of netdevsim DEVLINK_DEV

In the commit referenced below, a check was added to devlink_lib that
asserts the existence of a devlink device referenced by $DEVLINK_DEV.
Unfortunately, several netdevsim tests point DEVLINK_DEV at a device that
does not exist at the time that devlink_lib is sourced. Thus these tests
spuriously fail.

Fix this by introducing an override. By setting DEVLINK_DEV to an empty
string, the user declares their intention to handle DEVLINK_DEV management
on their own.

In all netdevsim tests that use devlink_lib and set DEVLINK_DEV, set
instead an empty DEVLINK_DEV just before sourcing devlink_lib, and set it
to the correct value right afterwards.

Fixes: 557c4d2f780c ("selftests: devlink_lib: add check for devlink device existence")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests: Clean forgotten resources as part of cleanup()
Amit Cohen [Tue, 8 Jun 2021 12:44:09 +0000 (15:44 +0300)]
selftests: Clean forgotten resources as part of cleanup()

Several tests do not set some ports down as part of their cleanup(),
resulting in IPv6 link-local addresses and associated routes not being
deleted.

These leaks were found using a BPF tool that monitors ASIC resources.

Solve this by setting the ports down at the end of the tests.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>