platform/kernel/linux-starfive.git
4 years agonet/mlx5e: kTLS, Cleanup redundant capability check
Tariq Toukan [Mon, 22 Jun 2020 15:32:36 +0000 (18:32 +0300)]
net/mlx5e: kTLS, Cleanup redundant capability check

All callers of mlx5e_ktls_build_netdev() check capability
before the call.
Remove the repeated check in the function.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Increase Async ICO SQ size
Tariq Toukan [Thu, 18 Jun 2020 09:45:59 +0000 (12:45 +0300)]
net/mlx5e: Increase Async ICO SQ size

Resync communication with HW for kTLS RX is done via the
async ICOSQs.
kTLS RX resync requests might come in bursts. To improve the
success chances for such bursts, use a larger ICOSQ.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: kTLS, Add kTLS RX stats
Tariq Toukan [Mon, 15 Jun 2020 12:25:23 +0000 (15:25 +0300)]
net/mlx5e: kTLS, Add kTLS RX stats

Add global and per-channel ethtool SW stats for the device
offload.
Document the new counters in tls-offload.rst.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: kTLS, Add kTLS RX resync support
Tariq Toukan [Tue, 16 Jun 2020 12:15:06 +0000 (15:15 +0300)]
net/mlx5e: kTLS, Add kTLS RX resync support

Implement the RX resync procedure, using the TLS async resync API.

The HW offload of TLS decryption in RX side might get out-of-sync
due to out-of-order reception of packets.
This requires SW intervention to update the HW context and get it
back in-sync.

Performance:
CPU: Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz, 24 cores, HT off
NIC: ConnectX-6 Dx 100GbE dual port

Goodput (app-layer throughput) comparison:
+---------------+-------+-------+---------+
| # connections |   1   |   4   |    8    |
+---------------+-------+-------+---------+
| SW (Gbps)     |  7.26 | 24.70 |   50.30 |
+---------------+-------+-------+---------+
| HW (Gbps)     | 18.50 | 64.30 |   92.90 |
+---------------+-------+-------+---------+
| Speedup       | 2.55x | 2.56x | 1.85x * |
+---------------+-------+-------+---------+

* Once line rate is reached, the difference shows up in CPU utilization.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/tls: Add asynchronous resync
Boris Pismenny [Mon, 8 Jun 2020 16:11:38 +0000 (19:11 +0300)]
net/tls: Add asynchronous resync

This patch adds support for asynchronous resynchronization in tls_device.
Async resync follows two distinct stages:

1. The NIC driver indicates that it would like to resync on some TLS
record within the received packet (P), but the driver does not
know (yet) which of the TLS records within the packet.
At this stage, the NIC driver will query the device to find the exact
TCP sequence for resync (tcpsn); however, the driver does not wait
for the device to provide the response.

2. Eventually, the device responds, and the driver provides the tcpsn
within the resync packet to KTLS. Now, KTLS can check the tcpsn against
any processed TLS records within packet P, and also against any record
that is processed in the future within packet P.

The asynchronous resync path simplifies the device driver, as it can
save bits on the packet completion (32-bit TCP sequence), and pass this
information on an asynchronous command instead.
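
The two stages above can be sketched in plain userspace C. This is only a
minimal model under stated assumptions, not the tls_device code: the names
resync_async, resync_request_start(), resync_response() and
resync_check_record(), as well as the log size, are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RESYNC_LOG_MAX 8 /* hypothetical bound on remembered records */

/* Hypothetical per-connection state for one outstanding async resync. */
struct resync_async {
	bool pending;                 /* stage 1 done, resync not confirmed yet */
	bool have_tcpsn;              /* stage 2 done, tcpsn is valid */
	uint32_t tcpsn;               /* TCP sequence reported by the device */
	uint32_t log[RESYNC_LOG_MAX]; /* start sequences of records seen so far */
	size_t nlog;
};

/* Stage 1: the driver asks for a resync without knowing the record yet. */
static void resync_request_start(struct resync_async *r)
{
	r->pending = true;
	r->have_tcpsn = false;
	r->nlog = 0;
}

/*
 * Stage 2: the device answered and the driver passes on the exact tcpsn.
 * Returns true if the tcpsn matches a record that was already processed.
 */
static bool resync_response(struct resync_async *r, uint32_t tcpsn)
{
	size_t i;

	if (!r->pending)
		return false;
	r->tcpsn = tcpsn;
	r->have_tcpsn = true;
	for (i = 0; i < r->nlog; i++) {
		if (r->log[i] == tcpsn) {
			r->pending = false;
			return true;
		}
	}
	return false;
}

/*
 * Called for every processed TLS record with the TCP sequence at which the
 * record starts. Returns true if this is the record to resync on.
 */
static bool resync_check_record(struct resync_async *r, uint32_t rec_seq)
{
	if (!r->pending)
		return false;
	if (!r->have_tcpsn) {
		/* No answer yet: remember the record for later matching. */
		if (r->nlog < RESYNC_LOG_MAX)
			r->log[r->nlog++] = rec_seq;
		return false;
	}
	if (rec_seq == r->tcpsn) {
		r->pending = false;
		return true;
	}
	return false;
}
```

The log covers records processed before the device responds; records
processed afterwards are compared against the stored tcpsn directly.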

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agoRevert "net/tls: Add force_resync for driver resync"
Boris Pismenny [Mon, 8 Jun 2020 09:42:52 +0000 (12:42 +0300)]
Revert "net/tls: Add force_resync for driver resync"

This reverts commit b3ae2459f89773adcbf16fef4b68deaaa3be1929.
Revert the force resync API.
Not in use. To be replaced by a better async resync API downstream.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: kTLS, Add kTLS RX HW offload support
Tariq Toukan [Thu, 28 May 2020 07:13:00 +0000 (10:13 +0300)]
net/mlx5e: kTLS, Add kTLS RX HW offload support

Implement driver support for the kTLS RX HW offload feature.
Resync support is added in a downstream patch.

New offload contexts post their static/progress params WQEs
over the per-channel async ICOSQ, protected under a spin-lock.
The Channel/RQ is selected according to the socket's rxq index.

Feature is OFF by default. Can be turned on by:
$ ethtool -K <if> tls-hw-rx-offload on

A new TLS-RX workqueue is used to allow asynchronous addition of
steering rules, out of the NAPI context.
It will be also used in a downstream patch in the resync procedure.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: kTLS, Use kernel API to extract private offload context
Tariq Toukan [Thu, 28 May 2020 07:04:03 +0000 (10:04 +0300)]
net/mlx5e: kTLS, Use kernel API to extract private offload context

Modify the implementation of the private kTLS TX HW offload context
getter and setter, so it uses the kernel API functions, instead of
a local shadow structure.
A single BUILD_BUG_ON check is sufficient, remove the duplicate.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: kTLS, Improve TLS feature modularity
Tariq Toukan [Tue, 26 May 2020 10:58:09 +0000 (13:58 +0300)]
net/mlx5e: kTLS, Improve TLS feature modularity

Better separate the code into c/h files, so that kTLS internals
are exposed to the corresponding non-accel flow as follows:
- Necessary datapath functions are exposed via ktls_txrx.h.
- Necessary caps and configuration functions are exposed via ktls.h,
  which became very small.

In addition, kTLS internal code sharing is done via ktls_utils.h,
which is not exposed to any non-accel file.

Add explicit WQE structures for the TLS static and progress
params, breaking the union of the static with UMR, and the progress
with PSV.

Generalize the API as a preparation for TLS RX offload support.

Move kTLS TX-specific code to the proper file.
Remove the inline tag from functions in C files and let the compiler decide.
Use kzalloc/kfree for the priv_tx context.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
4 years agonet/mlx5e: Accel, Expose flow steering API for rules add/del
Tariq Toukan [Tue, 16 Jun 2020 10:29:07 +0000 (13:29 +0300)]
net/mlx5e: Accel, Expose flow steering API for rules add/del

Given a socket, the function extracts the TCP/IP{4,6} ntuple
and adds a rule to steering.
Another function gets the rule and deletes it.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
4 years agonet/mlx5e: Receive flow steering framework for accelerated TCP flows
Boris Pismenny [Sun, 14 Apr 2019 13:35:24 +0000 (16:35 +0300)]
net/mlx5e: Receive flow steering framework for accelerated TCP flows

The framework allows creating flow tables to steer incoming traffic of
TCP sockets to the acceleration TIRs.
This is used in downstream patches for TLS, and will be used in the
future for other offloads.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: API to manipulate TTC rules destinations
Saeed Mahameed [Thu, 2 Apr 2020 09:02:33 +0000 (02:02 -0700)]
net/mlx5e: API to manipulate TTC rules destinations

Store the default destinations of the on-load generated TTC
(Traffic Type Classifier) rules in the ttc rules table.

Introduce TTC API functions to manipulate/restore and get the TTC rule
destination and use these API functions in arfs implementation.

This will allow a better decoupling between TTC implementation and its
users.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
4 years agonet/mlx5e: Refactor build channel params
Tariq Toukan [Sat, 13 Jun 2020 19:53:32 +0000 (22:53 +0300)]
net/mlx5e: Refactor build channel params

Take the CQ params into their respective RQ/SQ params.
Split the params build of the different ICOSQs (sync and async),
as they require different init values.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Turn XSK ICOSQ into a general asynchronous one
Tariq Toukan [Tue, 26 Nov 2019 14:23:23 +0000 (16:23 +0200)]
net/mlx5e: Turn XSK ICOSQ into a general asynchronous one

There is an upcoming demand (in downstream patches) for
an ICOSQ to be populated out of the NAPI context, asynchronously.

There is already an existing one serving XSK-related use case.
In this patch, promote this ICOSQ to serve as general async ICOSQ,
to be used for XSK and non-XSK flows.

As part of this, the reg_umr bit of the SQ context is now set
(if capable), as the general async ICOSQ should support possible
posts of UMR WQEs.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agoMerge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox...
Saeed Mahameed [Sat, 27 Jun 2020 21:00:04 +0000 (14:00 -0700)]
Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: kTLS, Improve TLS params layout structures
  net/mlx5: Avoid eswitch header inclusion in fs core layer
  net/mlx5: Avoid RDMA file inclusion in core driver
  net/mlx5: Add support in query QP, CQ and MKEY segments
  net/mlx5: Export resource dump interface

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: kTLS, Improve TLS params layout structures
Tariq Toukan [Fri, 26 Jun 2020 05:59:43 +0000 (22:59 -0700)]
net/mlx5: kTLS, Improve TLS params layout structures

Add explicit WQE segment structures for the TLS static and progress
params.
According to the HW spec, TISN is not part of the progress params context,
take it out of it.
Rename the control segment tisn field as it could hold either a TIS or
a TIR number.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Avoid eswitch header inclusion in fs core layer
Parav Pandit [Fri, 26 Jun 2020 05:59:42 +0000 (22:59 -0700)]
net/mlx5: Avoid eswitch header inclusion in fs core layer

Flow steering core layer is independent of the eswitch layer.
Hence avoid fs_core dependency on eswitch.

Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Avoid RDMA file inclusion in core driver
Parav Pandit [Fri, 26 Jun 2020 05:59:41 +0000 (22:59 -0700)]
net/mlx5: Avoid RDMA file inclusion in core driver

mlx5 cq.h does not depend on RDMA verbs.
Remove RDMA verbs file inclusion.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agoMerge branch 'net-atlantic-various-non-functional-changes'
David S. Miller [Fri, 26 Jun 2020 23:32:51 +0000 (16:32 -0700)]
Merge branch 'net-atlantic-various-non-functional-changes'

Igor Russkikh says:

====================
net: atlantic: various non-functional changes

This patchset contains several non-functional changes, which were made in
the out-of-tree driver over time.
Mostly typos, checkpatch findings and comment fixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: put ptp code under IS_REACHABLE check
Igor Russkikh [Fri, 26 Jun 2020 18:40:38 +0000 (21:40 +0300)]
net: atlantic: put ptp code under IS_REACHABLE check

A1 requires additional processing for both egress and ingress to support
PTP.
And it makes sense to get rid of this processing altogether (via ifdef),
if PTP clock is disabled globally.

This patch puts the PTP code under the corresponding IS_REACHABLE check.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: add alignment checks in hw_atl2_utils_fw.c
Mark Starovoytov [Fri, 26 Jun 2020 18:40:37 +0000 (21:40 +0300)]
net: atlantic: add alignment checks in hw_atl2_utils_fw.c

This patch adds alignment checks in all the helper macros in
hw_atl2_utils_fw.c
These alignment checks are compile-time, so runtime is not affected.

All these helper macros assume the length to be aligned (multiple of 4).
If it's not aligned, then there might be issues, e.g. stack corruption.
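
A compile-time alignment check of this kind can be sketched as follows. This
is an illustration only: COPY_ALIGNED(), copy_words() and struct fw_settings
are hypothetical names, and C11 _Static_assert stands in for whatever
build-time assertion the driver actually uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy nwords 32-bit words from src to dst, one word at a time. */
static void copy_words(void *dst, const void *src, size_t nwords)
{
	size_t i;

	for (i = 0; i < nwords; i++)
		memcpy((char *)dst + 4 * i, (const char *)src + 4 * i, 4);
}

/*
 * Hypothetical helper macro: len must be a compile-time constant that is a
 * multiple of 4, otherwise the build fails. Nothing is checked at runtime.
 */
#define COPY_ALIGNED(dst, src, len)                                       \
	do {                                                              \
		_Static_assert((len) % 4 == 0,                            \
			       "length must be a multiple of 4");         \
		copy_words((dst), (src), (len) / 4);                      \
	} while (0)

/* Hypothetical settings block whose size is a multiple of 4. */
struct fw_settings {
	uint32_t mtu;
	uint32_t flags;
};
```

Passing a length such as 6 to COPY_ALIGNED() would be rejected by the
compiler instead of silently copying a truncated number of words.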

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: missing space in a comment in aq_nic.h
Dmitry Bezrukov [Fri, 26 Jun 2020 18:40:36 +0000 (21:40 +0300)]
net: atlantic: missing space in a comment in aq_nic.h

This patch adds a missing space in the comment in aq_nic.h

Signed-off-by: Dmitry Bezrukov <dbezrukov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: fix typo in aq_ring_tx_clean
Mark Starovoytov [Fri, 26 Jun 2020 18:40:35 +0000 (21:40 +0300)]
net: atlantic: fix typo in aq_ring_tx_clean

This patch fixes a typo in aq_ring_tx_clean.
stats is a union, so the typo doesn't cause any issues, but it's a typo
nonetheless.
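
Why such a typo is harmless with a union can be shown with a small sketch.
The layout below is an assumption made for illustration (hypothetical
rx_stats/tx_stats structures with the field at the same offset), not the
driver's actual definitions.

```c
#include <stdint.h>

/* Hypothetical per-direction counters with identical layout. */
struct rx_stats {
	uint64_t packets;
	uint64_t bytes;
	uint64_t errors;
};

struct tx_stats {
	uint64_t packets;
	uint64_t bytes;
	uint64_t errors;
};

/* Both members share the same storage. */
union ring_stats {
	struct rx_stats rx;
	struct tx_stats tx;
};

/* Bumps the counter through the "wrong" member and reads the other one. */
static uint64_t count_tx_error_via_rx(union ring_stats *s)
{
	s->rx.errors++;      /* the typo: rx used on a TX ring */
	return s->tx.errors; /* same bytes, so the TX counter moved too */
}
```

Because both fields occupy the same bytes, the increment lands in the
intended counter either way, which is why only the spelling needed fixing.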

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: make aq_pci_func_init static
Mark Starovoytov [Fri, 26 Jun 2020 18:40:34 +0000 (21:40 +0300)]
net: atlantic: make aq_pci_func_init static

This patch makes aq_pci_func_init() static, because it's not used anywhere
outside the file itself.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: Replace ENOTSUPP usage to EOPNOTSUPP
Mark Starovoytov [Fri, 26 Jun 2020 18:40:33 +0000 (21:40 +0300)]
net: atlantic: Replace ENOTSUPP usage to EOPNOTSUPP

This patch replaces ENOTSUPP (where it was used by mistake) with
EOPNOTSUPP.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: fix variable type in aq_ethtool_get_pauseparam
Nikita Danilov [Fri, 26 Jun 2020 18:40:32 +0000 (21:40 +0300)]
net: atlantic: fix variable type in aq_ethtool_get_pauseparam

This patch fixes the type of a variable that is assigned from an enum:
it should have been int, not u32.

Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: MACSec offload statistics checkpatch fix
Mark Starovoytov [Fri, 26 Jun 2020 18:40:31 +0000 (21:40 +0300)]
net: atlantic: MACSec offload statistics checkpatch fix

This patch fixes a checkpatch warning.

Fixes: aec0f1aac58e ("net: atlantic: MACSec offload statistics implementation")

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'mptcp-refactor-token-container'
David S. Miller [Fri, 26 Jun 2020 23:21:39 +0000 (16:21 -0700)]
Merge branch 'mptcp-refactor-token-container'

Paolo Abeni says:

====================
mptcp: refactor token container

Currently the msk sockets are stored in a single radix tree, protected by a
global spin_lock. This series moves to a hash table, allocated at boot time,
with a per-bucket spin_lock, like inet_hashtables, but using a different key:
the token itself.

The above improves scalability, as write operations will have a far lower chance
of competing for lock acquisition, allows lockless lookup, and will allow
easier msk traversal, e.g. for the diag interface implementation's sake.

This also introduces trivial, related KUnit tests and moves the existing
in-kernel one to KUnit.

v1 -> v2:
 - fixed a few extra and sparse warns
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomptcp: introduce token KUNIT self-tests
Paolo Abeni [Fri, 26 Jun 2020 17:30:02 +0000 (19:30 +0200)]
mptcp: introduce token KUNIT self-tests

Unit tests for the internal MPTCP token APIs, using KUNIT

v1 -> v2:
 - use the correct RCU annotation when initializing icsk ulp
 - fix a few checkpatch issues

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomptcp: move crypto test to KUNIT
Paolo Abeni [Fri, 26 Jun 2020 17:30:01 +0000 (19:30 +0200)]
mptcp: move crypto test to KUNIT

Currently MPTCP uses a custom hook to execute unit tests at
boot time. Let's use the KUNIT framework instead.
Additionally, move the relevant code to a separate file and
export the function needed by the test when self-tests
are built as a module.

Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomptcp: refactor token container
Paolo Abeni [Fri, 26 Jun 2020 17:30:00 +0000 (19:30 +0200)]
mptcp: refactor token container

Replace the radix tree with a hash table allocated
at boot time. The radix tree has some shortcomings:
a single lock is contended by all the MPTCP operations,
the lookup currently uses that lock, and traversing
all the items would require a lock, too.

With a hash table we instead trade a little memory to
address all the above: a per-bucket lock is used.

To hash the MPTCP sockets, we re-use the msk's sk_node
entry: the MPTCP sockets are never hashed by the stack.
Replace the existing hash proto callbacks with a dummy
implementation, annotating the above constraint.

Additionally refactor the token creation code to:

- limit the number of consecutive attempts to a fixed
maximum. Hitting a hash bucket with a long chain is
considered a failed attempt

- accept() can no longer fail due to token management.

- if token creation fails at connect() time, we now fall
back to TCP (previously the connection was closed)
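
The bounded-retry token creation described above can be sketched in
userspace C. Everything here is hypothetical (table size, limits, function
names), and the per-bucket lock is only marked by a comment; it is a sketch
of the idea, not the MPTCP code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOKEN_BUCKETS     16 /* hypothetical table size, power of two */
#define TOKEN_MAX_CHAIN    4 /* longer chains count as a failed attempt */
#define TOKEN_MAX_RETRIES  4 /* fixed maximum of consecutive attempts */

struct token_entry {
	uint32_t token;
	struct token_entry *next;
};

struct token_bucket {
	/* A real implementation would keep a per-bucket lock here. */
	struct token_entry *head;
	unsigned int chain_len;
};

static struct token_bucket token_table[TOKEN_BUCKETS];

static struct token_bucket *token_bucket_of(uint32_t token)
{
	return &token_table[token & (TOKEN_BUCKETS - 1)];
}

static bool token_lookup(uint32_t token)
{
	struct token_entry *e;

	for (e = token_bucket_of(token)->head; e; e = e->next)
		if (e->token == token)
			return true;
	return false;
}

/* Try a bounded number of candidate tokens; false means "fall back". */
static bool token_new(struct token_entry *e, uint32_t (*next_token)(void))
{
	int attempt;

	for (attempt = 0; attempt < TOKEN_MAX_RETRIES; attempt++) {
		uint32_t token = next_token();
		struct token_bucket *b = token_bucket_of(token);

		/* A long chain or a duplicate token is a failed attempt. */
		if (b->chain_len >= TOKEN_MAX_CHAIN || token_lookup(token))
			continue;
		e->token = token;
		e->next = b->head;
		b->head = e;
		b->chain_len++;
		return true;
	}
	return false;
}

/* Demo generator: every token lands in bucket 0 to force long chains. */
static uint32_t demo_seed;

static uint32_t demo_next_token(void)
{
	demo_seed += TOKEN_BUCKETS;
	return demo_seed;
}
```

With the demo generator, the first four insertions succeed and the fifth
gives up after TOKEN_MAX_RETRIES attempts, which is the point where a caller
would fall back instead of looping indefinitely.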

v1 -> v2:
 - fix "no newline at end of file" - Jakub

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomptcp: add __init annotation on setup functions
Paolo Abeni [Fri, 26 Jun 2020 17:29:59 +0000 (19:29 +0200)]
mptcp: add __init annotation on setup functions

Add the missing annotation in some setup-only
functions.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'net-organize-driver-docs-by-device-type'
David S. Miller [Fri, 26 Jun 2020 23:08:45 +0000 (16:08 -0700)]
Merge branch 'net-organize-driver-docs-by-device-type'

Jakub Kicinski says:

====================
net: organize driver docs by device type

This series finishes off what I started in
commit b255e500c8dc ("net: documentation: build a directory structure for drivers").
The objective is to de-clutter our documentation folder so folks
have a chance of finding relevant info. I _think_ I got all the
driver docs from the main documentation directory this time around.

While doing this I realized that many of them are of limited relevance
these days, so I went ahead and sliced the drivers directory by
technology. Those feeling nostalgic are free to dive into the FDDI,
ATM etc. docs, but for most Ethernet is what we care about.

v1:
 - simplify Intel's docs list in MAINTAINERS.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodocs: networking: move FDDI drivers to the hw driver section
Jakub Kicinski [Fri, 26 Jun 2020 17:27:31 +0000 (10:27 -0700)]
docs: networking: move FDDI drivers to the hw driver section

Move docs for defza and skfp under device_drivers/fddi.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodocs: networking: move ATM drivers to the hw driver section
Jakub Kicinski [Fri, 26 Jun 2020 17:27:30 +0000 (10:27 -0700)]
docs: networking: move ATM drivers to the hw driver section

Move docs for cxacru, fore200e and iphase under device_drivers/atm.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodocs: networking: move AppleTalk / LocalTalk drivers to the hw driver section
Jakub Kicinski [Fri, 26 Jun 2020 17:27:29 +0000 (10:27 -0700)]
docs: networking: move AppleTalk / LocalTalk drivers to the hw driver section

Move docs for cops and ltpc under device_drivers/appletalk.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodocs: networking: move remaining Ethernet driver docs to the hw section
Jakub Kicinski [Fri, 26 Jun 2020 17:27:28 +0000 (10:27 -0700)]
docs: networking: move remaining Ethernet driver docs to the hw section

Move docs for hinic and altera_tse under device_drivers/ethernet.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodocs: networking: move ray_cs to the hw driver section
Jakub Kicinski [Fri, 26 Jun 2020 17:27:27 +0000 (10:27 -0700)]
docs: networking: move ray_cs to the hw driver section

Move ray_cs into Wi-Fi driver docs subdirectory.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodocs: networking: move baycom to the hw driver section
Jakub Kicinski [Fri, 26 Jun 2020 17:27:26 +0000 (10:27 -0700)]
docs: networking: move baycom to the hw driver section

Move baycom to hamradio.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodocs: networking: move z8530 to the hw driver section
Jakub Kicinski [Fri, 26 Jun 2020 17:27:25 +0000 (10:27 -0700)]
docs: networking: move z8530 to the hw driver section

Move z8530 docs to hamradio and wan subdirectories.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodocs: networking: reorganize driver documentation again
Jakub Kicinski [Fri, 26 Jun 2020 17:27:24 +0000 (10:27 -0700)]
docs: networking: reorganize driver documentation again

Organize driver documentation by device type. Most documents
have fairly verbose yet uninformative names, so let users
first select a well defined device type, and then search for
a particular driver.

While at it rename the section from Vendor drivers to
Hardware drivers. This seems more accurate, besides people
sometimes refer to out-of-tree drivers as vendor drivers.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'net-phy-relax-PHY-and-MDIO-reset-handling'
David S. Miller [Fri, 26 Jun 2020 20:40:18 +0000 (13:40 -0700)]
Merge branch 'net-phy-relax-PHY-and-MDIO-reset-handling'

Bartosz Golaszewski says:

====================
net: phy: relax PHY and MDIO reset handling

Previously these patches were submitted as part of a larger series[1]
but since the approach in it will have to be reworked I'm resending
the ones that were non-controversial and have been reviewed for upstream.

Florian suggested a better solution for managing multiple resets. While
I will definitely try to implement something at the driver model's bus
level (together with regulator support), the 'resets' and 'reset-gpios'
DT properties are a stable ABI defined in mdio.yaml so improving their support
is in order as we'll have to stick with it anyway. Current implementation
contains an unnecessary limitation where drivers without probe() can't
define resets.

Changes from the previous version:
- order forward declarations in patch 4 alphabetically
- collect review tags

[1] https://lkml.org/lkml/2020/6/22/253
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: mdio: reset MDIO devices even if probe() is not implemented
Bartosz Golaszewski [Fri, 26 Jun 2020 15:53:25 +0000 (17:53 +0200)]
net: phy: mdio: reset MDIO devices even if probe() is not implemented

Similarly to PHY drivers, there's no reason to require probe() to be
implemented in order to call mdio_device_reset(). MDIO devices can have
resets defined without needing to do anything in probe().

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: reset the PHY even if probe() is not implemented
Bartosz Golaszewski [Fri, 26 Jun 2020 15:53:24 +0000 (17:53 +0200)]
net: phy: reset the PHY even if probe() is not implemented

Currently we only call phy_device_reset() if the PHY driver implements
the probe() callback. This is not mandatory and many drivers (e.g.
realtek) don't need probe() for most devices but still can have reset
GPIOs defined. There's no reason to depend on the presence of probe()
here so pull the reset code out of the if clause.
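
The shape of the change can be sketched as below. The structures and the
demo_reset()/demo_probe() helpers are hypothetical stand-ins, not the phylib
API; the sketch only shows the reset being handled outside the check for
probe().

```c
#include <stddef.h>

struct phy_dev;

/* Hypothetical driver with an optional probe() callback. */
struct phy_drv {
	int (*probe)(struct phy_dev *dev);
};

/* Hypothetical device; reset_asserted models the reset line state. */
struct phy_dev {
	const struct phy_drv *drv;
	int reset_asserted;
};

static void demo_reset(struct phy_dev *dev, int value)
{
	dev->reset_asserted = value;
}

static int demo_probe(struct phy_dev *dev)
{
	int err = 0;

	/* Deassert reset unconditionally, whether or not probe() exists. */
	demo_reset(dev, 0);

	/* probe() stays optional and no longer gates the reset handling. */
	if (dev->drv->probe)
		err = dev->drv->probe(dev);

	return err;
}
```

A driver that leaves probe() unset still gets its device taken out of reset.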

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mdio: add a forward declaration for reset_control to mdio.h
Bartosz Golaszewski [Fri, 26 Jun 2020 15:53:23 +0000 (17:53 +0200)]
net: mdio: add a forward declaration for reset_control to mdio.h

This header refers to struct reset_control but doesn't include any reset
header. The structure definition is probably somehow indirectly pulled in
since no warnings are reported but for the sake of correctness add the
forward declaration for struct reset_control.
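
As a short illustration of why a forward declaration is sufficient here: a
header that only stores a pointer to a structure does not need its full
definition. The demo_mdio_device structure and demo_has_reset() are
hypothetical names used for this sketch.

```c
/* Forward declaration: enough to declare pointers to the type. */
struct reset_control;

/* Hypothetical device structure that only holds a pointer. */
struct demo_mdio_device {
	int addr;
	struct reset_control *reset_ctrl; /* pointer to an incomplete type */
};

/* Works without ever seeing the definition of struct reset_control. */
static int demo_has_reset(const struct demo_mdio_device *dev)
{
	return dev->reset_ctrl != 0;
}
```

Only code that dereferences the pointer needs the full definition, so the
header no longer relies on some other include pulling it in indirectly.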

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: arrange headers in phy_device.c alphabetically
Bartosz Golaszewski [Fri, 26 Jun 2020 15:53:22 +0000 (17:53 +0200)]
net: phy: arrange headers in phy_device.c alphabetically

Keeping the headers in alphabetical order is better for readability and
makes it easy to see whether a given header is already included.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: arrange headers in mdio_device.c alphabetically
Bartosz Golaszewski [Fri, 26 Jun 2020 15:53:21 +0000 (17:53 +0200)]
net: phy: arrange headers in mdio_device.c alphabetically

Keeping the headers in alphabetical order is better for readability and
makes it easy to see whether a given header is already included.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: arrange headers in mdio_bus.c alphabetically
Bartosz Golaszewski [Fri, 26 Jun 2020 15:53:20 +0000 (17:53 +0200)]
net: phy: arrange headers in mdio_bus.c alphabetically

Keeping the headers in alphabetical order is better for readability and
makes it easy to see whether a given header is already included.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mvneta: speed down the PHY, if WoL used, to save energy
Daniel González Cabanelas [Fri, 26 Jun 2020 15:18:19 +0000 (17:18 +0200)]
net: mvneta: speed down the PHY, if WoL used, to save energy

Some PHYs connected to this ethernet hardware support the WoL feature.
But when WoL is enabled and the machine is powered off, the PHY remains
waiting for a magic packet at max speed (i.e. 1Gbps), which is a waste of
energy.

Slow down the PHY speed before stopping the ethernet if WoL is enabled,
and save some energy while the machine is powered off or sleeping.

Tested using an Armada 370 based board (LS421DE) equipped with a Marvell
88E1518 PHY.

Signed-off-by: Daniel González Cabanelas <dgcbueu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Fri, 26 Jun 2020 19:22:34 +0000 (12:22 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2020-06-25

This series contains updates to i40e driver and removes the individual
driver versions from all of the Intel wired LAN drivers.

Shiraz moves the client header so that it can easily be shared between
the i40e LAN driver and i40iw RDMA driver.

Jesse cleans up the unused defines, since they are just dead weight.

Alek reduces the unreasonably long wait time for a PF reset after reboot
by using jiffies to limit the maximum wait time for the PF reset to
succeed.  Adds additional logging to let the user know when the driver
transitions into recovery mode.  Adds new device support for our 5 Gbps
NICs.

Todd adds a check to see if MFS is set after warm reboot and notifies
the user when MFS is set to anything lower than the default value.

Arkadiusz fixes a possible race condition, where we were holding a
spin-lock while in atomic context.

v2: removed code comments that were no longer applicable in patch 2 of
    the series.  Also removed 'inline' from patch 4 and patch 8 of the
    series.  Also re-arranged code to be able to remove the forward
    function declarations.  Dropped patch 9 of the series, while the
    author works on cleaning up the commit message.
v3: Updated patch 8 description to answer Jakub's questions
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobonding: Remove extraneous parentheses in bond_setup
Nathan Chancellor [Fri, 26 Jun 2020 04:10:02 +0000 (21:10 -0700)]
bonding: Remove extraneous parentheses in bond_setup

Clang warns:

drivers/net/bonding/bond_main.c:4657:23: warning: equality comparison
with extraneous parentheses [-Wparentheses-equality]
        if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
             ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~

drivers/net/bonding/bond_main.c:4681:23: warning: equality comparison
with extraneous parentheses [-Wparentheses-equality]
        if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
             ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~

This warning occurs when a comparison has two sets of parentheses,
which is usually the convention for doing an assignment within an
if statement. Since equality comparisons do not need a second set of
parentheses, remove them to fix the warning.
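
As an illustration only (not the bond_main.c code itself), the pattern
can be sketched in standalone C. The enum and function names below are
hypothetical stand-ins for BOND_MODE() and its constants.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bonding mode constants. */
enum bond_mode_sketch { MODE_ROUNDROBIN = 0, MODE_ACTIVEBACKUP = 1 };

static bool is_active_backup(enum bond_mode_sketch mode)
{
	/*
	 * Writing "if ((mode == MODE_ACTIVEBACKUP))" here is what Clang
	 * flags with -Wparentheses-equality: doubled parentheses are the
	 * usual way to mark an intentional assignment, as in
	 * "if ((ret = do_something()))".  A plain comparison needs only
	 * one set.
	 */
	if (mode == MODE_ACTIVEBACKUP)
		return true;

	return false;
}
```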

Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves")
Link: https://github.com/ClangBuiltLinux/linux/issues/1066
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: dwmac-meson8b: use clk_parent_data for clock registration
Martin Blumenstingl [Thu, 25 Jun 2020 18:21:42 +0000 (20:21 +0200)]
net: stmmac: dwmac-meson8b: use clk_parent_data for clock registration

Simplify meson8b_init_rgmii_tx_clk() by using struct clk_parent_data to
initialize the clock parents. No functional changes intended.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnx2x: use generic power management
Vaibhav Gupta [Wed, 24 Jun 2020 17:51:17 +0000 (23:21 +0530)]
bnx2x: use generic power management

With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

The driver was also calling bnx2x_set_power_state() to set the power state
of the device by changing the values of the device's registers. This is no
longer needed.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Acked-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoi40e: Remove scheduling while atomic possibility
Aleksandr Loktionov [Fri, 29 May 2020 21:10:39 +0000 (14:10 -0700)]
i40e: Remove scheduling while atomic possibility

In some cases a task held a spinlock (mac_filter_hash_lock) while
being rescheduled on the admin queue mutex_lock.  The admin queue lock
is declared as struct i40e_spinlock asq_spinlock, which later expands
to a struct mutex.  Move i40e_aq_set_vsi_multicast_promiscuous(),
i40e_aq_set_vsi_unicast_promiscuous(),
i40e_aq_set_vsi_mc_promisc_on_vlan(), and
i40e_aq_set_vsi_uc_promisc_on_vlan() outside of atomic context.  Without
this patch there is a race condition, which might result in scheduling
while in atomic context.  The race is between a thread that holds
mac_filter_hash_lock while trying to acquire the admin queue mutex and a
thread that already holds that mutex.  The thread holding the spinlock
fails to acquire the mutex and is put to sleep.

Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoi40e: Add support for 5Gbps cards
Aleksandr Loktionov [Fri, 29 May 2020 20:01:22 +0000 (13:01 -0700)]
i40e: Add support for 5Gbps cards

Make it possible for the i40e driver to bind to the new v710 for 5GBASE-T
NICs.

Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agonet/intel: remove driver versions from Intel drivers
Jeff Kirsher [Fri, 29 May 2020 07:18:33 +0000 (00:18 -0700)]
net/intel: remove driver versions from Intel drivers

As with other networking drivers, remove the unnecessary driver version
from the Intel drivers. The ethtool driver information and module version
will then report the kernel version instead.

For ixgbe, i40e and ice drivers, the driver passes the driver version to
the firmware to confirm that we are up and running.  So we now pass the
value of UTS_RELEASE to the firmware.  This adminq call is required per
the HAS document.  The Device then sends an indication to the BMC that the
PF driver is present. This is done using Host NC Driver Status Indication
in NC-SI Get Link command or via the Host Network Controller Driver Status
Change AEN.

What the BMC may do with this information is implementation-dependent, but
this is a standard NC-SI 1.1 command we honor per the HAS.

CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Alek Loktionov <aleksandr.loktionov@intel.com>
CC: Kevin Liedtke <kevin.d.liedtke@intel.com>
CC: Aaron Rowden <aaron.f.rowden@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
4 years agoi40e: Add a check to see if MFS is set
Todd Fujinaka [Fri, 29 May 2020 05:27:12 +0000 (22:27 -0700)]
i40e: Add a check to see if MFS is set

A customer was chain-booting to provision his systems and one of the
steps was setting MFS. MFS isn't cleared by normal warm reboots
(clearing requires a GLOBR) and there was no indication of why Jumbo
Frame receives were failing.

Add a warning if MFS is set to anything lower than the default.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoi40e: detect and log info about pre-recovery mode
Piotr Kwapulinski [Wed, 27 May 2020 21:12:04 +0000 (14:12 -0700)]
i40e: detect and log info about pre-recovery mode

Detect and log information about pre-recovery mode when firmware
transitions to recovery mode.
When firmware transitions to recovery mode it stores the number
of unexpected EMP resets in one of its registers. An EMP reset count
in the range 0x21 to 0x2A indicates that the FW is transitioning
to recovery mode. Use these values to emit a log entry about the
transition. Previously, pre-recovery mode might not have been detected
and there was no log entry when the NIC was in pre-recovery mode.
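
A minimal sketch of the range check described above, assuming the bounds
0x21 and 0x2A are inclusive. The macro and function names are
hypothetical and are not taken from the i40e sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bounds from the commit message; names are hypothetical. */
#define EMP_RESET_CNT_PRE_RECOVERY_MIN 0x21u
#define EMP_RESET_CNT_PRE_RECOVERY_MAX 0x2Au

/* Returns true when the EMP reset count signals pre-recovery mode. */
static bool fw_in_pre_recovery(uint32_t emp_reset_count)
{
	return emp_reset_count >= EMP_RESET_CNT_PRE_RECOVERY_MIN &&
	       emp_reset_count <= EMP_RESET_CNT_PRE_RECOVERY_MAX;
}
```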

Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoi40e: make PF wait reset loop reliable
Piotr Kwapulinski [Tue, 26 May 2020 10:51:12 +0000 (12:51 +0200)]
i40e: make PF wait reset loop reliable

Use jiffies to limit the maximum waiting time for the PF reset to succeed.
The previous wait loop was unreliable: it took an unreasonably long time
to wait for the PF reset after reboot when the NIC was about to enter
recovery mode.
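
The idea of bounding the wait by a deadline rather than by an iteration
count can be sketched in userspace C as follows. This is not the driver
code: the tick source stands in for jiffies, and all names (tick_fn,
done_fn, wait_for_reset, struct fake_hw) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t (*tick_fn)(void *ctx);	/* stands in for reading jiffies */
typedef bool (*done_fn)(void *ctx);	/* "has the reset completed?" */

/*
 * Poll until reset_done() reports success or timeout_ticks have elapsed.
 * Counter wraparound is ignored in this sketch; kernel code would use
 * the time_before()/time_after() helpers for that.
 */
static bool wait_for_reset(tick_fn now, done_fn reset_done, void *ctx,
			   uint64_t timeout_ticks)
{
	uint64_t deadline = now(ctx) + timeout_ticks;

	do {
		if (reset_done(ctx))
			return true;
		/* A real driver would sleep briefly here. */
	} while (now(ctx) < deadline);

	/* One last check in case the reset finished right at the deadline. */
	return reset_done(ctx);
}

/* Fake hardware for demonstration: time advances on every clock read. */
struct fake_hw {
	uint64_t ticks;
	uint64_t ready_at;
};

static uint64_t fake_now(void *ctx)
{
	struct fake_hw *hw = ctx;

	return hw->ticks++;
}

static bool fake_reset_done(void *ctx)
{
	struct fake_hw *hw = ctx;

	return hw->ticks >= hw->ready_at;
}
```

With this shape the total wait is capped by elapsed time, no matter how
long each individual poll takes.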

Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoi40e: remove unused defines
Jesse Brandeburg [Tue, 7 Jan 2020 00:09:33 +0000 (16:09 -0800)]
i40e: remove unused defines

Remove all the unused defines as they are just dead weight.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoi40e: Move client header location
Shiraz Saleem [Mon, 4 May 2020 16:43:48 +0000 (09:43 -0700)]
i40e: Move client header location

Move i40e_client.h to include/linux/net/intel/*
since it's shared between i40iw and i40e.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
David S. Miller [Fri, 26 Jun 2020 02:29:51 +0000 (19:29 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Minor overlapping changes in xfrm_device.c, between the double
ESP trailing bug fix setting the XFRM_INIT flag and the changes
in net-next preparing for bonding encryption support.

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Fri, 26 Jun 2020 01:27:40 +0000 (18:27 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Pull networking fixes from David Miller:

 1) Don't insert ESP trailer twice in IPSEC code, from Huy Nguyen.

 2) The default crypto algorithm selection in Kconfig for IPSEC is out
    of touch with modern reality, fix this up. From Eric Biggers.

 3) bpftool is missing an entry for BPF_MAP_TYPE_RINGBUF, from Andrii
    Nakryiko.

 4) Missing init of ->frame_sz in xdp_convert_zc_to_xdp_frame(), from
    Hangbin Liu.

 5) Adjust packet alignment handling in ax88179_178a driver to match
    what the hardware actually does. From Jeremy Kerr.

 6) register_netdevice can leak in case one of the notifiers fails,
    from Yang Yingliang.

 7) Use after free in ip_tunnel_lookup(), from Taehee Yoo.

 8) VLAN checks in sja1105 DSA driver need adjustments, from Vladimir
    Oltean.

 9) tg3 driver can sleep forever when we get enough EEH errors, fix from
    David Christensen.

10) Missing {READ,WRITE}_ONCE() annotations in various Intel ethernet
    drivers, from Ciara Loftus.

11) Fix scanning loop break condition in of_mdiobus_register(), from
    Florian Fainelli.

12) MTU limit is incorrect in ibmveth driver, from Thomas Falcon.

13) Endianness fix in mlxsw, from Ido Schimmel.

14) Use after free in smsc95xx usbnet driver, from Tuomas Tynkkynen.

15) Missing bridge mrp configuration validation, from Horatiu Vultur.

16) Fix circular netns references in wireguard, from Jason A. Donenfeld.

17) PTP initialization on recovery is not done properly in qed driver,
    from Alexander Lobakin.

18) Endian conversion of L4 ports in filters of cxgb4 driver is wrong,
    from Rahul Lakkireddy.

19) Don't clear bound device TX queue of socket prematurely otherwise we
    get problems with ktls hw offloading, from Tariq Toukan.

20) ipset can do atomics on unaligned memory, fix from Russell King.

21) Align ethernet addresses properly in bridging code, from Thomas
    Martitz.

22) Don't advertise ipv4 addresses on SCTP sockets having ipv6only set,
    from Marcelo Ricardo Leitner.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (149 commits)
  rds: transport module should be auto loaded when transport is set
  sch_cake: fix a few style nits
  sch_cake: don't call diffserv parsing code when it is not needed
  sch_cake: don't try to reallocate or unshare skb unconditionally
  ethtool: fix error handling in linkstate_prepare_data()
  wil6210: account for napi_gro_receive never returning GRO_DROP
  hns: do not cast return value of napi_gro_receive to null
  socionext: account for napi_gro_receive never returning GRO_DROP
  wireguard: receive: account for napi_gro_receive never returning GRO_DROP
  vxlan: fix last fdb index during dump of fdb with nhid
  sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket
  tc-testing: avoid action cookies with odd length.
  bpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
  tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
  net: dsa: sja1105: fix tc-gate schedule with single element
  net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules
  net: dsa: sja1105: unconditionally free old gating config
  net: dsa: sja1105: move sja1105_compose_gating_subschedule at the top
  net: macb: free resources on failure path of at91ether_open()
  net: macb: call pm_runtime_put_sync on failure path
  ...

4 years agosch_cake: add RFC 8622 LE PHB support to CAKE diffserv handling
Kevin Darbyshire-Bryant [Thu, 25 Jun 2020 20:18:00 +0000 (22:18 +0200)]
sch_cake: add RFC 8622 LE PHB support to CAKE diffserv handling

Change tin mapping on diffserv3, 4 & 8 for LE PHB support, in essence
making LE a member of the Bulk tin.

Bulk has the least priority and minimum of 1/16th total bandwidth in the
face of higher priority traffic.

NB: Diffserv 3 & 4 swap tin 0 & 1 priorities from the default order as
found in diffserv8, in case anyone is wondering why it looks a bit odd.
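
As a rough sketch of the two facts above (LE lands in the Bulk tin, and
Bulk keeps at least 1/16th of the bandwidth), assuming the RFC 8622 LE
codepoint 000001. The tin names and helpers are hypothetical and do not
reproduce the sch_cake lookup tables.

```c
#include <stdint.h>

#define DSCP_LE 0x01u	/* RFC 8622 Lower-Effort codepoint (000001) */

/* Hypothetical tin names; CAKE's real tables map all 64 codepoints. */
enum tin_sketch { TIN_BULK, TIN_OTHER };

static enum tin_sketch classify_dscp(uint8_t dscp)
{
	return dscp == DSCP_LE ? TIN_BULK : TIN_OTHER;
}

/* Minimum share kept for Bulk: 1/16th of the total rate. */
static uint64_t bulk_min_rate(uint64_t total_bps)
{
	return total_bps >> 4;
}
```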

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
[ reword commit message slightly ]
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agords: transport module should be auto loaded when transport is set
Rao Shoaib [Thu, 25 Jun 2020 20:46:00 +0000 (13:46 -0700)]
rds: transport module should be auto loaded when transport is set

This enhancement auto loads transport module when the transport
is set via SO_RDS_TRANSPORT socket option.

Reviewed-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'sched-A-couple-of-fixes-for-sch_cake'
David S. Miller [Thu, 25 Jun 2020 23:24:05 +0000 (16:24 -0700)]
Merge branch 'sched-A-couple-of-fixes-for-sch_cake'

Toke Høiland-Jørgensen says:

====================
sched: A couple of fixes for sch_cake

This series contains a couple of fixes for diffserv handling in sch_cake that
provide a nice speedup (with a somewhat pedantic nit fix tacked on to the end).

Not quite sure about whether this should go to stable; it does provide a nice
speedup, but it's not strictly a fix in the "correctness" sense. I lean towards
including this in stable as well, since our most important consumer of that
(OpenWrt) is likely to backport the series anyway.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosch_cake: fix a few style nits
Toke Høiland-Jørgensen [Thu, 25 Jun 2020 20:12:09 +0000 (22:12 +0200)]
sch_cake: fix a few style nits

I spotted a few nits when comparing the in-tree version of sch_cake with
the out-of-tree one: A redundant error variable declaration shadowing an
outer declaration, and an indentation alignment issue. Fix both of these.

Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosch_cake: don't call diffserv parsing code when it is not needed
Toke Høiland-Jørgensen [Thu, 25 Jun 2020 20:12:08 +0000 (22:12 +0200)]
sch_cake: don't call diffserv parsing code when it is not needed

As a further optimisation of the diffserv parsing codepath, we can skip it
entirely if CAKE is configured to neither use diffserv-based
classification, nor to zero out the diffserv bits.

Fixes: c87b4ecdbe8d ("sch_cake: Make sure we can write the IP header before changing DSCP bits")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosch_cake: don't try to reallocate or unshare skb unconditionally
Ilya Ponetayev [Thu, 25 Jun 2020 20:12:07 +0000 (22:12 +0200)]
sch_cake: don't try to reallocate or unshare skb unconditionally

cake_handle_diffserv() tries to linearize mac and network header parts of
skb and to make it writable unconditionally. In some cases it leads to full
skb reallocation, which reduces throughput and increases CPU load. Some
measurements of IPv4 forward + NAPT on a MIPS router with a 580 MHz
single-core CPU were conducted. It appears that on kernel 4.9 skb_try_make_writable()
reallocates skb, if skb was allocated in ethernet driver via so-called
'build skb' method from page cache (it was discovered by strange increase
of kmalloc-2048 slab at first).

Obtain DSCP value via read-only skb_header_pointer() call, and leave
linearization only for DSCP bleaching or ECN CE setting. And, as an
additional optimisation, skip diffserv parsing entirely if it is not needed
by the current configuration.
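
The read-only idea can be illustrated outside the kernel: the DSCP value
is available from a const view of the first two header bytes, so no
writable copy is needed just to classify. The sketch below is a userspace
analogue, not the sch_cake code. read_dscp() is a hypothetical helper and
the buffer is assumed to start at the network header.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the 6-bit DSCP, or -1 if the buffer is too short or not IP. */
static int read_dscp(const uint8_t *l3, size_t len)
{
	if (len < 2)
		return -1;

	switch (l3[0] >> 4) {
	case 4:
		/* IPv4: the DS field is the second header byte. */
		return l3[1] >> 2;
	case 6:
		/*
		 * IPv6: the traffic class straddles the low nibble of
		 * byte 0 and the high nibble of byte 1.
		 */
		return (((l3[0] & 0x0f) << 4) | (l3[1] >> 4)) >> 2;
	default:
		return -1;
	}
}
```

Only the paths that rewrite the header (DSCP bleaching, ECN CE marking)
would then need a writable, linearized buffer.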

Fixes: c87b4ecdbe8d ("sch_cake: Make sure we can write the IP header before changing DSCP bits")
Signed-off-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
[ fix a few style issues, reflow commit message ]
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'net-phy-mscc-multiple-improvements'
David S. Miller [Thu, 25 Jun 2020 23:22:11 +0000 (16:22 -0700)]
Merge branch 'net-phy-mscc-multiple-improvements'

Antoine Tenart says:

====================
net: phy: mscc: multiple improvements

This series contains various improvements to the MSCC PHY driver, fixing
sparse and smatch warnings, using functions provided by the PHY core,
and improving the driver consistency and maintenance.

I don't think any of those improvements and fixes is worth backporting
to stable trees.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: mscc: improve vsc8514/8584_config_init consistency
Antoine Tenart [Thu, 25 Jun 2020 15:42:11 +0000 (17:42 +0200)]
net: phy: mscc: improve vsc8514/8584_config_init consistency

All PHY read and write return values are checked for errors in
vsc8514_config_init and vsc8584_config_init, except for one. Fix this.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: mscc: remove useless page configuration in the config init
Antoine Tenart [Thu, 25 Jun 2020 15:42:10 +0000 (17:42 +0200)]
net: phy: mscc: remove useless page configuration in the config init

In the middle of vsc8584_config_init and vsc8514_config_init, the page
is set to 'standard'. This is the default value, and the page isn't set
to another value before. Those page configurations can be safely
removed.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: mscc: restore the base page in vsc8514/8584_config_init
Antoine Tenart [Thu, 25 Jun 2020 15:42:09 +0000 (17:42 +0200)]
net: phy: mscc: restore the base page in vsc8514/8584_config_init

In the vsc8584_config_init and vsc8514_config_init, the base page is set
to 'GPIO', configuration is done, and the page is never explicitly
restored to the standard page. No bug was triggered as it turns out
helpers called in those config_init functions do modify the base page,
and set it back to standard. But that is dangerous and any modification
to those functions would introduce bugs. This patch fixes this, to
improve maintenance, by restoring the base page to 'standard' once
'GPIO' accesses are completed.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: mscc: do not access the MDIO bus lock directly
Antoine Tenart [Thu, 25 Jun 2020 15:42:08 +0000 (17:42 +0200)]
net: phy: mscc: do not access the MDIO bus lock directly

This patch improves the MSCC driver by using the provided
phy_lock_mdio_bus and phy_unlock_mdio_bus helpers instead of locking and
unlocking the MDIO bus lock directly. The patch is only cosmetic but
should improve maintenance and consistency.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: mscc: ptp: fix a typo in a comment
Antoine Tenart [Thu, 25 Jun 2020 15:42:07 +0000 (17:42 +0200)]
net: phy: mscc: ptp: fix a typo in a comment

This patch fixes a typo in a comment, s/Ths/This/. The patch is cosmetic
only.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: mscc: ptp: fix a smatch error
Antoine Tenart [Thu, 25 Jun 2020 15:42:06 +0000 (17:42 +0200)]
net: phy: mscc: ptp: fix a smatch error

The following error was reported by smatch:
vsc85xx_ts_read_csr() error: uninitialized symbol 'blk_hw'.

In practice this is very unlikely, as all the block identifiers given to
this function are handled and described in an enum. The smatch error is
fixed by doing what is already done in vsc85xx_ts_write_csr: using the
"PROCESSOR" block by default.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: mscc: fix a possible double unlock
Antoine Tenart [Thu, 25 Jun 2020 15:42:05 +0000 (17:42 +0200)]
net: phy: mscc: fix a possible double unlock

On vsc8584_ptp_init failure we jump to the 'err' label, which unlocks
the MDIO bus lock. But vsc8584_ptp_init isn't called with the MDIO bus
lock taken, which could result in a double unlock. Fix this.
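
The shape of the problem can be sketched with a counting stand-in for
the MDIO bus lock. Everything here (struct fake_bus, config_init_sketch
and its parameters) is hypothetical and only illustrates the error-path
rule: a step that runs without the lock must not jump to a label that
unlocks.

```c
#include <stdbool.h>

/* Hypothetical lock that just counts how many times it is held. */
struct fake_bus {
	int lock_depth;
};

static void bus_lock(struct fake_bus *b)   { b->lock_depth++; }
static void bus_unlock(struct fake_bus *b) { b->lock_depth--; }

static int config_init_sketch(struct fake_bus *bus, bool locked_step_fails,
			      bool unlocked_step_fails)
{
	int ret = 0;

	bus_lock(bus);
	if (locked_step_fails) {
		ret = -1;
		goto err;	/* lock is held: err must drop it */
	}
	bus_unlock(bus);

	/* This step runs without the lock, so it must not "goto err". */
	if (unlocked_step_fails)
		return -1;

	return 0;

err:
	bus_unlock(bus);
	return ret;
}
```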

Fixes: ab2bf9339357 ("net: phy: mscc: 1588 block initialization")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: mscc: macsec: fix sparse warnings
Antoine Tenart [Thu, 25 Jun 2020 15:42:04 +0000 (17:42 +0200)]
net: phy: mscc: macsec: fix sparse warnings

This patch fixes the following sparse warnings when building MACsec
support in the MSCC PHY driver.

  mscc_macsec.c:393:42: warning: cast from restricted sci_t
  mscc_macsec.c:395:42: warning: restricted sci_t degrades to integer
  mscc_macsec.c:402:42: warning: restricted __be16 degrades to integer
  mscc_macsec.c:608:34: warning: cast from restricted sci_t
  mscc_macsec.c:610:34: warning: restricted sci_t degrades to integer

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoethtool: fix error handling in linkstate_prepare_data()
Michal Kubecek [Wed, 24 Jun 2020 22:09:08 +0000 (00:09 +0200)]
ethtool: fix error handling in linkstate_prepare_data()

When getting SQI or maximum SQI value fails in linkstate_prepare_data(), we
must not return without calling ethnl_ops_complete(dev) as that could
result in imbalance between ethtool_ops ->begin() and ->complete() calls.
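
A minimal sketch of the balanced pattern, using a fake device that
counts begin/complete calls. The names (struct fake_dev, ops_begin,
ops_complete, get_sqi, prepare_data_sketch) are hypothetical and do not
reproduce the ethtool netlink code.

```c
struct fake_dev {
	int begin_calls;
	int complete_calls;
	int sqi_error;		/* non-zero makes get_sqi() fail */
};

static int ops_begin(struct fake_dev *d)
{
	d->begin_calls++;
	return 0;
}

static void ops_complete(struct fake_dev *d)
{
	d->complete_calls++;
}

static int get_sqi(struct fake_dev *d)
{
	return d->sqi_error ? -5 : 3;
}

static int prepare_data_sketch(struct fake_dev *dev, int *sqi)
{
	int ret;

	ret = ops_begin(dev);
	if (ret < 0)
		return ret;	/* nothing to undo yet */

	ret = get_sqi(dev);
	if (ret < 0)
		goto out;	/* not "return ret": complete must still run */

	*sqi = ret;
	ret = 0;
out:
	ops_complete(dev);
	return ret;
}
```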

Fixes: 806602191592 ("ethtool: provide UAPI for PHY Signal Quality Index (SQI)")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'trace-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Thu, 25 Jun 2020 23:16:49 +0000 (16:16 -0700)]
Merge tag 'trace-v5.8-rc2' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Four small fixes:

   - Fix a ringbuffer bug for nested events having time go backwards

   - Fix a config dependency for boot time tracing to depend on
     synthetic events instead of histograms.

   - Fix trigger format parsing to handle multiple spaces

   - Fix bootconfig to handle failures in multiple events"

* tag 'trace-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/boottime: Fix kprobe multiple events
  tracing: Fix event trigger to accept redundant spaces
  tracing/boot: Fix config dependency for synthedic event
  ring-buffer: Zero out time extend if it is nested and not absolute

4 years agoMerge branch 'napi_gro_receive-caller-return-value-cleanups'
David S. Miller [Thu, 25 Jun 2020 23:16:21 +0000 (16:16 -0700)]
Merge branch 'napi_gro_receive-caller-return-value-cleanups'

Jason A. Donenfeld says:

====================
napi_gro_receive caller return value cleanups

In 6570bc79c0df ("net: core: use listified Rx for GRO_NORMAL in
napi_gro_receive()"), the GRO_NORMAL case stopped calling
netif_receive_skb_internal, checking its return value, and returning
GRO_DROP in case it failed. Instead, it calls into
netif_receive_skb_list_internal (after a bit of indirection), which
doesn't return any error. Therefore, napi_gro_receive will never return
GRO_DROP, making handling GRO_DROP dead code.

I emailed the author of 6570bc79c0df on netdev [1] to see if this change
was intentional, but the dlink.ru email address has been disconnected,
and looking a bit further myself, it seems somewhat infeasible to start
propagating return values backwards from the internal machinations of
netif_receive_skb_list_internal.

Taking a look at all the callers of napi_gro_receive, it appears that
three are checking the return value for the purpose of comparing it to
the now never-happening GRO_DROP, and one just casts it to (void), a
likely historical leftover. Every other of the 120 callers does not
bother checking the return value.

And it seems like these remaining 116 callers are doing the right thing:
after calling napi_gro_receive, the packet is now in the hands of the
upper layers of the networking stack, and the device driver itself has no
business now making decisions based on what the upper layers choose to
do. Incrementing stats counters on GRO_DROP seems like a mistake, made
by these three drivers, but not by the remaining 117.

It would seem, therefore, that after rectifying these four callers of
napi_gro_receive, that I should go ahead and just remove returning the
value from napi_gro_receive altogether. However, napi_gro_receive has
a function event tracer, and being able to introspect into the
networking stack to see how often napi_gro_receive is returning whatever
interesting GRO status (aside from _DROP) remains an interesting
data point worth keeping for debugging.

So, this series simply gets rid of the return value checking for the
four useless places where that check never evaluates to anything
meaningful.

[1] https://lore.kernel.org/netdev/20200624210606.GA1362687@zx2c4.com/
====================

Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agowil6210: account for napi_gro_receive never returning GRO_DROP
Jason A. Donenfeld [Wed, 24 Jun 2020 22:06:06 +0000 (16:06 -0600)]
wil6210: account for napi_gro_receive never returning GRO_DROP

The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands. In this case, too, the non-gro path didn't bother checking
the return value. Plus, this had some clunky debugging functions that
duplicated code from elsewhere and was generally pretty messy. So, this
commit cleans that all up too.

Fixes: 6570bc79c0df ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agohns: do not cast return value of napi_gro_receive to null
Jason A. Donenfeld [Wed, 24 Jun 2020 22:06:05 +0000 (16:06 -0600)]
hns: do not cast return value of napi_gro_receive to null

Basically no drivers care about the return value here, and there's no
__must_check that would make casting to void sensible, so remove it.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosocionext: account for napi_gro_receive never returning GRO_DROP
Jason A. Donenfeld [Wed, 24 Jun 2020 22:06:04 +0000 (16:06 -0600)]
socionext: account for napi_gro_receive never returning GRO_DROP

The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands.

Fixes: 6570bc79c0df ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agowireguard: receive: account for napi_gro_receive never returning GRO_DROP
Jason A. Donenfeld [Wed, 24 Jun 2020 22:06:03 +0000 (16:06 -0600)]
wireguard: receive: account for napi_gro_receive never returning GRO_DROP

The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands.

Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Fixes: 6570bc79c0df ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agovxlan: fix last fdb index during dump of fdb with nhid
Roopa Prabhu [Wed, 24 Jun 2020 21:02:36 +0000 (14:02 -0700)]
vxlan: fix last fdb index during dump of fdb with nhid

This patch fixes last saved fdb index in fdb dump handler when
handling fdb's with nhid.

Fixes: 1274e1cc4226 ("vxlan: ecmp support for mac fdb entries")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosctp: Don't advertise IPv4 addresses if ipv6only is set on the socket
Marcelo Ricardo Leitner [Wed, 24 Jun 2020 20:34:18 +0000 (17:34 -0300)]
sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket

If a socket is set ipv6only, it will still send IPv4 addresses in the
INIT and INIT_ACK packets. This potentially misleads the peer into using
them, which then would cause association termination.

The fix is to not add IPv4 addresses to ipv6only sockets.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotc-testing: avoid action cookies with odd length.
Briana Oursler [Wed, 24 Jun 2020 19:29:14 +0000 (12:29 -0700)]
tc-testing: avoid action cookies with odd length.

Update odd length cookie hexstrings in csum.json, tunnel_key.json and
bpf.json to be even length to comply with check enforced in commit
0149dabf2a1b ("tc: m_actions: check cookie hexstring len") in iproute2.
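
For illustration, an even-length check on a cookie hexstring might look
like the sketch below. The exact validation done by iproute2 is not
shown in this commit, so the helper name and the extra rules (reject
empty strings and non-hex characters) are assumptions.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: every byte needs two hex digits, so the length must be even. */
static bool cookie_hex_ok(const char *hex)
{
	size_t len = strlen(hex);
	size_t i;

	if (len == 0 || len % 2 != 0)
		return false;

	for (i = 0; i < len; i++) {
		if (!isxdigit((unsigned char)hex[i]))
			return false;
	}

	return true;
}
```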

Signed-off-by: Briana Oursler <briana.oursler@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'tcp_cubic-fix-spurious-HYSTART_DELAY-on-RTT-decrease'
David S. Miller [Thu, 25 Jun 2020 23:08:47 +0000 (16:08 -0700)]
Merge branch 'tcp_cubic-fix-spurious-HYSTART_DELAY-on-RTT-decrease'

Neal Cardwell says:

====================
tcp_cubic: fix spurious HYSTART_DELAY on RTT decrease

This series fixes a long-standing bug in the TCP CUBIC
HYSTART_DELAY mechanism recently reported by Mirja Kuehlewind. The
code can cause a spurious exit of slow start in some particular
cases: upon an RTT decrease that happens on the 9th or later ACK
in a round trip. This series fixes the original Hystart code and
also the recent BPF implementation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
Neal Cardwell [Wed, 24 Jun 2020 16:42:03 +0000 (12:42 -0400)]
bpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT

Apply the fix from:
 "tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT"
to the BPF implementation of TCP CUBIC congestion control.

Repeating the commit description here for completeness:

Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where the
Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an
ACK when the minimum rtt of a connection goes down. Inspection of the
existing code makes clear that this could happen in an example
like the following:

o The first 8 RTT samples in a round trip are 150ms, resulting in a
  curr_rtt of 150ms and a delay_min of 150ms.

o The 9th RTT sample is 100ms. The curr_rtt does not change after the
  first 8 samples, so curr_rtt remains 150ms. But delay_min can be
  lowered at any time, so delay_min falls to 100ms. The code executes
  the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min
  of 100ms, and the curr_rtt is declared far enough above delay_min to
  force a (spurious) exit of Slow Start.

The fix here is simple: allow every RTT sample in a round trip to
lower the curr_rtt.

Fixes: 6de4a9c430b5 ("bpf: tcp: Add bpf_cubic example")
Reported-by: Mirja Kuehlewind <mirja.kuehlewind@ericsson.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago  tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
Neal Cardwell [Wed, 24 Jun 2020 16:42:02 +0000 (12:42 -0400)]
tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT

Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where the
Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an
ACK when the minimum rtt of a connection goes down. From inspection of
the existing code it is clear that this could happen in an example
like the following:

o The first 8 RTT samples in a round trip are 150ms, resulting in a
  curr_rtt of 150ms and a delay_min of 150ms.

o The 9th RTT sample is 100ms. The curr_rtt does not change after the
  first 8 samples, so curr_rtt remains 150ms. But delay_min can be
  lowered at any time, so delay_min falls to 100ms. The code executes
  the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min
  of 100ms, and the curr_rtt is declared far enough above delay_min to
  force a (spurious) exit of Slow Start.

The fix here is simple: allow every RTT sample in a round trip to
lower the curr_rtt.
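
The scenario above can be reproduced with a small user-space model. This is a simplified sketch, not the kernel code: RTTs are in milliseconds, the value 0 stands for "not yet set", the function name hystart_exit_ack and the `fixed` switch are hypothetical, and the threshold (delay_min / 8, clamped to 4..16 ms) is an assumption about how HYSTART_DELAY scales.

```c
#include <stddef.h>

#define MIN_SAMPLES 8U	/* samples collected before the delay check runs */

/* Assumed threshold: delay_min / 8, clamped to the range 4..16 ms. */
static unsigned int delay_thresh(unsigned int delay_min)
{
	unsigned int t = delay_min / 8;

	if (t < 4)
		return 4;
	if (t > 16)
		return 16;
	return t;
}

/* Feed the RTT samples of one round trip into the model. Returns the
 * 1-based number of the ACK on which slow start is exited, or -1 if it
 * never is. With fixed == 0 only the first MIN_SAMPLES samples may lower
 * curr_rtt; with fixed != 0 every sample may lower it.
 */
int hystart_exit_ack(const unsigned int *rtt_ms, size_t n, int fixed)
{
	unsigned int delay_min = 0, curr_rtt = 0, sample_cnt = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned int delay = rtt_ms[i];

		/* delay_min can be lowered at any time. */
		if (delay_min == 0 || delay < delay_min)
			delay_min = delay;

		if (fixed && (curr_rtt == 0 || delay < curr_rtt))
			curr_rtt = delay;

		if (sample_cnt < MIN_SAMPLES) {
			if (!fixed && (curr_rtt == 0 || delay < curr_rtt))
				curr_rtt = delay;
			sample_cnt++;
		} else if (curr_rtt > delay_min + delay_thresh(delay_min)) {
			return (int)i + 1;
		}
	}
	return -1;
}
```

For eight samples of 150 ms followed by one of 100 ms, the unfixed model exits on the 9th ACK (150 > 100 + 12), while the fixed model lowers curr_rtt to 100 ms as well and does not exit.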

Fixes: ae27e98a5152 ("[TCP] CUBIC v2.3")
Reported-by: Mirja Kuehlewind <mirja.kuehlewind@ericsson.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago  Merge branch 'Fixes-for-SJA1105-DSA-tc-gate-action'
David S. Miller [Thu, 25 Jun 2020 23:06:56 +0000 (16:06 -0700)]
Merge branch 'Fixes-for-SJA1105-DSA-tc-gate-action'

Vladimir Oltean says:

====================
Fixes for SJA1105 DSA tc-gate action

This small series fixes 2 bugs in the tc-gate implementation:
1. The TAS state machine keeps getting rescheduled even after removing
   tc-gate actions on all ports.
2. tc-gate actions with only one gate control list entry are installed
   to hardware with an incorrect interval of zero, which makes the
   switch erroneously drop those packets (since the configuration is
   invalid).

To keep the code palatable, a forward-declaration was avoided by moving
some code around in patch 1/4. I hope that isn't too much of an issue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago  net: dsa: sja1105: fix tc-gate schedule with single element
Vladimir Oltean [Wed, 24 Jun 2020 13:54:47 +0000 (16:54 +0300)]
net: dsa: sja1105: fix tc-gate schedule with single element

The sja1105_gating_cfg_time_to_interval function does this, as per the
comments:

/* The gate entries contain absolute times in their e->interval field. Convert
 * that to proper intervals (i.e. "0, 5, 10, 15" to "5, 5, 5, 5").
 */

To perform that task, it iterates over gating_cfg->entries, at each step
updating the interval of the _previous_ entry. So one interval remains
to be updated at the end of the loop: the last one (since it isn't
"prev" for anyone else).

But there was an erroneous check, that the last element's interval
should not be updated if it's also the only element. I'm not quite sure
why that check was there, but it's clearly incorrect, as a tc-gate
schedule with a single element would get an e->interval of zero,
regardless of the duration requested by the user. The switch wouldn't
even consider this configuration as valid: it will just drop all traffic
that matches the rule.
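
A minimal array-based sketch of the conversion, with the last entry always updated, might look like the following. The driver walks a linked list rather than an array, and the function name gate_times_to_intervals is hypothetical. The sketch also assumes that the last interval runs until the end of the cycle, so a cycle_time parameter is introduced for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert absolute gate times (e.g. 0, 5, 10, 15) into intervals
 * (5, 5, 5, 5) in place. Each step rewrites the previous entry, so the
 * last entry is handled after the loop. It is updated unconditionally,
 * including when it is the only entry.
 */
void gate_times_to_intervals(uint64_t *t, size_t n, uint64_t cycle_time)
{
	if (n == 0)
		return;

	for (size_t i = 1; i < n; i++)
		t[i - 1] = t[i] - t[i - 1];

	/* Assumption: the last interval lasts until the cycle ends. */
	t[n - 1] = cycle_time - t[n - 1];
}
```

With a cycle of 20, the entries 0, 5, 10, 15 become 5, 5, 5, 5, and a schedule with the single entry 0 gets an interval of 20 instead of being left at zero.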

Fixes: 834f8933d5dd ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago  net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules
Vladimir Oltean [Wed, 24 Jun 2020 13:54:46 +0000 (16:54 +0300)]
net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules

Currently, tas_data->enabled would remain true even after deleting all
tc-gate rules from the switch ports, which would cause the
sja1105_tas_state_machine to get unnecessarily scheduled.

Also, if there were any errors which would prevent the hardware from
enabling the gating schedule, the sja1105_tas_state_machine would
continuously detect and print that, spamming the kernel log, even if the
rules were subsequently deleted.

The rules themselves are _not_ active, because sja1105_init_scheduling
does enough of a job to not install the gating schedule in the static
config. But the virtual link rules themselves are still present.

So call the functions that remove the tc-gate configuration from
priv->tas_data.gating_cfg, so that tas_data->enabled can be set to
false, and sja1105_tas_state_machine will stop being scheduled.

Fixes: 834f8933d5dd ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago  net: dsa: sja1105: unconditionally free old gating config
Vladimir Oltean [Wed, 24 Jun 2020 13:54:45 +0000 (16:54 +0300)]
net: dsa: sja1105: unconditionally free old gating config

Currently sja1105_compose_gating_subschedule is not prepared to be
called for the case where we want to recompute the global tc-gate
configuration after we've deleted those actions on a port.

After deleting the tc-gate actions on the last port, max_cycle_time
would become zero, and that would incorrectly prevent
sja1105_free_gating_config from getting called.

So move the freeing function above the check for the need to apply a new
configuration.
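
The reordering can be pictured with a reduced user-space sketch. The struct gating_cfg, free_gating_config and recompute_gating below are hypothetical stand-ins for the driver's data structures, and the single dummy entry only marks that a new configuration was built.

```c
#include <stddef.h>
#include <stdlib.h>

struct gating_cfg {
	unsigned int *entries;
	size_t num_entries;
};

static void free_gating_config(struct gating_cfg *cfg)
{
	free(cfg->entries);
	cfg->entries = NULL;
	cfg->num_entries = 0;
}

/* Returns 0 on success, -1 on allocation failure. */
int recompute_gating(struct gating_cfg *cfg, unsigned long max_cycle_time)
{
	/* Always drop the old configuration first ... */
	free_gating_config(cfg);

	/* ... and only then decide whether a new one is needed. With the
	 * check placed first, a zero max_cycle_time would return early
	 * and leave the stale entries in place.
	 */
	if (max_cycle_time == 0)
		return 0;

	cfg->entries = calloc(1, sizeof(*cfg->entries));
	if (!cfg->entries)
		return -1;
	cfg->num_entries = 1;
	return 0;
}
```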

Fixes: 834f8933d5dd ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago  net: dsa: sja1105: move sja1105_compose_gating_subschedule at the top
Vladimir Oltean [Wed, 24 Jun 2020 13:54:44 +0000 (16:54 +0300)]
net: dsa: sja1105: move sja1105_compose_gating_subschedule at the top

It turns out that sja1105_compose_gating_subschedule must also be called
from sja1105_vl_delete, to recalculate the overall tc-gate
configuration. Currently this is not possible without introducing a
forward declaration. So move the function to the top of the file, along
with its dependencies.
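
For readers less familiar with the C constraint involved, here is a generic sketch of the two layouts. All function names are hypothetical and unrelated to the driver.

```c
/* Layout A: the callee is defined below its caller, so the compiler
 * needs a forward declaration before the first call.
 */
static int recompute_a(int rules_left);

int delete_rule_a(int rules)
{
	return recompute_a(rules - 1);
}

static int recompute_a(int rules_left)
{
	return rules_left > 0;	/* 1 if a schedule is still needed */
}

/* Layout B: the callee is moved above its caller, so no forward
 * declaration is needed. This is the approach the patch takes.
 */
static int recompute_b(int rules_left)
{
	return rules_left > 0;
}

int delete_rule_b(int rules)
{
	return recompute_b(rules - 1);
}
```

Both layouts behave identically; the difference is only in how the file is organized.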

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago  Merge branch 'RGMII-Internal-delay-common-property'
David S. Miller [Thu, 25 Jun 2020 23:05:21 +0000 (16:05 -0700)]
Merge branch 'RGMII-Internal-delay-common-property'

Dan Murphy says:

====================
RGMII Internal delay common property

The RGMII internal delay is a common setting found in most RGMII-capable PHY
devices.  It was found that many vendor-specific device tree properties exist
to perform the same function. This series creates a common property to be used
for PHYs that have internal delays for the Rx and Tx paths.

If the internal delay is tunable then the caller needs to pass the internal
delay array and the return will be the index in the array that was found in
the firmware node.
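
The tunable case amounts to a table lookup. The sketch below is an illustration only and not the kernel helper: the name delay_to_index is hypothetical, the table is assumed to be sorted in ascending order, and rounding to the nearest entry (lower entry on a tie) is an assumption that the text above does not specify.

```c
#include <stddef.h>
#include <stdlib.h>

/* Map a delay read from the firmware node to an index into the PHY's
 * table of supported delays. Returns -1 if the table is empty or the
 * delay lies outside the table's range.
 */
int delay_to_index(const int *delay_values, size_t size, int delay)
{
	size_t best = 0;

	if (size == 0 || delay < delay_values[0] ||
	    delay > delay_values[size - 1])
		return -1;

	/* Pick the entry closest to the requested delay. */
	for (size_t i = 1; i < size; i++)
		if (abs(delay - delay_values[i]) <
		    abs(delay - delay_values[best]))
			best = i;

	return (int)best;
}
```

With a table of 250, 500, 750 and 1000, a requested delay of 750 maps to index 2, a delay of 600 maps to index 1, and a delay of 2000 is rejected.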

If the internal delay is fixed then the caller only needs to indicate which
delay to return.  There is no need for a fixed delay to add device properties
since the value is not configurable. Per the ethernet-controller.yaml the
interface type indicates that the PHY should provide the delay.

This series contains examples of both a configurable delay and a fixed delay.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago  net: phy: DP83822: Add setting the fixed internal delay
Dan Murphy [Wed, 24 Jun 2020 12:16:05 +0000 (07:16 -0500)]
net: phy: DP83822: Add setting the fixed internal delay

The DP83822 can be configured to use the RGMII interface. There are
independent fixed 3.5ns clock shifts (aka internal delays) for the TX and RX
paths. This allows either one to be set if the MII interface is RGMII and
the value is set in the firmware node.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago  net: dp83869: Add RGMII internal delay configuration
Dan Murphy [Wed, 24 Jun 2020 12:16:04 +0000 (07:16 -0500)]
net: dp83869: Add RGMII internal delay configuration

Add RGMII internal delay configuration for Rx and Tx.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago  dt-bindings: net: Add RGMII internal delay for DP83869
Dan Murphy [Wed, 24 Jun 2020 12:16:03 +0000 (07:16 -0500)]
dt-bindings: net: Add RGMII internal delay for DP83869

Add the internal delay values into the header and update the binding
with the internal delay properties.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>