platform/kernel/linux-rpi.git
3 years agoMerge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
David S. Miller [Tue, 16 Mar 2021 22:04:30 +0000 (15:04 -0700)]
Merge branch '1GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2021-03-15

This series contains updates to e1000e only.

Chen Yu says:

The NIC is put in runtime suspend status when there is no cable connected.
As a result, it is safe to keep non-wakeup NIC in runtime suspended during
s2ram because the system does not rely on the NIC plug event nor WoL to
wake up the system. Besides that, unlike the s2idle, s2ram does not need to
manipulate S0ix settings during suspend.
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipv4: route.c: simplify procfs code
Yejune Deng [Tue, 16 Mar 2021 02:57:36 +0000 (10:57 +0800)]
net: ipv4: route.c: simplify procfs code

proc_creat_seq() that directly take a struct seq_operations,
and deal with network namespaces in ->open.

Signed-off-by: Yejune Deng <yejune.deng@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'bridge-m,cast-cleanups'
David S. Miller [Tue, 16 Mar 2021 18:57:57 +0000 (11:57 -0700)]
Merge branch 'bridge-m,cast-cleanups'

Nikolay Aleksandrov says:

====================
net: bridge: mcast: simplify allow/block EHT code

The set does two minor cleanups of the EHT allow/block handling code:
patch 01 removes code which is unreachable (it was used in initial EHT
versions, but isn't anymore) and prepares the allow/block functions to be
factored out. Patch 02 factors out common allow/block handling code.
There are no functional changes.

v2: send patch 02 and the proper version of both patches
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: mcast: factor out common allow/block EHT handling
Nikolay Aleksandrov [Mon, 15 Mar 2021 17:13:42 +0000 (19:13 +0200)]
net: bridge: mcast: factor out common allow/block EHT handling

We hande EHT state change for ALLOW messages in INCLUDE mode and for
BLOCK messages in EXCLUDE mode similarly - create the new set entries
with the proper filter mode. We also handle EHT state change for ALLOW
messages in EXCLUDE mode and for BLOCK messages in INCLUDE mode in a
similar way - delete the common entries (current set and new set).
Factor out all the common code as follows:
 - ALLOW/INCLUDE, BLOCK/EXCLUDE: call __eht_create_set_entries()
 - ALLOW/EXCLUDE, BLOCK/INCLUDE: call __eht_del_common_set_entries()

The set entries creation can be reused in __eht_inc_exc() as well.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: mcast: remove unreachable EHT code
Nikolay Aleksandrov [Mon, 15 Mar 2021 17:13:41 +0000 (19:13 +0200)]
net: bridge: mcast: remove unreachable EHT code

In the initial EHT versions there were common functions which handled
allow/block messages for both INCLUDE and EXCLUDE modes, but later they
were separated. It seems I've left some common code which cannot be
reached because the filter mode is checked before calling the respective
functions, i.e. the host filter is always in EXCLUDE mode when using
__eht_allow_excl() and __eht_block_excl() thus we can drop the host_excl
checks inside and simplify the code a bit.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: mt7530: support MDB and bridge flag operations
DENG Qingfang [Mon, 15 Mar 2021 17:09:40 +0000 (01:09 +0800)]
net: dsa: mt7530: support MDB and bridge flag operations

Support port MDB and bridge flag operations.

As the hardware can manage multicast forwarding itself, offload_fwd_mark
can be unconditionally set to true.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'bcm6368'
David S. Miller [Tue, 16 Mar 2021 18:52:18 +0000 (11:52 -0700)]
Merge branch 'bcm6368'

Álvaro Fernández Rojas says:

====================
net: mdio: Add BCM6368 MDIO mux bus controller

This controller is present on BCM6318, BCM6328, BCM6362, BCM6368 and BCM63268
SoCs.

v2: add changes suggested by Andrew Lunn and Jakub Kicinski.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mdio: Add BCM6368 MDIO mux bus controller
Álvaro Fernández Rojas [Mon, 15 Mar 2021 15:45:28 +0000 (16:45 +0100)]
net: mdio: Add BCM6368 MDIO mux bus controller

This controller is present on BCM6318, BCM6328, BCM6362, BCM6368 and BCM63268
SoCs.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: Add bcm6368-mdio-mux bindings
Álvaro Fernández Rojas [Mon, 15 Mar 2021 15:45:27 +0000 (16:45 +0100)]
dt-bindings: net: Add bcm6368-mdio-mux bindings

Add documentations for bcm6368 mdio mux driver.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: lapbether: Prevent racing when checking whether the netif is running
Xie He [Thu, 11 Mar 2021 07:23:09 +0000 (23:23 -0800)]
net: lapbether: Prevent racing when checking whether the netif is running

There are two "netif_running" checks in this driver. One is in
"lapbeth_xmit" and the other is in "lapbeth_rcv". They serve to make
sure that the LAPB APIs called in these functions are called before
"lapb_unregister" is called by the "ndo_stop" function.

However, these "netif_running" checks are unreliable, because it's
possible that immediately after "netif_running" returns true, "ndo_stop"
is called (which causes "lapb_unregister" to be called).

This patch adds locking to make sure "lapbeth_xmit" and "lapbeth_rcv" can
reliably check and ensure the netif is running while doing their work.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Acked-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ipa-qmi-fixes'
David S. Miller [Tue, 16 Mar 2021 18:17:59 +0000 (11:17 -0700)]
Merge branch 'ipa-qmi-fixes'

Alex Elder says:

====================
net: ipa: QMI fixes

Mani Sadhasivam discovered some errors in the definitions of some
QMI messages used for IPA.  This series addresses those errors,
and extends the definition of one message type to include some
newly-defined fields.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: extend the INDICATION_REGISTER request
Alex Elder [Mon, 15 Mar 2021 15:21:12 +0000 (10:21 -0500)]
net: ipa: extend the INDICATION_REGISTER request

The specified format of the INDICATION_REGISTER QMI request message
has been extended to support two more optional fields:
  endpoint_desc_ind:
    sender wishes to receive endpoint descriptor information via
    an IPA ENDP_DESC indication QMI message
  bw_change_ind:
    sender wishes to receive bandwidth change information via
    an IPA BW_CHANGE indication QMI message

Add definitions that permit these fields to be formatted and parsed
by the QMI library code.

Signed-off-by: Alex Elder <elder@linaro.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: fix another QMI message definition
Alex Elder [Mon, 15 Mar 2021 15:21:11 +0000 (10:21 -0500)]
net: ipa: fix another QMI message definition

The ipa_init_modem_driver_req_ei[] encoding array for the
INIT_MODEM_DRIVER request message has some errors in it.

First, the tlv_type associated with the hw_stats_quota_size field is
wrong; it duplicates the valiue used for the hw_stats_quota_base_addr
field (0x1f) and should use 0x20 instead.  The tlv_type value for
the hw_stats_drop_size field also uses the same duplicate value; it
should use 0x22 instead.

Second, there is no definition for the hw_stats_drop_base_addr
field.  It is an optional 32-bit enumerated type value.

Finally, the hw_stats_quota_base_addr, hw_stats_quota_size, and
hw_stats_drop_size fields are defined as enumerated types; they
should be unsigned 4-byte values.

Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: fix a duplicated tlv_type value
Alex Elder [Mon, 15 Mar 2021 15:21:10 +0000 (10:21 -0500)]
net: ipa: fix a duplicated tlv_type value

In the ipa_indication_register_req_ei[] encoding array, the tlv_type
associated with the ipa_mhi_ready_ind field is wrong.  It duplicates
the value used for the data_usage_quota_reached field (0x11) and
should use value 0x12 instead.  Fix this bug.

Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: sja1105: fix error return code in sja1105_cls_flower_add()
Wei Yongjun [Mon, 15 Mar 2021 14:43:23 +0000 (14:43 +0000)]
net: dsa: sja1105: fix error return code in sja1105_cls_flower_add()

The return value 'rc' maybe overwrite to 0 in the flow_action_for_each
loop, the error code from the offload not support error handling will
not set. This commit fix it to return -EOPNOTSUPP.

Fixes: 6a56e19902af ("flow_offload: reject configuration of packet-per-second policing in offload drivers")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: b53: spi: allow device tree probing
Álvaro Fernández Rojas [Mon, 15 Mar 2021 14:14:23 +0000 (15:14 +0100)]
net: dsa: b53: spi: allow device tree probing

Add missing of_match_table to allow device tree probing.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mdio: Alphabetically sort header inclusion
Calvin Johnson [Mon, 15 Mar 2021 10:49:05 +0000 (16:19 +0530)]
net: mdio: Alphabetically sort header inclusion

Alphabetically sort header inclusion

Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ionic-tx-updates'
David S. Miller [Tue, 16 Mar 2021 04:27:06 +0000 (21:27 -0700)]
Merge branch 'ionic-tx-updates'

Shannon Nelson says:

====================
ionic Tx updates

Just as the Rx path recently got a face lift, it is time for the Tx path to
get some attention.  The original TSO-to-descriptor mapping was ugly and
convoluted and needed some deep work.  This series pulls the dma mapping
out of the descriptor frag mapping loop and makes the dma mapping more
generic for use in the non-TSO case.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: aggregate Tx byte counting calls
Shannon Nelson [Tue, 16 Mar 2021 02:31:36 +0000 (19:31 -0700)]
ionic: aggregate Tx byte counting calls

Gather the Tx packet and byte counts and call
netdev_tx_completed_queue() only once per clean cycle.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: simplify tx clean
Shannon Nelson [Tue, 16 Mar 2021 02:31:35 +0000 (19:31 -0700)]
ionic: simplify tx clean

The descriptor mappings are set up the same way whether
or not it is a TSO, so we don't need separate logic for
the two cases.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: generic tx skb mapping
Shannon Nelson [Tue, 16 Mar 2021 02:31:34 +0000 (19:31 -0700)]
ionic: generic tx skb mapping

Make the new ionic_tx_map_tso() usable by the non-TSO paths,
and pull the call up a level into ionic_tx() before calling
the csum or no-csum routines.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: simplify TSO descriptor mapping
Shannon Nelson [Tue, 16 Mar 2021 02:31:33 +0000 (19:31 -0700)]
ionic: simplify TSO descriptor mapping

One issue with the original TSO code was that it was working too
hard to deal with skb layouts that were never going to show up,
such as an skb->data that was longer than a single descriptor's
length.  The other issue was trying to arrange the fragment dma
mapping at the same time as figuring out the descriptors needed.
There was just too much going on at the same time.

Now we do the dma mapping first, which sets up the buffers with
skb->data in buf[0] and the remaining frags in buf[1..n-1].
Next we spread the bufs across the descriptors needed, where
each descriptor gets up to mss number of bytes.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'net-qualcomm-rmnet-stop-using-C-bit-fields'
David S. Miller [Tue, 16 Mar 2021 03:41:59 +0000 (20:41 -0700)]
Merge branch 'net-qualcomm-rmnet-stop-using-C-bit-fields'

Alex Elder says:

====================
net: qualcomm: rmnet: stop using C bit-fields

Version 6 is the same as version 5, but has been rebased on updated
net-next/master.  With any luck, the patches I'm sending out this
time won't contain garbage.

Version 5 of this series responds to a suggestion made by Alexander
Duyck, to determine the offset to the checksummed range of a packet
using skb_network_header_len() on patch 2.  I have added his
Reviewed-by tag to all (other) patches, and removed Bjorn's from
patch 2.

The change required some updates to the subsequent patches, and I
reordered some assignments in a minor way in the last patch.

I don't expect any more discussion on this series (but will respond
if there is any).  So at this point I would really appreciate it
if KS and/or Sean would offer a review, or at least acknowledge it.
I presume you two are able to independently test the code as well,
so I request that, and hope you are willing to do so.

Version 4 of this series is here:
  https://lore.kernel.org/netdev/20210315133455.1576188-1-elder@linaro.org
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qualcomm: rmnet: don't use C bit-fields in rmnet checksum header
Alex Elder [Mon, 15 Mar 2021 21:51:51 +0000 (16:51 -0500)]
net: qualcomm: rmnet: don't use C bit-fields in rmnet checksum header

Replace the use of C bit-fields in the rmnet_map_ul_csum_header
structure with a single two-byte (big endian) structure member,
and use masks to encode or get values within it.  The content of
these fields can be accessed using simple bitwise AND and OR
operations on the (host byte order) value of the new structure
member.

Previously rmnet_map_ipv4_ul_csum_header() would update C bit-field
values in host byte order, then forcibly fix their byte order using
a combination of byte swap operations and types.

Instead, just compute the value that needs to go into the new
structure member and save it with a simple byte-order conversion.

Make similar simplifications in rmnet_map_ipv6_ul_csum_header().

Finally, in rmnet_map_checksum_uplink_packet() a set of assignments
zeroes every field in the upload checksum header.  Replace that with
a single memset() operation.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qualcomm: rmnet: don't use C bit-fields in rmnet checksum trailer
Alex Elder [Mon, 15 Mar 2021 21:51:50 +0000 (16:51 -0500)]
net: qualcomm: rmnet: don't use C bit-fields in rmnet checksum trailer

Replace the use of C bit-fields in the rmnet_map_dl_csum_trailer
structure with a single one-byte field, using constant field masks
to encode or get at embedded values.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qualcomm: rmnet: use masks instead of C bit-fields
Alex Elder [Mon, 15 Mar 2021 21:51:49 +0000 (16:51 -0500)]
net: qualcomm: rmnet: use masks instead of C bit-fields

The actual layout of bits defined in C bit-fields (e.g. int foo : 3)
is implementation-defined.  Structures defined in <linux/if_rmnet.h>
address this by specifying all bit-fields twice, to cover two
possible layouts.

I think this pattern is repetitive and noisy, and I find the whole
notion of compiler "bitfield endianness" to be non-intuitive.

Stop using C bit-fields for the command/data flag and the pad length
fields in the rmnet_map structure, and define a single-byte flags
field instead.  Define a mask for the single-bit "command" flag,
and another mask for the encoded pad length.  The content of both
fields can be accessed using a simple bitwise AND operation.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qualcomm: rmnet: kill RMNET_MAP_GET_*() accessor macros
Alex Elder [Mon, 15 Mar 2021 21:51:48 +0000 (16:51 -0500)]
net: qualcomm: rmnet: kill RMNET_MAP_GET_*() accessor macros

The following macros, defined in "rmnet_map.h", assume a socket
buffer is provided as an argument without any real indication this
is the case.
    RMNET_MAP_GET_MUX_ID()
    RMNET_MAP_GET_CD_BIT()
    RMNET_MAP_GET_PAD()
    RMNET_MAP_GET_CMD_START()
    RMNET_MAP_GET_LENGTH()
What they hide is pretty trivial accessing of fields in a structure,
and it's much clearer to see this if we do these accesses directly.

So rather than using these accessor macros, assign a local
variable of the map header pointer type to the socket buffer data
pointer, and derereference that pointer variable.

In "rmnet_map_data.c", use sizeof(object) rather than sizeof(type)
in one spot.  Also, there's no need to byte swap 0; it's all zeros
irrespective of endianness.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qualcomm: rmnet: simplify some byte order logic
Alex Elder [Mon, 15 Mar 2021 21:51:47 +0000 (16:51 -0500)]
net: qualcomm: rmnet: simplify some byte order logic

In rmnet_map_ipv4_ul_csum_header() and rmnet_map_ipv6_ul_csum_header()
the offset within a packet at which checksumming should commence is
calculated.  This calculation involves byte swapping and a forced type
conversion that makes it hard to understand.

Simplify this by computing the offset in host byte order, then
converting the result when assigning it into the header field.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qualcomm: rmnet: mark trailer field endianness
Alex Elder [Mon, 15 Mar 2021 21:51:46 +0000 (16:51 -0500)]
net: qualcomm: rmnet: mark trailer field endianness

The fields in the checksum trailer structure used for QMAP protocol
RX packets are all big-endian format, so define them that way.

It turns out these fields are never actually used by the RMNet code.
The start offset is always assumed to be zero, and the length is
taken from the other packet headers.  So making these fields
explicitly big endian has no effect on the behavior of the code.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoe1000e: Remove the runtime suspend restriction on CNP+
Chen Yu [Tue, 1 Dec 2020 01:22:09 +0000 (09:22 +0800)]
e1000e: Remove the runtime suspend restriction on CNP+

Although there is platform issue of runtime suspend support
on CNP, it would be more flexible to let the user decide whether
to disable runtime or not because:
1. This can be done in userspace via
   echo on > /sys/devices/pci0000\:00/0000\:00\:1f.d/power/control
2. More and more NICs would support runtime suspend, disabling the
   runtime suspend on them by default would impact the validation.

Only disable runtime suspend on CNP in case of any user space regression.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoe1000e: Leverage direct_complete to speed up s2ram
Chen Yu [Tue, 1 Dec 2020 01:21:50 +0000 (09:21 +0800)]
e1000e: Leverage direct_complete to speed up s2ram

The NIC is put in runtime suspend status when there is no cable connected.
As a result, it is safe to keep non-wakeup NIC in runtime suspended during
s2ram because the system does not rely on the NIC plug event nor WoL to wake
up the system. Besides that, unlike the s2idle, s2ram does not need to
manipulate S0ix settings during suspend.

This patch introduces the .prepare() for e1000e so that if the NIC is runtime
suspended the subsequent suspend/resume hooks will be skipped so as to speed
up the s2ram. The pm core will check whether the NIC is a wake up device so
there's no need to check it again in .prepare(). DPM_FLAG_SMART_PREPARE flag
should be set during probe to ask the pci subsystem to honor the driver's
prepare() result. Besides, the NIC remains runtime suspended after resumed
from s2ram as there is no need to resume it.

Tested on i7-2600K with 82579V NIC
Before the patch:
e1000e 0000:00:19.0: pci_pm_suspend+0x0/0x160 returned 0 after 225146 usecs
e1000e 0000:00:19.0: pci_pm_resume+0x0/0x90 returned 0 after 140588 usecs

After the patch:
echo disabled > //sys/devices/pci0000\:00/0000\:00\:19.0/power/wakeup
becomes 0 usecs because the hooks will be skipped.

Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agonet: ipa: make ipa_table_hash_support() inline
Alex Elder [Mon, 15 Mar 2021 15:01:18 +0000 (10:01 -0500)]
net: ipa: make ipa_table_hash_support() inline

In review, Alexander Duyck suggested that ipa_table_hash_support()
was trivial enough that it could be implemented as a static inline
function in the header file.  But the patch had already been
accepted.  Implement his suggestion.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: add Marvell 88X2222 transceiver support
Ivan Bornyakov [Mon, 15 Mar 2021 14:19:26 +0000 (17:19 +0300)]
net: phy: add Marvell 88X2222 transceiver support

Add basic support for the Marvell 88X2222 multi-speed ethernet
transceiver.

This PHY provides data transmission over fiber-optic as well as Twinax
copper links. The 88X2222 supports 2 ports of 10GBase-R and 1000Base-X
on the line-side interface. The host-side interface supports 4 ports of
10GBase-R, RXAUI, 1000Base-X and 2 ports of XAUI.

This driver, however, supports only XAUI on the host-side and
1000Base-X/10GBase-R on the line-side, for now. The SGMII is also
supported over 1000Base-X. Interrupts are not supported.

Internal registers access compliant with the Clause 45 specification.

Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'stmmac-clocks'
David S. Miller [Mon, 15 Mar 2021 21:46:34 +0000 (14:46 -0700)]
Merge branch 'stmmac-clocks'

Joakim Zhang says:

====================
net: stmmac: implement clocks management

This patch set tries to implement clocks management, and takes i.MX platform as an example.

---
ChangeLogs:
V1->V2:
* change to pm runtime mechanism.
* rename function: _enable() -> _config()
* take MDIO bus into account, it needs clocks when interface
is closed.
* reverse Christmass tree.
V2->V3:
* slightly simple the code according to Andrew's suggesstion
and also add tag: Reviewed-by: Andrew Lunn <andrew@lunn.ch>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: dwmac-imx: add platform level clocks management for i.MX
Joakim Zhang [Mon, 15 Mar 2021 12:16:48 +0000 (20:16 +0800)]
net: stmmac: dwmac-imx: add platform level clocks management for i.MX

Split clocks settings from init callback into clks_config callback,
which could support platform level clocks management.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: add platform level clocks management
Joakim Zhang [Mon, 15 Mar 2021 12:16:47 +0000 (20:16 +0800)]
net: stmmac: add platform level clocks management

This patch intends to add platform level clocks management. Some
platforms may have their own special clocks, they also need to be
managed dynamically. If you want to manage such clocks, please implement
clks_config callback.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: add clocks management for gmac driver
Joakim Zhang [Mon, 15 Mar 2021 12:16:46 +0000 (20:16 +0800)]
net: stmmac: add clocks management for gmac driver

This patch intends to add clocks management for stmmac driver:

If CONFIG_PM enabled:
1. Keep clocks disabled after driver probed.
2. Enable clocks when up the net device, and disable clocks when down
the net device.

If CONFIG_PM disabled:
Keep clocks always enabled after driver probed.

Note:
1. It is fine for ethtool, since the way of implementing ethtool_ops::begin
in stmmac is only can be accessed when interface is enabled, so the clocks
are ticked.
2. The MDIO bus has a different life cycle to the MAC, need ensure
clocks are enabled when _mdio_read/write() need clocks, because these
functions can be called while the interface it not opened.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'net-pcs-stmmac=add-C37-AN-SGMII-support'
David S. Miller [Mon, 15 Mar 2021 19:53:12 +0000 (12:53 -0700)]
Merge branch 'net-pcs-stmmac=add-C37-AN-SGMII-support'

Ong Boon Leong says:

====================
net: pcs, stmmac: add C37 AN SGMII support

This patch series adds MAC-side SGMII support to stmmac driver and it is
changed as follow:-

1/6: Refactor the current C73 implementation in pcs-xpcs to prepare for
     adding C37 AN later.
2/6: Add MAC-side SGMII C37 AN support to pcs-xpcs
3,4/6: make phylink_parse_mode() to work for non-DT platform so that
       we can use stmmac platform_data to set it.
5/6: Make stmmac_open() to only skip PHY init if C73 is used, otherwise
     C37 AN will need phydev to be connected to phylink.
6/6: Finally, add pcs-xpcs SGMII interface support to Intel mGbE
     controller.

The patch series have been tested on EHL CRB PCH TSN (eth2) controller
that has Marvell 88E1512 PHY attached over SGMII interface and the
iterative tests of speed change (AN) + ping test have been successful.

[63446.009295] intel-eth-pci 0000:00:1e.4 eth2: Link is Down
[63449.986365] intel-eth-pci 0000:00:1e.4 eth2: Link is Up - 1Gbps/Full - flow control off
[63449.987625] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[63451.248064] intel-eth-pci 0000:00:1e.4 eth2: Link is Down
[63454.082366] intel-eth-pci 0000:00:1e.4 eth2: Link is Up - 100Mbps/Full - flow control off
[63454.083650] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[63456.465179] intel-eth-pci 0000:00:1e.4 eth2: Link is Down
[63459.202367] intel-eth-pci 0000:00:1e.4 eth2: Link is Up - 10Mbps/Full - flow control off
[63459.203639] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[63460.882832] intel-eth-pci 0000:00:1e.4 eth2: Link is Down
[63464.322366] intel-eth-pci 0000:00:1e.4 eth2: Link is Up - 1Gbps/Full - flow control off
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agostmmac: intel: add pcs-xpcs for Intel mGbE controller
Ong Boon Leong [Mon, 15 Mar 2021 05:27:11 +0000 (13:27 +0800)]
stmmac: intel: add pcs-xpcs for Intel mGbE controller

Intel mGbE controller such as those in EHL & TGL uses pcs-xpcs driver for
SGMII interface. To ensure mdio bus scanning does not assign phy_device
to MDIO-addressable entities like intel serdes and pcs-xpcs, we set up
to phy_mask to skip them.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: ensure phydev is attached to phylink for C37 AN
Ong Boon Leong [Mon, 15 Mar 2021 05:27:10 +0000 (13:27 +0800)]
net: stmmac: ensure phydev is attached to phylink for C37 AN

As the support for MAC-side SGMII C37 AN is added to pcs-xpcs, phydev
should be attached to phylink during driver's open(). So, we change the
condition to "Not C73 AN" instead.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: make in-band AN mode parsing is supported for non-DT
Ong Boon Leong [Mon, 15 Mar 2021 05:27:09 +0000 (13:27 +0800)]
net: stmmac: make in-band AN mode parsing is supported for non-DT

Not all platform uses DT, so phylink_parse_mode() will skip in-band setup
of pl->supported and pl->link_config.advertising entirely. So, we add the
setting of ovr_an_inband flag to make it works for non-DT platform.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: make phylink_parse_mode() support non-DT platform
Ong Boon Leong [Mon, 15 Mar 2021 05:27:08 +0000 (13:27 +0800)]
net: phylink: make phylink_parse_mode() support non-DT platform

Certain platform does not support DT, so we make phylink_parse_mode() to
allow non-DT platform to use it to setup in-band AN advertising.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: pcs: add C37 SGMII AN support for intel mGbE controller
Ong Boon Leong [Mon, 15 Mar 2021 05:27:07 +0000 (13:27 +0800)]
net: pcs: add C37 SGMII AN support for intel mGbE controller

XPCS IP supports C37 SGMII AN process and it is used in intel multi-GbE
controller as MAC-side SGMII.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: pcs: rearrange C73 functions to prepare for C37 support later
Ong Boon Leong [Mon, 15 Mar 2021 05:27:06 +0000 (13:27 +0800)]
net: pcs: rearrange C73 functions to prepare for C37 support later

The current implementation for XPCS is validated for C73, so we rename them
to have _c73 suffix and introduce a set of functions to use an_mode flag
to switch between C73 and C37 AN later.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: neterion: Fix a typo in the file s2io.c
Bhaskar Chowdhury [Mon, 15 Mar 2021 01:53:22 +0000 (07:23 +0530)]
net: ethernet: neterion: Fix a typo in the file s2io.c

s/structue/structure/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: intel: igb: Typo fix in the file igb_main.c
Bhaskar Chowdhury [Mon, 15 Mar 2021 01:48:47 +0000 (07:18 +0530)]
net: ethernet: intel: igb: Typo fix in the file igb_main.c

s/structue/structure/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoethernet: amazon: ena: A typo fix in the file ena_com.h
Bhaskar Chowdhury [Sun, 14 Mar 2021 22:22:21 +0000 (03:52 +0530)]
ethernet: amazon: ena: A typo fix in the file ena_com.h

Mundane typo fix.

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoatm: delete include/linux/atm_suni.h
Alexey Dobriyan [Sun, 14 Mar 2021 15:24:02 +0000 (18:24 +0300)]
atm: delete include/linux/atm_suni.h

This file has been effectively empty since 2.3.99-pre3 !

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobonding: Added -ENODEV interpret for slaves option
Jianlin Lv [Sun, 14 Mar 2021 14:56:29 +0000 (22:56 +0800)]
bonding: Added -ENODEV interpret for slaves option

When the incorrect interface name is stored in the slaves/active_slave
option of the bonding sysfs, the kernel does not record the log that
interface does not exist.

This patch adds a log for -ENODEV error, which will facilitate users to
figure out such issue.

Signed-off-by: Jianlin Lv <Jianlin.Lv@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: export dev_set_threaded symbol
Lorenzo Bianconi [Sun, 14 Mar 2021 14:49:19 +0000 (15:49 +0100)]
net: export dev_set_threaded symbol

For wireless devices (e.g. mt76 driver) multiple net_devices belongs to
the same wireless phy and the napi object is registered in a dummy
netdevice related to the wireless phy.
Export dev_set_threaded in order to be reused in device drivers enabling
threaded NAPI.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: hellcreek: Offload bridge port flags
Kurt Kanzenbach [Sun, 14 Mar 2021 12:52:08 +0000 (13:52 +0100)]
net: dsa: hellcreek: Offload bridge port flags

The switch implements unicast and multicast filtering per port.
Add support for it. By default filtering is disabled.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'psample-Add-additional-metadata-attributes'
David S. Miller [Sun, 14 Mar 2021 22:00:44 +0000 (15:00 -0700)]
Merge branch 'psample-Add-additional-metadata-attributes'

Ido Schimmel says:

====================
psample: Add additional metadata attributes

This series extends the psample module to expose additional metadata to
user space for packets sampled via act_sample. The new metadata (e.g.,
transit delay) can then be consumed by applications such as hsflowd [1]
for better network observability.

netdevsim is extended with a dummy psample implementation that
periodically reports "sampled" packets to the psample module. In
addition to testing of the psample module, it enables the development
and demonstration of user space applications (e.g., hsflowd) that are
interested in the new metadata even without access to specialized
hardware (e.g., Spectrum ASIC) that can provide it.

mlxsw is also extended to provide the new metadata to psample.

A Wireshark dissector for psample netlink packets [2] will be submitted
upstream after the kernel patches are accepted. In addition, a libpcap
capture module for psample is currently in the works. Eventually, users
should be able to run:

 # tshark -i psample

In order to consume sampled packets along with their metadata.

Series overview:

Patch #1 makes it easier to extend the metadata provided to psample

Patch #2 adds the new metadata attributes to psample

Patch #3 extends netdevsim to periodically report "sampled" packets to
psample. Various debugfs knobs are added to control the reporting

Patch #4 adds a selftest over netdevsim

Patches #5-#10 gradually add support for the new metadata in mlxsw

Patch #11 adds a selftest over mlxsw

[1] https://sflow.org/draft4_sflow_transit.txt
[2] https://gitlab.com/amitcohen1/wireshark/-/commit/3d711143024e032aef1b056dd23f0266c54fab56
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests: mlxsw: Add tc sample tests
Ido Schimmel [Sun, 14 Mar 2021 12:19:40 +0000 (14:19 +0200)]
selftests: mlxsw: Add tc sample tests

Test that packets are sampled when tc-sample is used and that reported
metadata is correct. Two sets of hosts (with and without LAG) are used,
since metadata extraction in mlxsw is a bit different when LAG is
involved.

 # ./tc_sample.sh
 TEST: tc sample rate (forward)                                      [ OK ]
 TEST: tc sample rate (local receive)                                [ OK ]
 TEST: tc sample maximum rate                                        [ OK ]
 TEST: tc sample group conflict test                                 [ OK ]
 TEST: tc sample iif                                                 [ OK ]
 TEST: tc sample lag iif                                             [ OK ]
 TEST: tc sample oif                                                 [ OK ]
 TEST: tc sample lag oif                                             [ OK ]
 TEST: tc sample out-tc                                              [ OK ]
 TEST: tc sample out-tc-occ                                          [ OK ]

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: spectrum: Report extra metadata to psample module
Ido Schimmel [Sun, 14 Mar 2021 12:19:39 +0000 (14:19 +0200)]
mlxsw: spectrum: Report extra metadata to psample module

Make use of the previously added metadata and report it to the psample
module. The metadata is read from the skb's control block, which was
initialized by the bus driver (i.e., 'mlxsw_pci') after decoding the
packet's Completion Queue Element (CQE).

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: spectrum: Remove mlxsw_sp_sample_receive()
Ido Schimmel [Sun, 14 Mar 2021 12:19:38 +0000 (14:19 +0200)]
mlxsw: spectrum: Remove mlxsw_sp_sample_receive()

The function resolves the psample sampling group from the Rx port
because this is the only form of sampling the driver currently supports.
Subsequent patches are going to add support for Tx-based and
policy-based sampling, in which case the sampling group would not be
resolved from the Rx port.

Therefore, move this code to the Rx-specific sampling listener.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: spectrum: Remove unnecessary RCU read-side critical section
Ido Schimmel [Sun, 14 Mar 2021 12:19:37 +0000 (14:19 +0200)]
mlxsw: spectrum: Remove unnecessary RCU read-side critical section

Since commit 7d8e8f3433dc ("mlxsw: core: Increase scope of RCU read-side
critical section"), all Rx handlers are called from an RCU read-side
critical section.

Remove the unnecessary rcu_read_lock() / rcu_read_unlock().

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: pci: Set extra metadata in skb control block
Ido Schimmel [Sun, 14 Mar 2021 12:19:36 +0000 (14:19 +0200)]
mlxsw: pci: Set extra metadata in skb control block

Packets that are mirrored / sampled to the CPU have extra metadata
encoded in their corresponding Completion Queue Element (CQE). Retrieve
this metadata from the CQE and set it in the skb control block so that
it could be accessed by the switch driver (i.e., 'mlxsw_spectrum').

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: Create dedicated field for Rx metadata in skb control block
Ido Schimmel [Sun, 14 Mar 2021 12:19:35 +0000 (14:19 +0200)]
mlxsw: Create dedicated field for Rx metadata in skb control block

Next patch will need to encode more Rx metadata in the skb control
block, so create a dedicated field for it and move the cookie index
there.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: pci: Add more metadata fields to CQEv2
Ido Schimmel [Sun, 14 Mar 2021 12:19:34 +0000 (14:19 +0200)]
mlxsw: pci: Add more metadata fields to CQEv2

The Completion Queue Element version 2 (CQEv2) includes various metadata
fields for packets that are mirrored / sampled to the CPU.

Add these fields so that they could be used by a later patch.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests: netdevsim: Test psample functionality
Ido Schimmel [Sun, 14 Mar 2021 12:19:33 +0000 (14:19 +0200)]
selftests: netdevsim: Test psample functionality

Test various aspects of psample functionality over netdevsim and in
particular test that the psample module correctly reports the provided
metadata.

Example:

 # ./psample.sh
 TEST: psample enable / disable                                      [ OK ]
 TEST: psample group number                                          [ OK ]
 TEST: psample metadata                                              [ OK ]

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonetdevsim: Add dummy psample implementation
Ido Schimmel [Sun, 14 Mar 2021 12:19:32 +0000 (14:19 +0200)]
netdevsim: Add dummy psample implementation

Allow netdevsim to report "sampled" packets to the psample module by
periodically generating packets from a work queue. The behavior can be
enabled / disabled (default) and the various meta data attributes can be
controlled via debugfs knobs.

This implementation enables both testing of the psample module with all
the optional attributes as well as development of user space
applications on top of psample such as hsflowd and a Wireshark dissector
for psample generic netlink packets.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agopsample: Add additional metadata attributes
Ido Schimmel [Sun, 14 Mar 2021 12:19:31 +0000 (14:19 +0200)]
psample: Add additional metadata attributes

Extend psample to report the following attributes when available:

* Output traffic class as a 16-bit value
* Output traffic class occupancy in bytes as a 64-bit value
* End-to-end latency of the packet in nanoseconds resolution
* Software timestamp in nanoseconds resolution (always available)
* Packet's protocol. Needed for packet dissection in user space (always
  available)

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agopsample: Encapsulate packet metadata in a struct
Ido Schimmel [Sun, 14 Mar 2021 12:19:30 +0000 (14:19 +0200)]
psample: Encapsulate packet metadata in a struct

Currently, callers of psample_sample_packet() pass three metadata
attributes: Ingress port, egress port and truncated size. Subsequent
patches are going to add more attributes (e.g., egress queue occupancy),
which also need an indication whether they are valid or not.

Encapsulate packet metadata in a struct in order to keep the number of
arguments reasonable.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'skbuff-micro-optimize-flow-dissection'
David S. Miller [Sun, 14 Mar 2021 21:48:26 +0000 (14:48 -0700)]
Merge branch 'skbuff-micro-optimize-flow-dissection'

Alexander Lobakin says:

====================
skbuff: micro-optimize flow dissection

This little number makes all of the flow dissection functions take
raw input data pointer as const (1-5) and shuffles the branches in
__skb_header_pointer() according to their hit probability.

The result is +20 Mbps per flow/core with one Flow Dissector pass
per packet. This affects RPS (with software hashing), drivers that
use eth_get_headlen() on their Rx path and so on.

From v2 [1]:
 - reword some commit messages as a potential fix for NIPA;
 - no functional changes.

From v1 [0]:
 - rebase on top of the latest net-next. This was super-weird, but
   I double-checked that the series applies with no conflicts, and
   then on Patchwork it didn't;
 - no other changes.

[0] https://lore.kernel.org/netdev/20210312194538.337504-1-alobakin@pm.me
[1] https://lore.kernel.org/netdev/20210313113645.5949-1-alobakin@pm.me
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoskbuff: micro-optimize {,__}skb_header_pointer()
Alexander Lobakin [Sun, 14 Mar 2021 11:11:50 +0000 (11:11 +0000)]
skbuff: micro-optimize {,__}skb_header_pointer()

{,__}skb_header_pointer() helpers exist mainly for preventing
accesses-beyond-end of the linear data.
In the vast majorify of cases, they bail out on the first condition.
All code going after is mostly a fallback.
Mark the most common branch as 'likely' one to move it in-line.
Also, skb_copy_bits() can return negative values only when the input
arguments are invalid, e.g. offset is greater than skb->len. It can
be safely marked as 'unlikely' branch, assuming that hotpath code
provides sane input to not fail here.

These two bump the throughput with a single Flow Dissector pass on
every packet (e.g. with RPS or driver that uses eth_get_headlen())
on 20 Mbps per flow/core.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoethernet: constify eth_get_headlen()'s data argument
Alexander Lobakin [Sun, 14 Mar 2021 11:11:41 +0000 (11:11 +0000)]
ethernet: constify eth_get_headlen()'s data argument

It's used only for flow dissection, which now takes constant data
pointers.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agolinux/etherdevice.h: misc trailing whitespace cleanup
Alexander Lobakin [Sun, 14 Mar 2021 11:11:32 +0000 (11:11 +0000)]
linux/etherdevice.h: misc trailing whitespace cleanup

Caught by the text editor. Fix it separately from the actual changes.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoflow_dissector: constify raw input data argument
Alexander Lobakin [Sun, 14 Mar 2021 11:11:23 +0000 (11:11 +0000)]
flow_dissector: constify raw input data argument

Flow Dissector code never modifies the input buffer, neither skb nor
raw data.
Make 'data' argument const for all of the Flow dissector's functions.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoskbuff: make __skb_header_pointer()'s data argument const
Alexander Lobakin [Sun, 14 Mar 2021 11:11:14 +0000 (11:11 +0000)]
skbuff: make __skb_header_pointer()'s data argument const

The function never modifies the input buffer, so 'data' argument
can be marked as const.
This implies one harmless cast-away.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoflow_dissector: constify bpf_flow_dissector's data pointers
Alexander Lobakin [Sun, 14 Mar 2021 11:11:00 +0000 (11:11 +0000)]
flow_dissector: constify bpf_flow_dissector's data pointers

BPF Flow dissection programs are read-only and don't touch input
buffers.
Mark 'data' and 'data_end' in struct bpf_flow_dissector as const
in preparation for global input constifying.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'gro-micro-optimize-dev_gro_receive'
David S. Miller [Sun, 14 Mar 2021 21:41:09 +0000 (14:41 -0700)]
Merge branch 'gro-micro-optimize-dev_gro_receive'

Alexander Lobakin says:

====================
gro: micro-optimize dev_gro_receive()

This random series addresses some of suboptimal constructions used
in the main GRO entry point.
The main body is gro_list_prepare() simplification and pointer usage
optimization in dev_gro_receive() itself. Being mostly cosmetic, it
gives like +10 Mbps on my setup to both TCP and UDP (both single- and
multi-flow).

Since v1 [0]:
 - drop the replacement of bucket index calculation with
   reciprocal_scale() since it makes absolutely no sense (Eric);
 - improve stack usage in dev_gro_receive() (Eric);
 - reverse the order of patches to avoid changes superseding.

[0] https://lore.kernel.org/netdev/20210312162127.239795-1-alobakin@pm.me
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agogro: give 'hash' variable in dev_gro_receive() a less confusing name
Alexander Lobakin [Sat, 13 Mar 2021 20:30:14 +0000 (20:30 +0000)]
gro: give 'hash' variable in dev_gro_receive() a less confusing name

'hash' stores not the flow hash, but the index of the GRO bucket
corresponding to it.
Change its name to 'bucket' to avoid confusion while reading lines
like '__set_bit(hash, &napi->gro_bitmask)'.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agogro: consistentify napi->gro_hash[x] access in dev_gro_receive()
Alexander Lobakin [Sat, 13 Mar 2021 20:30:10 +0000 (20:30 +0000)]
gro: consistentify napi->gro_hash[x] access in dev_gro_receive()

GRO bucket index doesn't change through the entire function.
Store a pointer to the corresponding bucket instead of its member
and use it consistently through the function.
It is performance-safe since &gro_list->list == gro_list.

Misc: remove superfluous braces around single-line branches.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agogro: simplify gro_list_prepare()
Alexander Lobakin [Sat, 13 Mar 2021 20:30:05 +0000 (20:30 +0000)]
gro: simplify gro_list_prepare()

gro_list_prepare() always returns &napi->gro_hash[bucket].list,
without any variations. Moreover, it uses 'napi' argument only to
have access to this list, and calculates the bucket index for the
second time (firstly it happens at the beginning of
dev_gro_receive()) to do that.
Given that dev_gro_receive() already has an index to the needed
list, just pass it as the first argument to eliminate redundant
calculations, and make gro_list_prepare() return void.
Also, both arguments of gro_list_prepare() can be constified since
this function can only modify the skbs from the bucket list.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: bcm_sf2: Fill in BCM4908 CFP entries
Florian Fainelli [Fri, 12 Mar 2021 21:11:01 +0000 (13:11 -0800)]
net: dsa: bcm_sf2: Fill in BCM4908 CFP entries

The BCM4908 switch has 256 CFP entrie, update that setting so CFP can be
used.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agohv_netvsc: Add a comment clarifying batching logic
Shachar Raindel [Fri, 12 Mar 2021 23:45:27 +0000 (15:45 -0800)]
hv_netvsc: Add a comment clarifying batching logic

The batching logic in netvsc_send is non-trivial, due to
a combination of the Linux API and the underlying hypervisor
interface. Add a comment explaining why the code is written this
way.

Signed-off-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'pktgen-scripts-improvements'
David S. Miller [Sun, 14 Mar 2021 21:22:38 +0000 (14:22 -0700)]
Merge branch 'pktgen-scripts-improvements'

Igor Russkikh says:

====================
pktgen: scripts improvements

Please consider small improvements to pktgen scripts we use in our environment.

Adding delay parameter through command line,
Adding new -a (append) parameter to make flex runs

v3: change us to ns in docs
v2: Review comments from Jesper

CC: Jesper Dangaard Brouer <brouer@redhat.com>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosamples: pktgen: new append mode
Igor Russkikh [Thu, 11 Mar 2021 10:32:53 +0000 (11:32 +0100)]
samples: pktgen: new append mode

To configure various complex flows we for sure can create custom
pktgen init scripts, but sometimes thats not that easy.

New "-a" (append) option in all the existing sample scripts allows
to append more "devices" into pktgen threads.

The most straightforward usecases for that are:
- using multiple devices. We have to generate full linerate on
all physical functions (ports) of our multiport device.
- pushing multiple flows (with different packet options)

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosamples: pktgen: allow to specify delay parameter via new opt
Igor Russkikh [Thu, 11 Mar 2021 10:32:52 +0000 (11:32 +0100)]
samples: pktgen: allow to specify delay parameter via new opt

DELAY may now be explicitly specified via common parameter -w

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodocs: net: add missing devlink health cmd - trigger
Jakub Kicinski [Sat, 13 Mar 2021 00:30:26 +0000 (16:30 -0800)]
docs: net: add missing devlink health cmd - trigger

Documentation is missing and it's not very clear what
this callback is for - presumably testing the recovery?

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodocs: net: tweak devlink health documentation
Jakub Kicinski [Sat, 13 Mar 2021 00:30:25 +0000 (16:30 -0800)]
docs: net: tweak devlink health documentation

Minor tweaks and improvement of wording about the diagnose callback.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: Set FIFO sizes for ipq806x
Jonathan McDowell [Sat, 13 Mar 2021 13:18:26 +0000 (13:18 +0000)]
net: stmmac: Set FIFO sizes for ipq806x

Commit eaf4fac47807 ("net: stmmac: Do not accept invalid MTU values")
started using the TX FIFO size to verify what counts as a valid MTU
request for the stmmac driver.  This is unset for the ipq806x variant.
Looking at older patches for this it seems the RX + TXs buffers can be
up to 8k, so set appropriately.

(I sent this as an RFC patch in June last year, but received no replies.
I've been running with this on my hardware (a MikroTik RB3011) since
then with larger MTUs to support both the internal qca8k switch and
VLANs with no problems. Without the patch it's impossible to set the
larger MTU required to support this.)

Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodrivers: net: vxlan.c: Fix declaration issue
Sanjana Srinidhi [Sat, 13 Mar 2021 08:36:49 +0000 (14:06 +0530)]
drivers: net: vxlan.c: Fix declaration issue

Added a blank line after structure declaration.
This is done to maintain code uniformity.

Signed-off-by: Sanjana Srinidhi <sanjanasrinidhi1810@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: marvell: Fixed typo in the file sky2.c
Bhaskar Chowdhury [Sat, 13 Mar 2021 05:45:36 +0000 (11:15 +0530)]
net: ethernet: marvell: Fixed typo in the file sky2.c

s/calclation/calculation/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'dsa-hewllcreek-dumps'
David S. Miller [Sat, 13 Mar 2021 22:30:48 +0000 (14:30 -0800)]
Merge branch 'dsa-hewllcreek-dumps'

Kurt Kanzenbach says:

====================
net: dsa: hellcreek: Add support for dumping tables

add support for dumping the VLAN and FDB table via devlink. As the driver uses
internal VLANs and static FDB entries, this is a useful debugging feature.

Changes since v1:

 * Drop memory reporting as there are better APIs to expose this
 * Move comment to VLAN patch

Previous versions:

 * https://lkml.kernel.org/netdev/20210311175344.3084-1-kurt@kmk-computers.de/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: hellcreek: Add devlink FDB region
Kurt Kanzenbach [Sat, 13 Mar 2021 09:39:39 +0000 (10:39 +0100)]
net: dsa: hellcreek: Add devlink FDB region

Allow to dump the FDB table via devlink. This is a useful debugging feature.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: hellcreek: Move common code to helper
Kurt Kanzenbach [Sat, 13 Mar 2021 09:39:38 +0000 (10:39 +0100)]
net: dsa: hellcreek: Move common code to helper

There are two functions which need to populate fdb entries. Move that to a
helper function.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: hellcreek: Use boolean value
Kurt Kanzenbach [Sat, 13 Mar 2021 09:39:37 +0000 (10:39 +0100)]
net: dsa: hellcreek: Use boolean value

hellcreek_select_vlan() takes a boolean instead of an integer.
So, use false accordingly.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: hellcreek: Add devlink VLAN region
Kurt Kanzenbach [Sat, 13 Mar 2021 09:39:36 +0000 (10:39 +0100)]
net: dsa: hellcreek: Add devlink VLAN region

Allow to dump the VLAN table via devlink. This especially useful, because the
driver internally leverages VLANs for the port separation. These are not visible
via the bridge utility.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'batadv-next-pullrequest-20210312' of git://git.open-mesh.org/linux-merge
David S. Miller [Sat, 13 Mar 2021 22:27:56 +0000 (14:27 -0800)]
Merge tag 'batadv-next-pullrequest-20210312' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
There is only a single patch this time:

 - Use netif_rx_any_context(), by Sebastian Andrzej Siewior
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'pps-policing'
David S. Miller [Sat, 13 Mar 2021 22:18:10 +0000 (14:18 -0800)]
Merge branch 'pps-policing'

Simon Horman says:

====================
net/sched: act_police: add support for packet-per-second policing

This series enhances the TC policer action implementation to allow a
policer action instance to enforce a rate-limit based on
packets-per-second, configurable using a packet-per-second rate and burst
parameters.

In the hope of aiding review this is broken up into three patches.

* [PATCH 1/3] flow_offload: add support for packet-per-second policing

  Add support for this feature to the flow_offload API that is used to allow
  programming flows, including TC rules and their actions, into hardware.

* [PATCH 2/3] flow_offload: reject configuration of packet-per-second policing in offload drivers

  Teach all exiting users of the flow_offload API that allow offload of
  policer action instances to reject offload if packet-per-second rate
  limiting is configured: none support it at this time

* [PATCH 3/3] net/sched: act_police: add support for packet-per-second policing

  With the above ground-work in place add the new feature to the TC policer
  action itself

With the above in place the feature may be used.

As follow-ups we plan to provide:
* Corresponding updates to iproute2
* Corresponding self tests (which depend on the iproute2 changes)
* Hardware offload support for the NFP driver

Key changes since v2:
* Added patches 1 and 2, which makes adding patch 3 safe for existing
  hardware offload of the policer action
* Re-worked patch 3 so that a TC policer action instance may be configured
  for packet-per-second or byte-per-second rate limiting, but not both.
* Corrected kdoc usage
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/sched: act_police: add support for packet-per-second policing
Baowen Zheng [Fri, 12 Mar 2021 14:08:31 +0000 (15:08 +0100)]
net/sched: act_police: add support for packet-per-second policing

Allow a policer action to enforce a rate-limit based on packets-per-second,
configurable using a packet-per-second rate and burst parameters.

e.g.
tc filter add dev tap1 parent ffff: u32 match \
        u32 0 0 police pkts_rate 3000 pkts_burst 1000

Testing was unable to uncover a performance impact of this change on
existing features.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoflow_offload: reject configuration of packet-per-second policing in offload drivers
Baowen Zheng [Fri, 12 Mar 2021 14:08:30 +0000 (15:08 +0100)]
flow_offload: reject configuration of packet-per-second policing in offload drivers

A follow-up patch will allow users to configures packet-per-second policing
in the software datapath. In preparation for this, teach all drivers that
support offload of the policer action to reject such configuration as
currently none of them support it.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoflow_offload: add support for packet-per-second policing
Xingfeng Hu [Fri, 12 Mar 2021 14:08:29 +0000 (15:08 +0100)]
flow_offload: add support for packet-per-second policing

Allow flow_offload API to configure packet-per-second policing using rate
and burst parameters.

Dummy implementations of tcf_police_rate_pkt_ps() and
tcf_police_burst_pkt() are supplied which return 0, the unconfigured state.
This is to facilitate splitting the offload, driver, and TC code portion of
this feature into separate patches with the aim of providing a logical flow
for review. And the implementation of these helpers will be filled out by a
follow-up patch.

Signed-off-by: Xingfeng Hu <xingfeng.hu@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'hns3-imp-phys'
David S. Miller [Sat, 13 Mar 2021 22:11:29 +0000 (14:11 -0800)]
Merge branch 'hns3-imp-phys'

Huazhong Tan says:

====================
net: hns3: support imp-controlled PHYs

This series adds support for imp-controlled PHYs in the HNS3
ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: add phy loopback support for imp-controlled PHYs
Guangbin Huang [Fri, 12 Mar 2021 08:50:16 +0000 (16:50 +0800)]
net: hns3: add phy loopback support for imp-controlled PHYs

If the imp-controlled PHYs feature is enabled, driver can not
call phy driver interface to set loopback anymore and needs
to send command to firmware to start phy loopback.

Driver reuses the existing firmware command 0x0315 to start
phy loopback, just add a setting bit in this command. As this
command is not only for serdes loopback anymore, rename this
command to "xxx_COMMON_LOOPBACK", and modify function name,
macro name and logs related to it.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: add ioctl support for imp-controlled PHYs
Guangbin Huang [Fri, 12 Mar 2021 08:50:15 +0000 (16:50 +0800)]
net: hns3: add ioctl support for imp-controlled PHYs

When the imp-controlled PHYs feature is enabled, driver will not
register mdio bus. In order to support ioctl ops for phy tool to
read or write phy register in this case, the firmware implement
a new command for driver and driver implement ioctl by using this
new command.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: add get/set pause parameters support for imp-controlled PHYs
Guangbin Huang [Fri, 12 Mar 2021 08:50:14 +0000 (16:50 +0800)]
net: hns3: add get/set pause parameters support for imp-controlled PHYs

When the imp-controlled PHYs feature is enabled, phydev is NULL.
In this case, the autoneg is always off when user uses ethtool -a
command to get pause parameters because  hclge_get_pauseparam()
uses phydev to check whether device is TP port. To fit this new
feature, use media type to check whether device is TP port.

And when user set pause parameters, these parameters need to
always set to mac, no matter whether autoneg is off.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: add support for imp-controlled PHYs
Guangbin Huang [Fri, 12 Mar 2021 08:50:13 +0000 (16:50 +0800)]
net: hns3: add support for imp-controlled PHYs

IMP(Intelligent Management Processor) firmware add a new feature
to take control of PHYs for some new devices, PF driver adds
support for this feature.

Driver queries device's capability to check whether IMP supports
this feature, it will tell IMP to enable this feature by firmware
compatible command if it is supported.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'sh_eth-reg-defs'
David S. Miller [Sat, 13 Mar 2021 01:50:42 +0000 (17:50 -0800)]
Merge branch 'sh_eth-reg-defs'

Sergey Shtylyov says:

====================
sh_eth: Improve the register/bit definitions in the Ether driver

Here are 4 patches against DaveM's 'net-next' repo. Mainly I'm renaming the register *enum*
tags/entries to match the SoC manuals,and also moving the RX-TX descriptor *enum*s closer to
the corresponding *struct*s...
====================

Signed-off-by: David S. Miller <davem@davemloft.net>