platform/kernel/linux-starfive.git
4 years agovhost/vsock: split packets to send using multiple buffers
Stefano Garzarella [Tue, 30 Jul 2019 15:43:33 +0000 (17:43 +0200)]
vhost/vsock: split packets to send using multiple buffers

If the packets to sent to the guest are bigger than the buffer
available, we can split them, using multiple buffers and fixing
the length in the packet header.
This is safe since virtio-vsock supports only stream sockets.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agovsock/virtio: fix locking in virtio_transport_inc_tx_pkt()
Stefano Garzarella [Tue, 30 Jul 2019 15:43:32 +0000 (17:43 +0200)]
vsock/virtio: fix locking in virtio_transport_inc_tx_pkt()

fwd_cnt and last_fwd_cnt are protected by rx_lock, so we should use
the same spinlock also if we are in the TX path.

Move also buf_alloc under the same lock.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agovsock/virtio: reduce credit update messages
Stefano Garzarella [Tue, 30 Jul 2019 15:43:31 +0000 (17:43 +0200)]
vsock/virtio: reduce credit update messages

In order to reduce the number of credit update messages,
we send them only when the space available seen by the
transmitter is less than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agovsock/virtio: limit the memory used per-socket
Stefano Garzarella [Tue, 30 Jul 2019 15:43:30 +0000 (17:43 +0200)]
vsock/virtio: limit the memory used per-socket

Since virtio-vsock was introduced, the buffers filled by the host
and pushed to the guest using the vring, are directly queued in
a per-socket list. These buffers are preallocated by the guest
with a fixed size (4 KB).

The maximum amount of memory used by each socket should be
controlled by the credit mechanism.
The default credit available per-socket is 256 KB, but if we use
only 1 byte per packet, the guest can queue up to 262144 of 4 KB
buffers, using up to 1 GB of memory per-socket. In addition, the
guest will continue to fill the vring with new 4 KB free buffers
to avoid starvation of other sockets.

This patch mitigates this issue copying the payload of small
packets (< 128 bytes) into the buffer of last packet queued, in
order to avoid wasting memory.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: Remove dev_err() usage after platform_get_irq()
Stephen Boyd [Tue, 30 Jul 2019 18:15:51 +0000 (11:15 -0700)]
net: Remove dev_err() usage after platform_get_irq()

We don't need dev_err() messages when platform_get_irq() fails now that
platform_get_irq() prints an error message itself when something goes
wrong. Let's remove these prints with a simple semantic patch.

// <smpl>
@@
expression ret;
struct platform_device *E;
@@

ret =
(
platform_get_irq(E, ...)
|
platform_get_irq_byname(E, ...)
);

if ( \( ret < 0 \| ret <= 0 \) )
{
(
-if (ret != -EPROBE_DEFER)
-{ ...
-dev_err(...);
-... }
|
...
-dev_err(...);
)
...
}
// </smpl>

While we're here, remove braces on if statements that only have one
statement (manually).

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'Finish-conversion-of-skb_frag_t-to-bio_vec'
David S. Miller [Tue, 30 Jul 2019 21:21:32 +0000 (14:21 -0700)]
Merge branch 'Finish-conversion-of-skb_frag_t-to-bio_vec'

Jonathan Lemon says:

====================
Finish conversion of skb_frag_t to bio_vec

The recent conversion of skb_frag_t to bio_vec did not include
skb_frag's page_offset.  Add accessor functions for this field,
utilize them, and remove the union, restoring the original structure.

v2:
  - rename accessors
  - follow kdoc conventions
====================

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agolinux: Remove bvec page_offset, use bv_offset
Jonathan Lemon [Tue, 30 Jul 2019 14:40:34 +0000 (07:40 -0700)]
linux: Remove bvec page_offset, use bv_offset

Now that page_offset is referenced through accessors, remove
the union, and use bv_offset.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: Use skb_frag_off accessors
Jonathan Lemon [Tue, 30 Jul 2019 14:40:33 +0000 (07:40 -0700)]
net: Use skb_frag_off accessors

Use accessor functions for skb fragment's page_offset instead
of direct references, in preparation for bvec conversion.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agolinux: Add skb_frag_t page_offset accessors
Jonathan Lemon [Tue, 30 Jul 2019 14:40:32 +0000 (07:40 -0700)]
linux: Add skb_frag_t page_offset accessors

Add skb_frag_off(), skb_frag_off_add(), skb_frag_off_set(),
and skb_frag_off_copy() accessors for page_offset.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'sctp-clean-up-sctp_connect-function'
David S. Miller [Tue, 30 Jul 2019 21:18:14 +0000 (14:18 -0700)]
Merge branch 'sctp-clean-up-sctp_connect-function'

Xin Long says:

====================
sctp: clean up __sctp_connect function

This patchset is to factor out some common code for
sctp_sendmsg_new_asoc() and __sctp_connect() into 2
new functioins.

v1->v2:
  - add the patch 1/5 to avoid a slab-out-of-bounds warning.
  - add some code comment for the check change in patch 2/5.
  - remove unused 'addrcnt' as Marcelo noticed in patch 3/5.
====================

Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosctp: factor out sctp_connect_add_peer
Xin Long [Tue, 30 Jul 2019 12:38:23 +0000 (20:38 +0800)]
sctp: factor out sctp_connect_add_peer

In this function factored out from sctp_sendmsg_new_asoc() and
__sctp_connect(), it adds a peer with the other addr into the
asoc after this asoc is created with the 1st addr.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosctp: factor out sctp_connect_new_asoc
Xin Long [Tue, 30 Jul 2019 12:38:22 +0000 (20:38 +0800)]
sctp: factor out sctp_connect_new_asoc

In this function factored out from sctp_sendmsg_new_asoc() and
__sctp_connect(), it creates the asoc and adds a peer with the
1st addr.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosctp: clean up __sctp_connect
Xin Long [Tue, 30 Jul 2019 12:38:21 +0000 (20:38 +0800)]
sctp: clean up __sctp_connect

__sctp_connect is doing quit similar things as sctp_sendmsg_new_asoc.
To factor out common functions, this patch is to clean up their code
to make them look more similar:

  1. create the asoc and add a peer with the 1st addr.
  2. add peers with the other addrs into this asoc one by one.

while at it, also remove the unused 'addrcnt'.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosctp: check addr_size with sa_family_t size in __sctp_setsockopt_connectx
Xin Long [Tue, 30 Jul 2019 12:38:20 +0000 (20:38 +0800)]
sctp: check addr_size with sa_family_t size in __sctp_setsockopt_connectx

Now __sctp_connect() is called by __sctp_setsockopt_connectx() and
sctp_inet_connect(), the latter has done addr_size check with size
of sa_family_t.

In the next patch to clean up __sctp_connect(), we will remove
addr_size check with size of sa_family_t from __sctp_connect()
for the 1st address.

So before doing that, __sctp_setsockopt_connectx() should do
this check first, as sctp_inet_connect() does.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosctp: only copy the available addr data in sctp_transport_init
Xin Long [Tue, 30 Jul 2019 12:38:19 +0000 (20:38 +0800)]
sctp: only copy the available addr data in sctp_transport_init

'addr' passed to sctp_transport_init is not always a whole size
of union sctp_addr, like the path:

  sctp_sendmsg() ->
  sctp_sendmsg_new_asoc() ->
  sctp_assoc_add_peer() ->
  sctp_transport_new() -> sctp_transport_init()

In the next patches, we will also pass the address length of data
only to sctp_assoc_add_peer().

So sctp_transport_init() should copy the only available data from
addr to peer->ipaddr, instead of 'peer->ipaddr = *addr' which may
cause slab-out-of-bounds.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agorxrpc: Fix -Wframe-larger-than= warnings from on-stack crypto
David Howells [Tue, 30 Jul 2019 14:56:57 +0000 (15:56 +0100)]
rxrpc: Fix -Wframe-larger-than= warnings from on-stack crypto

rxkad sometimes triggers a warning about oversized stack frames when
building with clang for a 32-bit architecture:

net/rxrpc/rxkad.c:243:12: error: stack frame size of 1088 bytes in function 'rxkad_secure_packet' [-Werror,-Wframe-larger-than=]
net/rxrpc/rxkad.c:501:12: error: stack frame size of 1088 bytes in function 'rxkad_verify_packet' [-Werror,-Wframe-larger-than=]

The problem is the combination of SYNC_SKCIPHER_REQUEST_ON_STACK() in
rxkad_verify_packet()/rxkad_secure_packet() with the relatively large
scatterlist in rxkad_verify_packet_1()/rxkad_secure_packet_encrypt().

The warning does not show up when using gcc, which does not inline the
functions as aggressively, but the problem is still the same.

Allocate the cipher buffers from the slab instead, caching the allocated
packet crypto request memory used for DATA packet crypto in the rxrpc_call
struct.

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'bnxt_en-TPA-57500'
David S. Miller [Mon, 29 Jul 2019 21:19:09 +0000 (14:19 -0700)]
Merge branch 'bnxt_en-TPA-57500'

Michael Chan says:

====================
bnxt_en: Add TPA (GRO_HW and LRO) on 57500 chips.

This patchset adds TPA v2 support on the 57500 chips.  TPA v2 is
different from the legacy TPA scheme on older chips and requires major
refactoring and restructuring of the existing TPA logic.  The main
difference is that the new TPA v2 has on-the-fly aggregation buffer
completions before a TPA packet is completed.  The larger aggregation
ID space also requires a new ID mapping logic to make it more
memory efficient.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Add PCI IDs for 57500 series NPAR devices.
Michael Chan [Mon, 29 Jul 2019 10:10:33 +0000 (06:10 -0400)]
bnxt_en: Add PCI IDs for 57500 series NPAR devices.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Support all variants of the 5750X chip family.
Michael Chan [Mon, 29 Jul 2019 10:10:32 +0000 (06:10 -0400)]
bnxt_en: Support all variants of the 5750X chip family.

Define the 57508, 57504, and 57502 chip IDs that are all part of the
BNXT_CHIP_P5 family of chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Refactor bnxt_init_one() and turn on TPA support on 57500 chips.
Michael Chan [Mon, 29 Jul 2019 10:10:31 +0000 (06:10 -0400)]
bnxt_en: Refactor bnxt_init_one() and turn on TPA support on 57500 chips.

With the new TPA feature in the 57500 chips, we need to discover the
feature first before setting up the netdev features.  Refactor the
the firmware probe and init logic more cleanly into 2 functions and
and make these calls before setting up the netdev features.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Support TPA counters on 57500 chips.
Michael Chan [Mon, 29 Jul 2019 10:10:30 +0000 (06:10 -0400)]
bnxt_en: Support TPA counters on 57500 chips.

Support the new expanded TPA v2 counters on 57500 B0 chips for
ethtool -S.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Allocate the larger per-ring statistics block for 57500 chips.
Michael Chan [Mon, 29 Jul 2019 10:10:29 +0000 (06:10 -0400)]
bnxt_en: Allocate the larger per-ring statistics block for 57500 chips.

The new TPA implemantation has additional TPA counters that extend the
per-ring statistics block.  Allocate the proper size accordingly.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Refactor ethtool ring statistics logic.
Michael Chan [Mon, 29 Jul 2019 10:10:28 +0000 (06:10 -0400)]
bnxt_en: Refactor ethtool ring statistics logic.

The current code assumes that the per ring statistics counters are
fixed.  In newer chips that support a newer version of TPA, the
TPA counters are also changed.  Refactor the code by defining these
counter names in arrays so that it is easy to add a new array for
a new set of counters supported by the newer chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Add hardware GRO setup function for 57500 chips.
Michael Chan [Mon, 29 Jul 2019 10:10:27 +0000 (06:10 -0400)]
bnxt_en: Add hardware GRO setup function for 57500 chips.

Add a more optimized hardware GRO function to setup the SKB on 57500
chips.  Some workaround code is no longer needed on 57500 chips and
the pseudo checksum is also calculated in hardware, so no need to
do the software pseudo checksum in the driver.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Add TPA ID mapping logic for 57500 chips.
Michael Chan [Mon, 29 Jul 2019 10:10:26 +0000 (06:10 -0400)]
bnxt_en: Add TPA ID mapping logic for 57500 chips.

The new TPA feature on 57500 supports a larger number of concurrent TPAs
(up to 1024) divided among the functions.  We need to add some logic to
map the hardware TPA ID to a software index that keeps track of each TPA
in progress.  A 1:1 direct mapping without translation would be too
wasteful as we would have to allocate 1024 TPA structures for each RX
ring on each PCI function.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Add fast path logic for TPA on 57500 chips.
Michael Chan [Mon, 29 Jul 2019 10:10:25 +0000 (06:10 -0400)]
bnxt_en: Add fast path logic for TPA on 57500 chips.

With all the previous refactoring, the TPA fast path can now be
modified slightly to support TPA on the new chips.  The main
difference is that the agg completions are retrieved differently using
the bnxt_get_tpa_agg_p5() function on the new chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Set TPA GRO mode flags on 57500 chips properly.
Michael Chan [Mon, 29 Jul 2019 10:10:24 +0000 (06:10 -0400)]
bnxt_en: Set TPA GRO mode flags on 57500 chips properly.

On 57500 chips, hardware GRO mode cannot be determined from the TPA
end, so we need to check bp->flags to determine if we are in hardware
GRO mode or not.  Modify bnxt_set_features so that the TPA flags
in bp->flags don't change until the device is closed.  This will ensure
that the fast path can safely rely on bp->flags to determine the
TPA mode.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Refactor tunneled hardware GRO logic.
Michael Chan [Mon, 29 Jul 2019 10:10:23 +0000 (06:10 -0400)]
bnxt_en: Refactor tunneled hardware GRO logic.

The 2 GRO functions to set up the hardware GRO SKB fields for 2
different hardware chips have practically identical logic for
tunneled packets.  Refactor the logic into a separate bnxt_gro_tunnel()
function that can be used by both functions.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Handle standalone RX_AGG completions.
Michael Chan [Mon, 29 Jul 2019 10:10:22 +0000 (06:10 -0400)]
bnxt_en: Handle standalone RX_AGG completions.

On the new 57500 chips, these new RX_AGG completions are not coalesced
at the TPA_END completion.  Handle these by storing them in the
array in the bnxt_tpa_info struct, as they are seen when processing
the CMPL ring.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Expand bnxt_tpa_info struct to support 57500 chips.
Michael Chan [Mon, 29 Jul 2019 10:10:21 +0000 (06:10 -0400)]
bnxt_en: Expand bnxt_tpa_info struct to support 57500 chips.

Add an aggregation array to bnxt_tpa_info struct to keep track of the
aggregation completions.  The aggregation completions are not
completed at the TPA_END completion on 57500 chips so we need to
keep track of them.  The array is only allocated on the new chips
when required.  An agg_count field is also added to keep track of the
number of these completions.

The maximum concurrent TPA is now discovered from firmware instead of
the hardcoded 64.  Add a new bp->max_tpa to keep track of maximum
configured TPA.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Refactor TPA logic.
Michael Chan [Mon, 29 Jul 2019 10:10:20 +0000 (06:10 -0400)]
bnxt_en: Refactor TPA logic.

Refactor the TPA logic slightly, so that the code can be more easily
extended to support TPA on the new 57500 chips.  In particular, the
logic to get the next aggregation completion is refactored into a
new function bnxt_get_agg() so that this operation is made more
generalized.  This operation will be different on the new chip in TPA
mode.  The logic to recycle the aggregation buffers has a new start
index parameter added for the same purpose.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Add TPA structure definitions for BCM57500 chips.
Michael Chan [Mon, 29 Jul 2019 10:10:19 +0000 (06:10 -0400)]
bnxt_en: Add TPA structure definitions for BCM57500 chips.

The new chips have a slightly modified TPA interface for LRO/GRO_HW.
Modify the TPA structures so that the same structures can also be
used on the new chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt_en: Update firmware interface spec. to 1.10.0.89.
Michael Chan [Mon, 29 Jul 2019 10:10:18 +0000 (06:10 -0400)]
bnxt_en: Update firmware interface spec. to 1.10.0.89.

Among the changes are new CoS discard counters and new ctx_hw_stats_ext
struct for the latest 5750X B0 chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agocan: fix ioctl function removal
Oliver Hartkopp [Mon, 29 Jul 2019 20:40:56 +0000 (22:40 +0200)]
can: fix ioctl function removal

Commit 60649d4e0af ("can: remove obsolete empty ioctl() handler") replaced the
almost empty can_ioctl() function with sock_no_ioctl() which always returns
-EOPNOTSUPP.

Even though we don't have any ioctl() functions on socket/network layer we need
to return -ENOIOCTLCMD to be able to forward ioctl commands like SIOCGIFINDEX
to the network driver layer.

This patch fixes the wrong return codes in the CAN network layer protocols.

Reported-by: kernel test robot <rong.a.chen@intel.com>
Fixes: 60649d4e0af ("can: remove obsolete empty ioctl() handler")
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: mv88e6xxx: avoid some redundant vtu load/purge operations
Rasmus Villemoes [Mon, 22 Jul 2019 23:37:26 +0000 (23:37 +0000)]
net: dsa: mv88e6xxx: avoid some redundant vtu load/purge operations

We have an ERPS (Ethernet Ring Protection Switching) setup involving
mv88e6250 switches which we're in the process of switching to a BSP
based on the mainline driver. Breaking any link in the ring works as
expected, with the ring reconfiguring itself quickly and traffic
continuing with almost no noticable drops. However, when plugging back
the cable, we see 5+ second stalls.

This has been tracked down to the userspace application in charge of
the protocol missing a few CCM messages on the good link (the one that
was not unplugged), causing it to broadcast a "signal fail". That
message eventually reaches its link partner, which responds by
blocking the port. Meanwhile, the first node has continued to block
the port with the just plugged-in cable, breaking the network. And the
reason for those missing CCM messages has in turn been tracked down to
the VTU apparently being too busy servicing load/purge operations that
the normal lookups are delayed.

Initial state, the link between C and D is blocked in software.

     _____________________
    /                     \
   |                       |
   A ----- B ----- C *---- D

Unplug the cable between C and D.

     _____________________
    /                     \
   |                       |
   A ----- B ----- C *   * D

Reestablish the link between C and D.
     _____________________
    /                     \
   |                       |
   A ----- B ----- C *---- D

Somehow, enough VTU/ATU operations happen inside C that prevents
the application from receving the CCM messages from B in a timely
manner, so a Signal Fail message is sent by C. When B receives
that, it responds by blocking its port.

     _____________________
    /                     \
   |                       |
   A ----- B *---* C *---- D

Very shortly after this, the signal fail condition clears on the
BC link (some CCM messages finally make it through), so C
unblocks the port. However, a guard timer inside B prevents it
from removing the blocking before 5 seconds have elapsed.

It is not unlikely that our userspace ERPS implementation could be
smarter and/or is simply buggy. However, this patch fixes the symptoms
we see, and is a small optimization that should not break anything
(knock wood). The idea is simply to avoid doing an VTU load of an
entry identical to the one already present. To do that, we need to
know whether mv88e6xxx_vtu_get() actually found an existing entry, or
has just prepared a struct mv88e6xxx_vtu_entry for us to load. To that
end, let vlan->valid be an output parameter. The other two callers of
mv88e6xxx_vtu_get() are not affected by this patch since they pass
new=false.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agor8169: make use of xmit_more
Heiner Kallweit [Sun, 28 Jul 2019 09:25:19 +0000 (11:25 +0200)]
r8169: make use of xmit_more

There was a previous attempt to use xmit_more, but the change had to be
reverted because under load sometimes a transmit timeout occurred [0].
Maybe this was caused by a missing memory barrier, the new attempt
keeps the memory barrier before the call to netif_stop_queue like it
is used by the driver as of today. The new attempt also changes the
order of some calls as suggested by Eric.

[0] https://lkml.org/lkml/2019/2/10/39

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agostaging/octeon: Allow test build on !MIPS
Matthew Wilcox (Oracle) [Fri, 26 Jul 2019 17:44:25 +0000 (10:44 -0700)]
staging/octeon: Allow test build on !MIPS

Add compile test support by moving all includes of files under
asm/octeon into octeon-ethernet.h, and if we're not on MIPS,
stub out all the calls into the octeon support code in octeon-stubs.h

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ag71xx: use resource_size for the ioremap size
Ding Xiang [Mon, 29 Jul 2019 09:01:22 +0000 (17:01 +0800)]
net: ag71xx: use resource_size for the ioremap size

use resource_size to calcuate ioremap size and make
the code simpler.

Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'nfc-next'
David S. Miller [Mon, 29 Jul 2019 15:56:27 +0000 (08:56 -0700)]
Merge branch 'nfc-next'

Andy Shevchenko says:

====================
NFC: nxp-nci: clean up and new device support

Few people reported that some laptops are coming with new ACPI ID for the
devices should be supported by nxp-nci driver.

This series adds new ID (patch 2), cleans up the driver from legacy platform
data and unifies GPIO request for Device Tree and ACPI (patches 3-6), removes
dead or unneeded code (patches 7, 9, 11), constifies ID table (patch 8),
removes comma in terminator line for better maintenance (patch 10) and
rectifies Kconfig entry (patches 12-14).

It also contains a fix for NFC subsystem as suggested by Sedat.

Series has been tested by Sedat.

Changelog v4:
- rebased on top of latest linux-next
- appended cover letter
- elaborated removal of pr_fmt() in the patch 11 (David)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Fix recommendation for NFC_NXP_NCI_I2C Kconfig
Sedat Dilek [Mon, 29 Jul 2019 13:35:14 +0000 (16:35 +0300)]
NFC: nxp-nci: Fix recommendation for NFC_NXP_NCI_I2C Kconfig

This is a simple cleanup to the Kconfig help text as discussed in [1].

[1] https://marc.info/?t=155774435600001&r=1&w=2

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Suggested-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
Signed-off-by: Sedat Dilek <sedat.dilek@credativ.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Clarify on supported chips
Sedat Dilek [Mon, 29 Jul 2019 13:35:13 +0000 (16:35 +0300)]
NFC: nxp-nci: Clarify on supported chips

This patch clarifies on the supported NXP NCI chips and families
and lists PN547 and PN548 separately which are known as NPC100
respectively NPC300.

This helps to find informations and identify drivers on vendor's
support websites.

For details see the discussion in [1] and [2].

[1] https://marc.info/?t=155774435600001&r=1&w=2
[2] https://patchwork.kernel.org/project/linux-wireless/list/?submitter=33142

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Suggested-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
Signed-off-by: Sedat Dilek <sedat.dilek@credativ.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Remove 'default n' for the core
Andy Shevchenko [Mon, 29 Jul 2019 13:35:12 +0000 (16:35 +0300)]
NFC: nxp-nci: Remove 'default n' for the core

It seems contributors follow the style of Kconfig entries where explicit
'default n' is present.  The default 'default' is 'n' already, thus, drop
these lines from Kconfig to make it more clear.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Remove unused macro pr_fmt()
Andy Shevchenko [Mon, 29 Jul 2019 13:35:11 +0000 (16:35 +0300)]
NFC: nxp-nci: Remove unused macro pr_fmt()

The macro had never been used.

The driver uses mostly the nfc_err(), which, with other macros in the family,
is backed by corresponding dev_err(). pr_fmt() is not used for dev_err()
macro. Moreover, there is no need to print the module name which is part of the
device instance name anyway.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Drop comma in terminator lines
Andy Shevchenko [Mon, 29 Jul 2019 13:35:10 +0000 (16:35 +0300)]
NFC: nxp-nci: Drop comma in terminator lines

There is no need to have a comma after terminator entry
in the arrays of IDs.

This may prevent the misguided addition behind the terminator
without compiler notice.

Drop the comma in terminator lines for good.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Drop of_match_ptr() use
Andy Shevchenko [Mon, 29 Jul 2019 13:35:09 +0000 (16:35 +0300)]
NFC: nxp-nci: Drop of_match_ptr() use

There is no need to guard OF device ID table with of_match_ptr().
Otherwise we would get a defined but not used data.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Constify acpi_device_id
Andy Shevchenko [Mon, 29 Jul 2019 13:35:08 +0000 (16:35 +0300)]
NFC: nxp-nci: Constify acpi_device_id

The content of acpi_device_id is not supposed to change at runtime.
All functions working with acpi_device_id provided by <linux/acpi.h>
work with const acpi_device_id. So mark the non-const structs as const.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Get rid of useless label
Andy Shevchenko [Mon, 29 Jul 2019 13:35:07 +0000 (16:35 +0300)]
NFC: nxp-nci: Get rid of useless label

Return directly in ->probe() since there no special cleaning is needed.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Get rid of code duplication in ->probe()
Andy Shevchenko [Mon, 29 Jul 2019 13:35:06 +0000 (16:35 +0300)]
NFC: nxp-nci: Get rid of code duplication in ->probe()

Since OF and ACPI case almost the same get rid of code duplication
by moving gpiod_get() calls directly to ->probe().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Add GPIO ACPI mapping table
Andy Shevchenko [Mon, 29 Jul 2019 13:35:05 +0000 (16:35 +0300)]
NFC: nxp-nci: Add GPIO ACPI mapping table

In order to unify GPIO resource request prepare gpiod_get_index()
to behave correctly when there is no mapping provided by firmware.

Here we add explicit mapping between _CRS GpioIo() resources and
their names used in the driver.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Convert to use GPIO descriptor
Andy Shevchenko [Mon, 29 Jul 2019 13:35:04 +0000 (16:35 +0300)]
NFC: nxp-nci: Convert to use GPIO descriptor

Since we got rid of platform data, the driver may use
GPIO descriptor directly.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Get rid of platform data
Andy Shevchenko [Mon, 29 Jul 2019 13:35:03 +0000 (16:35 +0300)]
NFC: nxp-nci: Get rid of platform data

Legacy platform data must go away. We are on the safe side here since
there are no users of it in the kernel.

If anyone by any odd reason needs it the GPIO lookup tables and
built-in device properties at your service.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: nxp-nci: Add NXP1001 to the ACPI ID table
Andy Shevchenko [Mon, 29 Jul 2019 13:35:02 +0000 (16:35 +0300)]
NFC: nxp-nci: Add NXP1001 to the ACPI ID table

It seems a lot of laptops are equipped with NXP NFC300 chip with
the ACPI ID NXP1001 as per DSDT.

Append it to the driver's ACPI ID table.

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoNFC: fix attrs checks in netlink interface
Andrey Konovalov [Mon, 29 Jul 2019 13:35:01 +0000 (16:35 +0300)]
NFC: fix attrs checks in netlink interface

nfc_genl_deactivate_target() relies on the NFC_ATTR_TARGET_INDEX
attribute being present, but doesn't check whether it is actually
provided by the user. Same goes for nfc_genl_fw_download() and
NFC_ATTR_FIRMWARE_NAME.

This patch adds appropriate checks.

Found with syzkaller.

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'hns3-next'
David S. Miller [Mon, 29 Jul 2019 15:23:41 +0000 (08:23 -0700)]
Merge branch 'hns3-next'

Huazhong Tan says:

====================
net: hns3: some code optimizations & bugfixes & features

This patch-set includes code optimizations, bugfixes and features for
the HNS3 ethernet controller driver.

[patch 1/10] checks reset status before setting channel.

[patch 2/10] adds a NULL pointer checking.

[patch 3/10] removes reset level upgrading when current reset fails.

[patch 4/10] fixes a GFP flags errors when holding spin_lock.

[patch 5/10] modifies firmware version format.

[patch 6/10] adds some print information which is off by default.

[patch 7/10 - 8/10] adds two code optimizations about interrupt handler
and work task.

[patch 9/10] adds support for using order 1 pages with a 4K buffer.

[patch 10/10] modifies messages prints with dev_info() instead of
pr_info().

Change log:
V3->V4: replace netif_info with netif_dbg in [patch 6/10]
V2->V3: fixes comments from Saeed Mahameed and Joe Perches.
V1->V2: fixes comments from Saeed Mahameed and
removes previous [patch 4/11] and [patch 11/11]
which needs further discussion, and adds a new
patch [10/10] suggested by Saeed Mahameed.
====================

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: use dev_info() instead of pr_info()
Huazhong Tan [Mon, 29 Jul 2019 02:53:31 +0000 (10:53 +0800)]
net: hns3: use dev_info() instead of pr_info()

dev_info() is more appropriate for printing messages when driver
initialization done, so switch to dev_info().

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: Add support for using order 1 pages with a 4K buffer
Yunsheng Lin [Mon, 29 Jul 2019 02:53:30 +0000 (10:53 +0800)]
net: hns3: Add support for using order 1 pages with a 4K buffer

Hardware supports 0.5K, 1K, 2K, 4K RX buffer size, the
RX buffer can not be reused because the hns3_page_order
return 0 when page size and RX buffer size are both 4096.

So this patch changes the hns3_page_order to return 1 when
RX buffer is greater than half of the page size and page size
is less the 8192, and dev_alloc_pages has already been used
to allocate the compound page for RX buffer.

This patch also changes hnae3_* to hns3_* for page order
and RX buffer size calculation because they are used in
hns3 module.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: add interrupt affinity support for misc interrupt
Yunsheng Lin [Mon, 29 Jul 2019 02:53:29 +0000 (10:53 +0800)]
net: hns3: add interrupt affinity support for misc interrupt

The misc interrupt is used to schedule the reset and mailbox
subtask, and service_task delayed_work is used to do periodic
management work each second.

This patch sets the above three subtask's affinity using the
misc interrupt' affinity.

Also this patch setups a affinity notify for misc interrupt to
allow user to change the above three subtask's affinity.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: make hclge_service use delayed workqueue
Yunsheng Lin [Mon, 29 Jul 2019 02:53:28 +0000 (10:53 +0800)]
net: hns3: make hclge_service use delayed workqueue

Use delayed work instead of using timers to trigger the
hclge_serive.

Simplify the code with one less middle function and in order
to support misc irq affinity.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: add debug messages to identify eth down cause
Yonglong Liu [Mon, 29 Jul 2019 02:53:27 +0000 (10:53 +0800)]
net: hns3: add debug messages to identify eth down cause

Some times just see the eth interface have been down/up via
dmesg, but can not know why the eth down. So adds some debug
messages to identify the cause for this.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: modify firmware version display format
Yufeng Mo [Mon, 29 Jul 2019 02:53:26 +0000 (10:53 +0800)]
net: hns3: modify firmware version display format

This patch modifies firmware version display format in
hclge(vf)_cmd_init() and hns3_get_drvinfo(). Also, adds
some optimizations for firmware version display format.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: change GFP flag during lock period
Yufeng Mo [Mon, 29 Jul 2019 02:53:25 +0000 (10:53 +0800)]
net: hns3: change GFP flag during lock period

When allocating memory, the GFP_KERNEL cannot be used during the
spin_lock period. This is because it may cause scheduling when holding
spin_lock. This patch changes GFP flag to GFP_ATOMIC in this case.

Fixes: dd74f815dd41 ("net: hns3: Add support for rule add/delete for flow director")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: lipeng 00277521 <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: remove upgrade reset level when reset fail
Huazhong Tan [Mon, 29 Jul 2019 02:53:24 +0000 (10:53 +0800)]
net: hns3: remove upgrade reset level when reset fail

Currently, hclge_reset_err_handle() will assert a global reset
when the failing count is smaller than MAX_RESET_FAIL_CNT, which
will affect other running functions.

So this patch removes this upgrading, and uses re-scheduling reset
task to do it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: add a check for get_reset_level
Guangbin Huang [Mon, 29 Jul 2019 02:53:23 +0000 (10:53 +0800)]
net: hns3: add a check for get_reset_level

For some cases, ops->get_reset_level may not be implemented, so we
should check whether it is NULL before calling get_reset_level.

Signed-off-by: Guangbin Huang <huangguangbin@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: add reset checking before set channels
Jian Shen [Mon, 29 Jul 2019 02:53:22 +0000 (10:53 +0800)]
net: hns3: add reset checking before set channels

hns3_set_channels() should check the resetting status firstly,
since the device will reinitialize when resetting. If the
reset has not completed, the hns3_set_channels() may access
invalid memory.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'mlxsw-spectrum_acl-Forbid-unsupported-filters'
David S. Miller [Sat, 27 Jul 2019 21:32:31 +0000 (14:32 -0700)]
Merge branch 'mlxsw-spectrum_acl-Forbid-unsupported-filters'

Ido Schimmel says:

====================
mlxsw: spectrum_acl: Forbid unsupported filters

Patches #1-#2 make mlxsw reject unsupported egress filters. These
include filters that match on VLAN and filters associated with a
redirect action. Patch #1 rejects such filters when they are configured
on egress and patch #2 rejects such filters when they are configured in
a shared block that user tries to bind to egress.

Patch #3 forbids matching on reserved TCP flags as this is not supported
by the current keys that mlxsw uses.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_flower: Forbid to offload match on reserved TCP flags bits
Jiri Pirko [Sat, 27 Jul 2019 17:32:57 +0000 (20:32 +0300)]
mlxsw: spectrum_flower: Forbid to offload match on reserved TCP flags bits

Matching on reserved TCP flags bits is only supported using custom
parser. Since the usecase for that is not known now, just forbid to
offload rules that match on these bits.

Reported-by: Alex Kushnarov <alexanderk@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_acl: Track rules that forbid egress block bind
Jiri Pirko [Sat, 27 Jul 2019 17:32:56 +0000 (20:32 +0300)]
mlxsw: spectrum_acl: Track rules that forbid egress block bind

Some matches and actions are not supported on egress. Track such rules
and forbid a bind of block which contains them to egress.

With this patch, the kernel tells the user he cannot do that:
$ tc qdisc add dev ens16np1 ingress_block 22 clsact
$ tc filter add block 22 protocol 802.1q pref 2 handle 101 flower vlan_id 100 skip_sw action pass
$ tc qdisc add dev ens16np2 egress_block 22 clsact
Error: mlxsw_spectrum: Block cannot be bound to egress because it contains unsupported rules.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_flower: Forbid to offload mirred redirect on egress
Jiri Pirko [Sat, 27 Jul 2019 17:32:55 +0000 (20:32 +0300)]
mlxsw: spectrum_flower: Forbid to offload mirred redirect on egress

Spectrum ASIC does not support redirection on egress, so refuse to
insert such flows:

$ tc qdisc add dev ens16np1 clsact
$ tc filter add dev ens16np1 egress protocol all pref 1 handle 101 flower skip_sw action mirred egress redirect dev ens16np2
Error: mlxsw_spectrum: Redirect action is not supported on egress.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'r8169-improve-HW-csum-and-TSO-handling'
David S. Miller [Sat, 27 Jul 2019 21:25:07 +0000 (14:25 -0700)]
Merge branch 'r8169-improve-HW-csum-and-TSO-handling'

Heiner Kallweit says:

====================
r8169: improve HW csum and TSO handling

This series:
- delegates more tasks from the driver to the core
- enables HW csum and TSO per default
- copies quirks for buggy chip versions from vendor driver
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: enable HW csum and TSO
Heiner Kallweit [Fri, 26 Jul 2019 19:51:36 +0000 (21:51 +0200)]
r8169: enable HW csum and TSO

Enable HW csum and TSO per default except on known buggy chip versions.
Realtek confirmed that RTL8168evl has a HW issue with TSO.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: remove r8169_csum_workaround
Heiner Kallweit [Fri, 26 Jul 2019 19:50:34 +0000 (21:50 +0200)]
r8169: remove r8169_csum_workaround

The loop in r8169_csum_workaround is called only if in
msdn_giant_send_check a copy of the skb header needs to be made and
we don't have enough memory. Let's simply drop the packet in that case
so that we can remove r8169_csum_workaround.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: implement callback ndo_features_check
Heiner Kallweit [Fri, 26 Jul 2019 19:49:22 +0000 (21:49 +0200)]
r8169: implement callback ndo_features_check

Implement callback ndo_features_check and move all feature checks there.
This will allow us to get rid of r8169_csum_workaround() completely in
a subsequent step. Like in the vendor driver disable HW csum for short
packets on RTL8168b.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: set GSO size and segment limits
Heiner Kallweit [Fri, 26 Jul 2019 19:48:32 +0000 (21:48 +0200)]
r8169: set GSO size and segment limits

Set GSO max size and max segment number as in the vendor driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: remove printk
Jonathan Lemon [Fri, 26 Jul 2019 19:16:09 +0000 (12:16 -0700)]
ipv6: remove printk

ipv6_find_hdr() prints a non-rate limited error message
when it cannot find an ipv6 header at a specific offset.
This could be used as a DoS, so just remove it.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: align setting PME with vendor driver
Heiner Kallweit [Fri, 26 Jul 2019 18:56:20 +0000 (20:56 +0200)]
r8169: align setting PME with vendor driver

Align setting PME with the vendor driver. PMEnable is writable on
RTL8169 only, on later chip versions it's read-only. PME_SIGNAL is
used on chip versions from RTL8168evl with the exception of the
RTL8168f family.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlx4/en_netdev: allow offloading VXLAN over VLAN
Davide Caratti [Fri, 26 Jul 2019 18:18:12 +0000 (20:18 +0200)]
mlx4/en_netdev: allow offloading VXLAN over VLAN

ConnectX-3 Pro can offload transmission of VLAN packets with VXLAN inside:
enable tunnel offloads in dev->vlan_features, like it's done with other
NIC drivers (e.g. be2net and ixgbe).

It's no more necessary to change dev->hw_enc_features when VXLAN are added
or removed, since .ndo_features_check() already checks for VXLAN packet
where the UDP destination port matches the configured value. Just set
dev->hw_enc_features when the NIC is initialized, so that overlying VLAN
can correctly inherit the tunnel offload capabilities.

Changes since v1:
- avoid flipping hw_enc_features, instead of calling netdev notifiers,
  thanks to Saeed Mahameed
- squash two patches into a single one

CC: Paolo Abeni <pabeni@redhat.com>
CC: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodrivers: net: xgene: Move status variable declaration into CONFIG_ACPI block
Nathan Chancellor [Fri, 26 Jul 2019 16:20:37 +0000 (09:20 -0700)]
drivers: net: xgene: Move status variable declaration into CONFIG_ACPI block

When CONFIG_ACPI is unset (arm allyesconfig), status is unused.

drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c:383:14: warning:
unused variable 'status' [-Wunused-variable]
        acpi_status status;
                    ^
drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c:440:14: warning:
unused variable 'status' [-Wunused-variable]
        acpi_status status;
                    ^
drivers/net/ethernet/apm/xgene/xgene_enet_hw.c:697:14: warning: unused
variable 'status' [-Wunused-variable]
        acpi_status status;
                    ^

Move the declaration into the CONFIG_ACPI block so that there are no
compiler warnings.

Fixes: 570d785ba46b ("drivers: net: xgene: Remove acpi_has_method() calls")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Do not request stmmaceth clock
Thierry Reding [Fri, 26 Jul 2019 10:27:41 +0000 (12:27 +0200)]
net: stmmac: Do not request stmmaceth clock

The stmmaceth clock is specified by the slave_bus and apb_pclk clocks in
the device tree bindings for snps,dwc-qos-ethernet-4.10 compatible nodes
of this IP.

The subdrivers for these bindings will be requesting the stmmac clock
correctly at a later point, so there is no need to request it here and
cause an error message to be printed to the kernel log.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Make MDIO bus reset optional
Thierry Reding [Fri, 26 Jul 2019 10:27:40 +0000 (12:27 +0200)]
net: stmmac: Make MDIO bus reset optional

The Tegra EQOS driver already resets the MDIO bus at probe time via the
reset GPIO specified in the phy-reset-gpios device tree property. There
is no need to reset the bus again later on.

This avoids the need to query the device tree for the snps,reset GPIO,
which is not part of the Tegra EQOS device tree bindings. This quiesces
an error message from the generic bus reset code if it doesn't find the
snps,reset related delays.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: neigh: remove redundant assignment to variable bucket
Colin Ian King [Fri, 26 Jul 2019 09:46:11 +0000 (10:46 +0100)]
net: neigh: remove redundant assignment to variable bucket

The variable bucket is being initialized with a value that is never
read and it is being updated later with a new value in a following
for-loop. The initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosis900: add support for ethtool's EEPROM dump
Sergej Benilov [Thu, 25 Jul 2019 19:48:06 +0000 (21:48 +0200)]
sis900: add support for ethtool's EEPROM dump

Implement ethtool's EEPROM dump command (ethtool -e|--eeprom-dump).

Thx to Andrew Lunn for comments.

Signed-off-by: Sergej Benilov <sergej.benilov@googlemail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agostaging: octeon: Fix build failure due to typo.
David S. Miller [Fri, 26 Jul 2019 21:10:30 +0000 (14:10 -0700)]
staging: octeon: Fix build failure due to typo.

drivers/staging/octeon/ethernet-tx.c:287:23: error: implicit declaration of function 'skb_drag_size'; did you mean 'skb_frag_size'? [-Werror=implicit-function-declaration]

From kernelci report:

https://kernelci.org/build/id/5d3943f859b514103f688918/logs/

Fixes: 92493a2f8a8d ("Build fixes for skb_frag_size conversion")
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvneta: use devm_platform_ioremap_resource() to simplify code
Jisheng Zhang [Thu, 25 Jul 2019 07:48:04 +0000 (07:48 +0000)]
net: mvneta: use devm_platform_ioremap_resource() to simplify code

devm_platform_ioremap_resource() wraps platform_get_resource() and
devm_ioremap_resource() in a single helper, let's use that helper to
simplify the code.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'tipc-link-changeover-issues'
David S. Miller [Thu, 25 Jul 2019 22:55:47 +0000 (15:55 -0700)]
Merge branch 'tipc-link-changeover-issues'

Tuong Lien says:

====================
tipc: link changeover issues

This patch series is to resolve some issues found with the current link
changeover mechanism, it also includes an optimization for the link
synching.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotipc: fix changeover issues due to large packet
Tuong Lien [Wed, 24 Jul 2019 01:56:12 +0000 (08:56 +0700)]
tipc: fix changeover issues due to large packet

In conjunction with changing the interfaces' MTU (e.g. especially in
the case of a bonding) where the TIPC links are brought up and down
in a short time, a couple of issues were detected with the current link
changeover mechanism:

1) When one link is up but immediately forced down again, the failover
procedure will be carried out in order to failover all the messages in
the link's transmq queue onto the other working link. The link and node
state is also set to FAILINGOVER as part of the process. The message
will be transmited in form of a FAILOVER_MSG, so its size is plus of 40
bytes (= the message header size). There is no problem if the original
message size is not larger than the link's MTU - 40, and indeed this is
the max size of a normal payload messages. However, in the situation
above, because the link has just been up, the messages in the link's
transmq are almost SYNCH_MSGs which had been generated by the link
synching procedure, then their size might reach the max value already!
When the FAILOVER_MSG is built on the top of such a SYNCH_MSG, its size
will exceed the link's MTU. As a result, the messages are dropped
silently and the failover procedure will never end up, the link will
not be able to exit the FAILINGOVER state, so cannot be re-established.

2) The same scenario above can happen more easily in case the MTU of
the links is set differently or when changing. In that case, as long as
a large message in the failure link's transmq queue was built and
fragmented with its link's MTU > the other link's one, the issue will
happen (there is no need of a link synching in advance).

3) The link synching procedure also faces with the same issue but since
the link synching is only started upon receipt of a SYNCH_MSG, dropping
the message will not result in a state deadlock, but it is not expected
as design.

The 1) & 3) issues are resolved by the last commit that only a dummy
SYNCH_MSG (i.e. without data) is generated at the link synching, so the
size of a FAILOVER_MSG if any then will never exceed the link's MTU.

For the 2) issue, the only solution is trying to fragment the messages
in the failure link's transmq queue according to the working link's MTU
so they can be failovered then. A new function is made to accomplish
this, it will still be a TUNNEL PROTOCOL/FAILOVER MSG but if the
original message size is too large, it will be fragmented & reassembled
at the receiving side.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotipc: optimize link synching mechanism
Tuong Lien [Wed, 24 Jul 2019 01:56:11 +0000 (08:56 +0700)]
tipc: optimize link synching mechanism

This commit along with the next one are to resolve the issues with the
link changeover mechanism. See that commit for details.

Basically, for the link synching, from now on, we will send only one
single ("dummy") SYNCH message to peer. The SYNCH message does not
contain any data, just a header conveying the synch point to the peer.

A new node capability flag ("TIPC_TUNNEL_ENHANCED") is introduced for
backward compatible!

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Suggested-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoptp: ptp_dte: remove redundant dev_err message
Ding Xiang [Tue, 23 Jul 2019 08:54:05 +0000 (16:54 +0800)]
ptp: ptp_dte: remove redundant dev_err message

devm_ioremap_resource already contains error message, so remove
the redundant dev_err message

Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'mlxsw-Two-small-updates'
David S. Miller [Thu, 25 Jul 2019 18:36:19 +0000 (11:36 -0700)]
Merge branch 'mlxsw-Two-small-updates'

Ido Schimmel says:

====================
mlxsw: Two small updates

Patch #1, from Amit, exposes the size of the key-value database (KVD)
where different entries (e.g., routes, neighbours) are stored in the
device. This allows users to understand how many entries can be
offloaded and is also useful for writing scale tests.

Patch #2 increases the number of IPv6 nexthop groups mlxsw can offload.
The problem and solution are explained in detail in the commit message.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_router: Increase scale of IPv6 nexthop groups
Ido Schimmel [Tue, 23 Jul 2019 07:57:42 +0000 (10:57 +0300)]
mlxsw: spectrum_router: Increase scale of IPv6 nexthop groups

Unlike IPv4, the kernel does not consolidate IPv6 nexthop groups. To
avoid exhausting the device's adjacency table - where nexthops are
stored - the driver does this consolidation instead.

Each nexthop group is hashed by XOR-ing the interface indexes of all the
member nexthop devices. However, the ifindex itself is not hashed, which
can result in identical keys used for different groups and finally an
-EBUSY error from rhashtable due to too long objects list.

Improve the situation by hashing the ifindex itself.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum: Expose KVD size for Spectrum-2
Amit Cohen [Tue, 23 Jul 2019 07:57:41 +0000 (10:57 +0300)]
mlxsw: spectrum: Expose KVD size for Spectrum-2

Unlike Spectrum-1, the KVD (Key-value database) of Spectrum-2 is not
partitioned, so only expose the entire KVD size. This enables users to
query the total size of the KVD.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sfc: falcon: convert to i2c_new_dummy_device
Wolfram Sang [Mon, 22 Jul 2019 17:26:35 +0000 (19:26 +0200)]
net: sfc: falcon: convert to i2c_new_dummy_device

Move from i2c_new_dummy() to i2c_new_dummy_device(). So, we now get an
ERRPTR which we use in error handling.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlx4: avoid large stack usage in mlx4_init_hca()
Arnd Bergmann [Mon, 22 Jul 2019 15:01:55 +0000 (17:01 +0200)]
mlx4: avoid large stack usage in mlx4_init_hca()

The mlx4_dev_cap and mlx4_init_hca_param are really too large
to be put on the kernel stack, as shown by this clang warning:

drivers/net/ethernet/mellanox/mlx4/main.c:3304:12: error: stack frame size of 1088 bytes in function 'mlx4_load_one' [-Werror,-Wframe-larger-than=]

With gcc, the problem is the same, but it does not warn because
it does not inline this function, and therefore stays just below
the warning limit, while clang is just above it.

Use kzalloc for dynamic allocation instead of putting them
on stack. This gets the combined stack frame down to 424 bytes.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: reduce maximum stack frame size
Arnd Bergmann [Mon, 22 Jul 2019 15:01:23 +0000 (17:01 +0200)]
qed: reduce maximum stack frame size

clang warns about an overly large stack frame in one function
when it decides to inline all __qed_get_vport_*() functions into
__qed_get_vport_stats():

drivers/net/ethernet/qlogic/qed/qed_l2.c:1889:13: error: stack frame size of 1128 bytes in function '_qed_get_vport_stats' [-Werror,-Wframe-larger-than=]

Use a noinline_for_stack annotation to prevent clang from inlining
these, which keeps the maximum stack usage at around half of that
in the worst case, similar to what we get with gcc.

Fixes: 86622ee75312 ("qed: Move statistics to L2 code")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: improve rtl_set_rx_mode
Heiner Kallweit [Wed, 24 Jul 2019 21:34:45 +0000 (23:34 +0200)]
r8169: improve rtl_set_rx_mode

This patch improves and simplifies rtl_set_rx_mode a little.
No functional change intended.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Wed, 24 Jul 2019 22:35:40 +0000 (15:35 -0700)]
Merge branch '1GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2019-07-24

This series contains updates to igc and e1000e client drivers only.

Sasha provides a couple of cleanups to remove code that is not needed
and reduce structure sizes.  Updated the MAC reset flow to use the
device reset flow instead of a port reset flow.  Added addition device
id's that will be supported.

Kai-Heng Feng provides a workaround for a possible stalled packet issue
in our ICH devices due to a clock recovery from the PCH being too slow.

v2: removed the last patch in the series that supposedly fixed a MAC/PHY
    de-sync potential issue while waiting for additional information from
    hardware engineers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/ixgbevf: fix a compilation error of skb_frag_t
Qian Cai [Wed, 24 Jul 2019 16:17:59 +0000 (12:17 -0400)]
net/ixgbevf: fix a compilation error of skb_frag_t

The linux-next commit "net: Rename skb_frag_t size to bv_len" [1]
introduced a compilation error on powerpc as it forgot to deal with the
renaming from "size" to "bv_len" for ixgbevf.

[1] https://lore.kernel.org/netdev/20190723030831.11879-1-willy@infradead.org/T/#md052f1c7de965ccd1bdcb6f92e1990a52298eac5

In file included from ./include/linux/cache.h:5,
                 from ./include/linux/printk.h:9,
                 from ./include/linux/kernel.h:15,
                 from ./include/linux/list.h:9,
                 from ./include/linux/module.h:9,
                 from
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:12:
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c: In function
'ixgbevf_xmit_frame_ring':
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:4138:51: error:
'skb_frag_t' {aka 'struct bio_vec'} has no member named 'size'
   count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size);
                                                   ^
./include/uapi/linux/kernel.h:13:40: note: in definition of macro
'__KERNEL_DIV_ROUND_UP'
 #define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
                                        ^
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:4138:12: note: in
expansion of macro 'TXD_USE_COUNT'
   count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size);

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: mlxsw: Fix typo in qos_mc_aware.sh
Masanari Iida [Wed, 24 Jul 2019 15:29:51 +0000 (00:29 +0900)]
selftests: mlxsw: Fix typo in qos_mc_aware.sh

This patch fix some spelling typo in qos_mc_aware.sh

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqlge: Fix build error without CONFIG_ETHERNET
YueHaibing [Wed, 24 Jul 2019 13:01:26 +0000 (21:01 +0800)]
qlge: Fix build error without CONFIG_ETHERNET

Now if CONFIG_ETHERNET is not set, QLGE driver
building fails:

drivers/staging/qlge/qlge_main.o: In function `qlge_remove':
drivers/staging/qlge/qlge_main.c:4831: undefined reference to `unregister_netdev'

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 955315b0dc8c ("qlge: Move drivers/net/ethernet/qlogic/qlge/ to drivers/staging/qlge/")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: fix a typo in a comment
Corentin Musard [Wed, 24 Jul 2019 12:34:43 +0000 (14:34 +0200)]
r8169: fix a typo in a comment

Replace "additonal" by "additional" in a comment.
Typo found by checkpatch.pl.

Signed-off-by: Corentin Musard <corentinmusard@gmail.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoe1000e: add workaround for possible stalled packet
Kai-Heng Feng [Mon, 8 Jul 2019 04:55:45 +0000 (12:55 +0800)]
e1000e: add workaround for possible stalled packet

This works around a possible stalled packet issue, which may occur due to
clock recovery from the PCH being too slow, when the LAN is transitioning
from K1 at 1G link speed.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204057

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>