platform/kernel/linux-rpi.git
4 years agos390/qeth: relax locking for ipato config data
Julian Wiedmann [Wed, 23 Sep 2020 08:36:53 +0000 (10:36 +0200)]
s390/qeth: relax locking for ipato config data

card->ipato is currently protected by the conf_mutex. But most users
also hold the ip_lock - in particular qeth_l3_add_ip().

So slightly expand the sections under ip_lock in a few places (to
effectively cover a few error & no-op cases), and then drop the
conf_mutex where it's no longer needed.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agos390/qeth: don't init refcount twice for mcast IPs
Julian Wiedmann [Wed, 23 Sep 2020 08:36:52 +0000 (10:36 +0200)]
s390/qeth: don't init refcount twice for mcast IPs

mcast IP objects are allocated within qeth_l3_add_mcast_rtnl(),
with .ref_counter already set to 1 via qeth_l3_init_ipaddr().

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: microchip: Make `lan743x_pm_suspend` function return right value
Zheng Yongjun [Wed, 23 Sep 2020 03:21:40 +0000 (11:21 +0800)]
net: microchip: Make `lan743x_pm_suspend` function return right value

drivers/net/ethernet/microchip/lan743x_main.c: In function lan743x_pm_suspend:

`ret` is set but not used. In fact, `pci_prepare_to_sleep` function value should
be the right value of `lan743x_pm_suspend` function, therefore, fix it.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'mlx5-updates-2020-09-21' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Wed, 23 Sep 2020 00:44:59 +0000 (17:44 -0700)]
Merge tag 'mlx5-updates-2020-09-21' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-09-21

Multi packet TX descriptor support for SKBs.

This series introduces some refactoring of the regular TX data path in
mlx5 and adds the Enhanced TX MPWQE feature support. MPWQE stands for
multi-packet work queue element, and it can serve multiple packets,
reducing the PCI bandwidth spent on control traffic. It should improve
performance in scenarios where PCI is the bottleneck, and xmit_more is
signaled by the kernel. The refactoring done in this series also
improves the packet rate on its own.

MPWQE is already implemented in the XDP tx path, this series adds the
support of MPWQE for regular kernel SKB tx path.

MPWQE is supported from ConnectX-5 and onward, for legacy devices we need
to keep backward compatibility for regular (Single packet) WQE descriptor.

MPWQE is not compatible with certain offloads and features, such as TLS
offload, TSO, nonlinear SKBs. If such incompatible features are in use,
the driver gracefully falls back to non-MPWQE per SKB.

Prior to the final patch "net/mlx5e: Enhanced TX MPWQE for SKBs" that adds
the actual support, Maxim did some refactoring to the tx data path to
split it into stages and smaller helper functions that can be utilized and
reused for both legacy and new MPWQE feature.

Performance testing:

UDP performance is improved in a single stream pktgen test:
  Packet rate: 16.86 Mpps (±0.15 Mpps) -> 20.94 Mpps (±0.33 Mpps)
  Instructions per packet: 434 -> 329
  Cycles per packet: 158 -> 123
  Instructions per cycle: 2.75 -> 2.67

TCP and XDP_TX single stream tests show no performance difference.

MPWQE can reduce PCI bandwidth:
  PCI Gen2, pktgen at fixed rate of 36864000 pps on 24 CPU cores:
    Inbound PCI utilization with MPWQE off: 80.3%
    Inbound PCI utilization with MPWQE on: 59.0%
  PCI Gen3, pktgen at fixed rate of 56064000 pps on 24 CPU cores:
    Inbound PCI utilization with MPWQE off: 65.4%
    Inbound PCI utilization with MPWQE on: 49.3%

MPWQE can also reduce CPU load, increasing the packet rate in case of
CPU bottleneck:
  PCI Gen2, pktgen at full rate on 24 CPU cores:
    Packet rate with MPWQE off: 37.5 Mpps
    Packet rate with MPWQE on: 49.0 Mpps
  PCI Gen3, pktgen at full rate on 24 CPU cores:
    Packet rate with MPWQE off: 57.0 Mpps
    Packet rate with MPWQE on: 66.8 Mpps

Burst size in all pktgen tests is 32.

CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64)
NIC: Mellanox ConnectX-6 Dx
GCC 10.2.0
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'devlink-Use-nla_policy-to-validate-range'
David S. Miller [Wed, 23 Sep 2020 00:38:42 +0000 (17:38 -0700)]
Merge branch 'devlink-Use-nla_policy-to-validate-range'

Parav Pandit says:

====================
devlink: Use nla_policy to validate range

This two small patches uses nla_policy to validate user specified
fields are in valid range or not.

Patch summary:
Patch-1 checks the range of eswitch mode field
Patch-2 checks for the port type field. It eliminates a check in
code by using nla policy infrastructure.
====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: Enhance policy to validate port type input value
Parav Pandit [Mon, 21 Sep 2020 16:41:30 +0000 (19:41 +0300)]
devlink: Enhance policy to validate port type input value

Use range checking facility of nla_policy to validate port type
attribute input value is valid or not.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: Enhance policy to validate eswitch mode value
Parav Pandit [Mon, 21 Sep 2020 16:41:29 +0000 (19:41 +0300)]
devlink: Enhance policy to validate eswitch mode value

Use range checking facility of nla_policy to validate eswitch mode input
attribute value is valid or not.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
David S. Miller [Tue, 22 Sep 2020 23:45:34 +0000 (16:45 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Two minor conflicts:

1) net/ipv4/route.c, adding a new local variable while
   moving another local variable and removing it's
   initial assignment.

2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
   One pretty prints the port mode differently, whilst another
   changes the driver to try and obtain the port mode from
   the port node rather than the switch node.

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Tue, 22 Sep 2020 22:08:41 +0000 (15:08 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
 "No common topic, just assorted fixes"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fuse: fix the ->direct_IO() treatment of iov_iter
  fs: fix cast in fsparam_u32hex() macro
  vboxsf: Fix the check for the old binary mount-arguments struct

4 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Tue, 22 Sep 2020 21:43:50 +0000 (14:43 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:

 - fix failure to add bond interfaces to a bridge, the offload-handling
   code was too defensive there and recent refactoring unearthed that.
   Users complained (Ido)

 - fix unnecessarily reflecting ECN bits within TOS values / QoS marking
   in TCP ACK and reset packets (Wei)

 - fix a deadlock with bpf iterator. Hopefully we're in the clear on
   this front now... (Yonghong)

 - BPF fix for clobbering r2 in bpf_gen_ld_abs (Daniel)

 - fix AQL on mt76 devices with FW rate control and add a couple of AQL
   issues in mac80211 code (Felix)

 - fix authentication issue with mwifiex (Maximilian)

 - WiFi connectivity fix: revert IGTK support in ti/wlcore (Mauro)

 - fix exception handling for multipath routes via same device (David
   Ahern)

 - revert back to a BH spin lock flavor for nsid_lock: there are paths
   which do require the BH context protection (Taehee)

 - fix interrupt / queue / NAPI handling in the lantiq driver (Hauke)

 - fix ife module load deadlock (Cong)

 - make an adjustment to netlink reply message type for code added in
   this release (the sole change touching uAPI here) (Michal)

 - a number of fixes for small NXP and Microchip switches (Vladimir)

[ Pull request acked by David: "you can expect more of this in the
  future as I try to delegate more things to Jakub" ]

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (167 commits)
  net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
  net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
  net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
  inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute
  net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
  net: Update MAINTAINERS for MediaTek switch driver
  net/mlx5e: mlx5e_fec_in_caps() returns a boolean
  net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock
  net/mlx5e: kTLS, Fix leak on resync error flow
  net/mlx5e: kTLS, Add missing dma_unmap in RX resync
  net/mlx5e: kTLS, Fix napi sync and possible use-after-free
  net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
  net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()
  net/mlx5e: Fix multicast counter not up-to-date in "ip -s"
  net/mlx5e: Fix endianness when calculating pedit mask first bit
  net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
  net/mlx5e: CT: Fix freeing ct_label mapping
  net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
  net/mlx5e: Use synchronize_rcu to sync with NAPI
  net/mlx5e: Use RCU to protect rq->xdp_prog
  ...

4 years agoMerge tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 22 Sep 2020 21:36:50 +0000 (14:36 -0700)]
Merge tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "A few fixes - most of them regression fixes from this cycle, but also
  a few stable heading fixes, and a build fix for the included demo tool
  since some systems now actually have gettid() available"

* tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block:
  io_uring: fix openat/openat2 unified prep handling
  io_uring: mark statx/files_update/epoll_ctl as non-SQPOLL
  tools/io_uring: fix compile breakage
  io_uring: don't use retry based buffered reads for non-async bdev
  io_uring: don't re-setup vecs/iter in io_resumit_prep() is already there
  io_uring: don't run task work on an exiting task
  io_uring: drop 'ctx' ref on task work cancelation
  io_uring: grab any needed state during defer prep

4 years agoMerge tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 22 Sep 2020 21:31:38 +0000 (14:31 -0700)]
Merge tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A few NVMe fixes, and a dasd write zero fix"

* tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block:
  nvmet: get transport reference for passthru ctrl
  nvme-core: get/put ctrl and transport module in nvme_dev_open/release()
  nvme-tcp: fix kconfig dependency warning when !CRYPTO
  nvme-pci: disable the write zeros command for Intel 600P/P3100
  s390/dasd: Fix zero write for FBA devices

4 years agoMerge tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Tue, 22 Sep 2020 16:08:33 +0000 (09:08 -0700)]
Merge tag 'trace-v5.9-rc5' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Check kprobe is enabled before unregistering from ftrace as it isn't
   registered when disabled.

 - Remove kprobes enabled via command-line that is on init text when
   freed.

 - Add missing RCU synchronization for ftrace trampoline symbols removed
   from kallsyms.

 - Free trampoline on error path if ftrace_startup() fails.

 - Give more space for the longer PID numbers in trace output.

 - Fix a possible double free in the histogram code.

 - A couple of fixes that were discovered by sparse.

* tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  bootconfig: init: make xbc_namebuf static
  kprobes: tracing/kprobes: Fix to kill kprobes on initmem after boot
  tracing: fix double free
  ftrace: Let ftrace_enable_sysctl take a kernel pointer buffer
  tracing: Make the space reserved for the pid wider
  ftrace: Fix missing synchronize_rcu() removing trampoline from kallsyms
  ftrace: Free the trampoline when ftrace_startup() fails
  kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()

4 years agonet/mlx5e: Enhanced TX MPWQE for SKBs
Maxim Mikityanskiy [Thu, 2 Jul 2020 09:37:29 +0000 (12:37 +0300)]
net/mlx5e: Enhanced TX MPWQE for SKBs

This commit adds support for Enhanced TX MPWQE feature in the regular
(SKB) data path. A MPWQE (multi-packet work queue element) can serve
multiple packets, reducing the PCI bandwidth on control traffic.

Two new stats (tx*_mpwqe_blks and tx*_mpwqe_pkts) are added. The feature
is on by default and controlled by the skb_tx_mpwqe private flag.

In a MPWQE, eseg is shared among all packets, so eseg-based offloads
(IPSEC, GENEVE, checksum) run on a separate eseg that is compared to the
eseg of the current MPWQE session to decide if the new packet can be
added to the same session.

MPWQE is not compatible with certain offloads and features, such as TLS
offload, TSO, nonlinear SKBs. If such incompatible features are in use,
the driver gracefully falls back to non-MPWQE.

This change has no performance impact in TCP single stream test and
XDP_TX single stream test.

UDP pktgen, 64-byte packets, single stream, MPWQE off:
  Packet rate: 16.96 Mpps (±0.12 Mpps) -> 17.01 Mpps (±0.20 Mpps)
  Instructions per packet: 421 -> 429
  Cycles per packet: 156 -> 161
  Instructions per cycle: 2.70 -> 2.67

UDP pktgen, 64-byte packets, single stream, MPWQE on:
  Packet rate: 16.96 Mpps (±0.12 Mpps) -> 20.94 Mpps (±0.33 Mpps)
  Instructions per packet: 421 -> 329
  Cycles per packet: 156 -> 123
  Instructions per cycle: 2.70 -> 2.67

Enabling MPWQE can reduce PCI bandwidth:
  PCI Gen2, pktgen at fixed rate of 36864000 pps on 24 CPU cores:
    Inbound PCI utilization with MPWQE off: 80.3%
    Inbound PCI utilization with MPWQE on: 59.0%
  PCI Gen3, pktgen at fixed rate of 56064000 pps on 24 CPU cores:
    Inbound PCI utilization with MPWQE off: 65.4%
    Inbound PCI utilization with MPWQE on: 49.3%

Enabling MPWQE can also reduce CPU load, increasing the packet rate in
case of CPU bottleneck:
  PCI Gen2, pktgen at full rate on 24 CPU cores:
    Packet rate with MPWQE off: 37.5 Mpps
    Packet rate with MPWQE on: 49.0 Mpps
  PCI Gen3, pktgen at full rate on 24 CPU cores:
    Packet rate with MPWQE off: 57.0 Mpps
    Packet rate with MPWQE on: 66.8 Mpps

Burst size in all pktgen tests is 32.

CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64)
NIC: Mellanox ConnectX-6 Dx
GCC 10.2.0

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Move TX code into functions to be used by MPWQE
Maxim Mikityanskiy [Mon, 9 Dec 2019 13:07:57 +0000 (15:07 +0200)]
net/mlx5e: Move TX code into functions to be used by MPWQE

mlx5e_txwqe_complete performs some actions that can be taken to separate
functions:

1. Update the flags needed for hardware timestamping.

2. Stop the TX queue if it's full.

Take these actions into separate functions to be reused by the MPWQE
code in the following commit and to maintain clear responsibilities of
functions.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Rename xmit-related structs to generalize them
Maxim Mikityanskiy [Thu, 16 Apr 2020 08:30:14 +0000 (11:30 +0300)]
net/mlx5e: Rename xmit-related structs to generalize them

As preparation for the upcoming TX MPWQE support for SKBs, rename struct
mlx5e_xdp_mpwqe to mlx5e_tx_mpwqe and move it above struct mlx5e_txqsq.
This structure will be reused in the regular SQ and in the regular TX
data path. Also rename mlx5e_xdp_xmit_data to mlx5e_xmit_data - it will
be used in the upcoming TX MPWQE flow.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Generalize TX MPWQE checks for full session
Maxim Mikityanskiy [Thu, 4 Jun 2020 13:43:27 +0000 (16:43 +0300)]
net/mlx5e: Generalize TX MPWQE checks for full session

As preparation for the upcoming TX MPWQE for SKBs, create a function
(mlx5e_tx_mpwqe_is_full) to check whether an MPWQE session is full. This
function will be shared by MPWQE code for XDP and for SKBs. Defines are
renamed and moved to make them not XDP-specific.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Support multiple SKBs in a TX WQE
Maxim Mikityanskiy [Thu, 16 Apr 2020 08:30:33 +0000 (11:30 +0300)]
net/mlx5e: Support multiple SKBs in a TX WQE

TX MPWQE support for SKBs is coming in one of the following patches, and
a single MPWQE can send multiple SKBs. This commit prepares the TX path
code to handle such cases:

1. An additional FIFO for SKBs is added, just like the FIFO for DMA
chunks.

2. struct mlx5e_tx_wqe_info will contain num_fifo_pkts. If a given WQE
contains only one packet, num_fifo_pkts will be zero, and the SKB will
be stored in mlx5e_tx_wqe_info, as usual. If num_fifo_pkts > 0, the SKB
pointer will be NULL, and the SKBs will be stored in the FIFO.

This change has no performance impact in TCP single stream test and
XDP_TX single stream test.

When compiled with a recent GCC, this change shows no visible
performance impact on UDP pktgen (burst 32) single stream test either:
  Packet rate: 16.95 Mpps (±0.15 Mpps) -> 16.96 Mpps (±0.12 Mpps)
  Instructions per packet: 429 -> 421
  Cycles per packet: 160 -> 156
  Instructions per cycle: 2.69 -> 2.70

CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64)
NIC: Mellanox ConnectX-6 Dx
GCC 10.2.0

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Move the TLS resync check out of the function
Maxim Mikityanskiy [Thu, 30 Jul 2020 14:53:46 +0000 (17:53 +0300)]
net/mlx5e: Move the TLS resync check out of the function

Before this patch, mlx5e_ktls_tx_handle_resync_dump_comp checked for
resync_dump_frag_page. It happened for all WQEs without an SKB,
including padding WQEs, and required a function call. Normally, padding
WQEs happen more often than TLS resyncs. Take this check out of the
function and put it to an inline function to save a call on all padding
WQEs.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Unify constants for WQE_EMPTY_DS_COUNT
Maxim Mikityanskiy [Thu, 9 Jul 2020 11:05:31 +0000 (14:05 +0300)]
net/mlx5e: Unify constants for WQE_EMPTY_DS_COUNT

A constant for the number of DS in an empty WQE (i.e. a WQE without data
segments) is needed in multiple places (normal TX data path, MPWQE in
XDP), but currently we have a constant for XDP and an inline formula in
normal TX. This patch introduces a common constant.

Additionally, mlx5e_xdp_mpwqe_session_start is converted to use struct
assignment, because the code nearby is touched.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Small improvements for XDP TX MPWQE logic
Maxim Mikityanskiy [Thu, 30 Jul 2020 13:14:58 +0000 (16:14 +0300)]
net/mlx5e: Small improvements for XDP TX MPWQE logic

Use MLX5E_XDP_MPW_MAX_WQEBBS to reserve space for a MPWQE, because it's
actually the maximal size a MPWQE can take.

Reorganize the logic that checks when to close the MPWQE session:

1. Put all checks into a single function.

2. When inline is on, make only one comparison - if it's false, the less
strict one will also be false. The compiler probably optimized it out
anyway, but it's clearer to also reflect it in the code.

The MLX5E_XDP_INLINE_WQE_* defines are also changed to make the
calculations more correct from the logical point of view. Though
MLX5E_XDP_INLINE_WQE_MAX_DS_CNT used to be 16 and didn't change its
value, the calculation used to be DIV_ROUND_UP(max inline packet size,
MLX5_SEND_WQE_DS), and the numerator should have included sizeof(struct
mlx5_wqe_inline_seg).

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Refactor xmit functions
Maxim Mikityanskiy [Fri, 14 Feb 2020 09:22:49 +0000 (11:22 +0200)]
net/mlx5e: Refactor xmit functions

A huge function mlx5e_sq_xmit was split into several to achieve multiple
goals:

1. Reuse the code in IPoIB.

2. Better intergrate with TLS, IPSEC, GENEVE and checksum offloads. Now
it's possible to reserve space in the WQ before running eseg-based
offloads, so:

2.1. It's not needed to copy cseg and eseg after mlx5e_fill_sq_frag_edge
anymore.

2.2. mlx5e_txqsq_get_next_pi will be used instead of the legacy
mlx5e_fill_sq_frag_edge for better code maintainability and reuse.

3. Prepare for the upcoming TX MPWQE for SKBs. It will intervene after
mlx5e_sq_calc_wqe_attr to check if it's possible to use MPWQE, and the
code flow will split into two paths: MPWQE and non-MPWQE.

Two high-level functions are provided to send packets:

* mlx5e_xmit is called by the networking stack, runs offloads and sends
the packet. In one of the following patches, MPWQE support will be added
to this flow.

* mlx5e_sq_xmit_simple is called by the TLS offload, runs only the
checksum offload and sends the packet.

This change has no performance impact in TCP single stream test and
XDP_TX single stream test.

When compiled with a recent GCC, this change shows no visible
performance impact on UDP pktgen (burst 32) single stream test either:
  Packet rate: 16.86 Mpps (±0.15 Mpps) -> 16.95 Mpps (±0.15 Mpps)
  Instructions per packet: 434 -> 429
  Cycles per packet: 158 -> 160
  Instructions per cycle: 2.75 -> 2.69

CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64)
NIC: Mellanox ConnectX-6 Dx
GCC 10.2.0

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Move mlx5e_tx_wqe_inline_mode to en_tx.c
Maxim Mikityanskiy [Tue, 8 Sep 2020 08:03:51 +0000 (11:03 +0300)]
net/mlx5e: Move mlx5e_tx_wqe_inline_mode to en_tx.c

Move mlx5e_tx_wqe_inline_mode from en/txrx.h to en_tx.c as it's only
used there.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Use struct assignment to initialize mlx5e_tx_wqe_info
Maxim Mikityanskiy [Tue, 8 Sep 2020 07:46:09 +0000 (10:46 +0300)]
net/mlx5e: Use struct assignment to initialize mlx5e_tx_wqe_info

Struct assignment guarantees that all fields of the structure are
initialized (those that are not mentioned are zeroed). It makes code
mode robust and reduces chances for unpredictable behavior when one
forgets to reset some field and it holds an old value from previous
iterations of using the structure.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Refactor inline header size calculation in the TX path
Maxim Mikityanskiy [Mon, 9 Dec 2019 13:39:32 +0000 (15:39 +0200)]
net/mlx5e: Refactor inline header size calculation in the TX path

As preparation for the next patch, don't increase ihs to calculate
ds_cnt and then decrease it, but rather calculate the intermediate value
temporarily. This code has the same amount of arithmetic operations, but
now allows to split out ds_cnt calculation, which will be performed in
the next patch.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agoMerge branch 'Fix-broken-tc-flower-rules-for-mscc_ocelot-switches'
David S. Miller [Tue, 22 Sep 2020 00:40:53 +0000 (17:40 -0700)]
Merge branch 'Fix-broken-tc-flower-rules-for-mscc_ocelot-switches'

Vladimir Oltean says:

====================
Fix broken tc-flower rules for mscc_ocelot switches

All 3 switch drivers from the Ocelot family have the same bug in the
VCAP IS2 key offsets, which is that some keys are in the incorrect
order.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Vladimir Oltean [Mon, 21 Sep 2020 22:56:38 +0000 (01:56 +0300)]
net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries

The IS2 IP4_TCP_UDP key offsets do not correspond to the VSC7514
datasheet. Whether they work or not is unknown to me. On VSC9959 and
VSC9953, with the same mistake and same discrepancy from the
documentation, tc-flower src_port and dst_port rules did not work, so I
am assuming the same is true here.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Vladimir Oltean [Mon, 21 Sep 2020 22:56:37 +0000 (01:56 +0300)]
net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries

Since these were copied from the Felix VCAP IS2 code, and only the
offsets were adjusted, the order of the bit fields is still wrong.
Fix it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Xiaoliang Yang [Mon, 21 Sep 2020 22:56:36 +0000 (01:56 +0300)]
net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries

Some of the IS2 IP4_TCP_UDP keys are not correct, like L4_DPORT,
L4_SPORT and other L4 keys. This prevents offloaded tc-flower rules from
matching on src_port and dst_port for TCP and UDP packets.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoinet_diag: validate INET_DIAG_REQ_PROTOCOL attribute
Eric Dumazet [Mon, 21 Sep 2020 14:27:20 +0000 (07:27 -0700)]
inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute

User space could send an invalid INET_DIAG_REQ_PROTOCOL attribute
as caught by syzbot.

BUG: KMSAN: uninit-value in inet_diag_lock_handler net/ipv4/inet_diag.c:55 [inline]
BUG: KMSAN: uninit-value in __inet_diag_dump+0x58c/0x720 net/ipv4/inet_diag.c:1147
CPU: 0 PID: 8505 Comm: syz-executor174 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x21c/0x280 lib/dump_stack.c:118
 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:122
 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:219
 inet_diag_lock_handler net/ipv4/inet_diag.c:55 [inline]
 __inet_diag_dump+0x58c/0x720 net/ipv4/inet_diag.c:1147
 inet_diag_dump_compat+0x2a5/0x380 net/ipv4/inet_diag.c:1254
 netlink_dump+0xb73/0x1cb0 net/netlink/af_netlink.c:2246
 __netlink_dump_start+0xcf2/0xea0 net/netlink/af_netlink.c:2354
 netlink_dump_start include/linux/netlink.h:246 [inline]
 inet_diag_rcv_msg_compat+0x5da/0x6c0 net/ipv4/inet_diag.c:1288
 sock_diag_rcv_msg+0x24f/0x620 net/core/sock_diag.c:256
 netlink_rcv_skb+0x6d7/0x7e0 net/netlink/af_netlink.c:2470
 sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:275
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x11c8/0x1490 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x173a/0x1840 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg net/socket.c:671 [inline]
 ____sys_sendmsg+0xc82/0x1240 net/socket.c:2353
 ___sys_sendmsg net/socket.c:2407 [inline]
 __sys_sendmsg+0x6d1/0x820 net/socket.c:2440
 __do_sys_sendmsg net/socket.c:2449 [inline]
 __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x441389
Code: e8 fc ab 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 1b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fff3b02ce98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441389
RDX: 0000000000000000 RSI: 0000000020001500 RDI: 0000000000000003
RBP: 00000000006cb018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000402130
R13: 00000000004021c0 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:143 [inline]
 kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:126
 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:80
 slab_alloc_node mm/slub.c:2907 [inline]
 __kmalloc_node_track_caller+0x9aa/0x12f0 mm/slub.c:4511
 __kmalloc_reserve net/core/skbuff.c:142 [inline]
 __alloc_skb+0x35f/0xb30 net/core/skbuff.c:210
 alloc_skb include/linux/skbuff.h:1094 [inline]
 netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline]
 netlink_sendmsg+0xdb9/0x1840 net/netlink/af_netlink.c:1894
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg net/socket.c:671 [inline]
 ____sys_sendmsg+0xc82/0x1240 net/socket.c:2353
 ___sys_sendmsg net/socket.c:2407 [inline]
 __sys_sendmsg+0x6d1/0x820 net/socket.c:2440
 __do_sys_sendmsg net/socket.c:2449 [inline]
 __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 3f935c75eb52 ("inet_diag: support for wider protocol numbers")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Cc: Mat Martineau <mathew.j.martineau@linux.intel.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
Vladimir Oltean [Mon, 21 Sep 2020 22:07:09 +0000 (01:07 +0300)]
net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU

When calling the RCU brother of br_vlan_get_pvid(), lockdep warns:

=============================
WARNING: suspicious RCU usage
5.9.0-rc3-01631-g13c17acb8e38-dirty #814 Not tainted
-----------------------------
net/bridge/br_private.h:1054 suspicious rcu_dereference_protected() usage!

Call trace:
 lockdep_rcu_suspicious+0xd4/0xf8
 __br_vlan_get_pvid+0xc0/0x100
 br_vlan_get_pvid_rcu+0x78/0x108

The warning is because br_vlan_get_pvid_rcu() calls nbp_vlan_group()
which calls rtnl_dereference() instead of rcu_dereference(). In turn,
rtnl_dereference() calls rcu_dereference_protected() which assumes
operation under an RCU write-side critical section, which obviously is
not the case here. So, when the incorrect primitive is used to access
the RCU-protected VLAN group pointer, READ_ONCE() is not used, which may
cause various unexpected problems.

I'm sad to say that br_vlan_get_pvid() and br_vlan_get_pvid_rcu() cannot
share the same implementation. So fix the bug by splitting the 2
functions, and making br_vlan_get_pvid_rcu() retrieve the VLAN groups
under proper locking annotations.

Fixes: 7582f5b70f9a ("bridge: add br_vlan_get_pvid_rcu()")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'mlx5-fixes-2020-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Tue, 22 Sep 2020 00:32:42 +0000 (17:32 -0700)]
Merge tag 'mlx5-fixes-2020-09-18' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes-2020-09-18

This series introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

v1->v2:
 Remove missing patch from -stable list.

For -stable v5.1
 ('net/mlx5: Fix FTE cleanup')

For -stable v5.3
 ('net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported')
 ('net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported')

For -stable v5.7
 ('net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready')

For -stable v5.8
 ('net/mlx5e: Use RCU to protect rq->xdp_prog')
 ('net/mlx5e: Fix endianness when calculating pedit mask first bit')
 ('net/mlx5e: Use synchronize_rcu to sync with NAPI')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: Update MAINTAINERS for MediaTek switch driver
Sean Wang [Mon, 21 Sep 2020 23:09:23 +0000 (07:09 +0800)]
net: Update MAINTAINERS for MediaTek switch driver

Update maintainers for MediaTek switch driver with Landen Chao who is
familiar with MediaTek MT753x switch devices and will help maintenance
from the vendor side.

Cc: Steven Liu <steven.liu@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Landen Chao <Landen.Chao@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5e: mlx5e_fec_in_caps() returns a boolean
Saeed Mahameed [Fri, 11 Sep 2020 18:00:06 +0000 (11:00 -0700)]
net/mlx5e: mlx5e_fec_in_caps() returns a boolean

Returning errno is a bug, fix that.

Also fixes smatch warnings:
drivers/net/ethernet/mellanox/mlx5/core/en/port.c:453
mlx5e_fec_in_caps() warn: signedness bug returning '(-95)'

Fixes: 2132b71f78d2 ("net/mlx5e: Advertise globaly supported FEC modes")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
4 years agonet/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock
Saeed Mahameed [Tue, 8 Sep 2020 05:54:43 +0000 (22:54 -0700)]
net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock

The spinlock only needed when accessing the channel's icosq, grab the lock
after the buf allocation in resync_post_get_progress_params() to avoid
kzalloc(GFP_KERNEL) in atomic context.

Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Reported-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
4 years agonet/mlx5e: kTLS, Fix leak on resync error flow
Saeed Mahameed [Fri, 11 Sep 2020 04:16:04 +0000 (21:16 -0700)]
net/mlx5e: kTLS, Fix leak on resync error flow

Resync progress params buffer and dma weren't released on error,
Add missing error unwinding for resync_post_get_progress_params().

Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
4 years agonet/mlx5e: kTLS, Add missing dma_unmap in RX resync
Saeed Mahameed [Tue, 8 Sep 2020 05:58:50 +0000 (22:58 -0700)]
net/mlx5e: kTLS, Add missing dma_unmap in RX resync

Progress params dma address is never unmapped, unmap it when completion
handling is over.

Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
4 years agonet/mlx5e: kTLS, Fix napi sync and possible use-after-free
Tariq Toukan [Mon, 10 Aug 2020 12:59:41 +0000 (15:59 +0300)]
net/mlx5e: kTLS, Fix napi sync and possible use-after-free

Using synchronize_rcu() is sufficient to wait until running NAPI quits.

See similar upstream fix with detailed explanation:
("net/mlx5e: Use synchronize_rcu to sync with NAPI")

This change also fixes a possible use-after-free as the NAPI
might be already released at this stage.

Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
Tariq Toukan [Sun, 28 Jun 2020 10:06:06 +0000 (13:06 +0300)]
net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported

The set of TLS TX global SW counters in mlx5e_tls_sw_stats_desc
is updated from all rings by using atomic ops.
This set of stats is used only in the FPGA TLS use case, not in
the Connect-X TLS one, where regular per-ring counters are used.

Do not expose them in the Connect-X use case, as this would cause
counter duplication. For example, tx_tls_drop_no_sync_data would
appear twice in the ethtool stats.

Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()
Alaa Hleihel [Tue, 25 Aug 2020 07:41:50 +0000 (10:41 +0300)]
net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()

The cited commit started to reuse function mlx5e_update_ndo_stats() for
the representors as well.
However, the function is hard-coded to work on mlx5e_nic_stats_grps only.
Due to this issue, the representors statistics were not updated in the
output of "ip -s".

Fix it to work with the correct group by extracting it from the caller's
profile.

Also, while at it and since this function became generic, move it to
en_stats.c and rename it accordingly.

Fixes: 8a236b15144b ("net/mlx5e: Convert rep stats to mlx5e_stats_grp-based infra")
Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Fix multicast counter not up-to-date in "ip -s"
Ron Diskin [Sun, 10 May 2020 11:39:51 +0000 (14:39 +0300)]
net/mlx5e: Fix multicast counter not up-to-date in "ip -s"

Currently the FW does not generate events for counters other than error
counters. Unlike ".get_ethtool_stats", ".ndo_get_stats64" (which ip -s
uses) might run in atomic context, while the FW interface is non atomic.
Thus, 'ip' is not allowed to issue FW commands, so it will only display
cached counters in the driver.

Add a SW counter (mcast_packets) in the driver to count rx multicast
packets. The counter also counts broadcast packets, as we consider it a
special case of multicast.
Use the counter value when calling "ip -s"/"ifconfig".

Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality")
Signed-off-by: Ron Diskin <rondi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Fix endianness when calculating pedit mask first bit
Maor Dickman [Wed, 2 Sep 2020 13:49:52 +0000 (16:49 +0300)]
net/mlx5e: Fix endianness when calculating pedit mask first bit

The field mask value is provided in network byte order and has to
be converted to host byte order before calculating pedit mask
first bit.

Fixes: 88f30bbcbaaa ("net/mlx5e: Bit sized fields rewrite support")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
Maor Dickman [Wed, 5 Aug 2020 14:56:04 +0000 (17:56 +0300)]
net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported

The cited commit creates peer miss group during switchdev mode
initialization in order to handle miss packets correctly while in VF
LAG mode. This is done regardless of FW support of such groups which
could cause rules setups failure later on.

Fix by adding FW capability check before creating peer groups/rule.

Fixes: ac004b832128 ("net/mlx5e: E-Switch, Add peer miss rules")
Signed-off-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: CT: Fix freeing ct_label mapping
Roi Dayan [Sun, 26 Jul 2020 13:37:47 +0000 (16:37 +0300)]
net/mlx5e: CT: Fix freeing ct_label mapping

Add missing mapping remove call when removing ct rule,
as the mapping was allocated when ct rule was adding with ct_label.
Also there is a missing mapping remove call in error flow.

Fixes: 54b154ecfb8c ("net/mlx5e: CT: Map 128 bits labels to 32 bit map ID")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
Jianbo Liu [Tue, 7 Jul 2020 06:16:24 +0000 (06:16 +0000)]
net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready

When deleting vxlan flow rule under multipath, tun_info in parse_attr is
not freed when the rule is not ready.

Fixes: ef06c9ee8933 ("net/mlx5e: Allow one failure when offloading tc encap rules under multipath")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Use synchronize_rcu to sync with NAPI
Maxim Mikityanskiy [Thu, 11 Jun 2020 11:25:19 +0000 (14:25 +0300)]
net/mlx5e: Use synchronize_rcu to sync with NAPI

As described in the previous commit, napi_synchronize doesn't quite fit
the purpose when we just need to wait until the currently running NAPI
quits. Its implementation waits until NAPI is not running by polling and
waiting for 1ms in between. In cases where we need to deactivate one
queue (e.g., recovery flows) or where we deactivate them one-by-one
(deactivate channel flow), we may get stuck in napi_synchronize forever
if other queues keep NAPI active, causing a soft lockup. Depending on
kernel configuration (CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC), it may result
in a kernel panic.

To fix the issue, use synchronize_rcu to wait for NAPI to quit, and wrap
the whole NAPI in rcu_read_lock.

Fixes: acc6c5953af1 ("net/mlx5e: Split open/close channels to stages")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Use RCU to protect rq->xdp_prog
Maxim Mikityanskiy [Thu, 11 Jun 2020 10:55:19 +0000 (13:55 +0300)]
net/mlx5e: Use RCU to protect rq->xdp_prog

Currently, the RQs are temporarily deactivated while hot-replacing the
XDP program, and napi_synchronize is used to make sure rq->xdp_prog is
not in use. However, napi_synchronize is not ideal: instead of waiting
till the end of a NAPI cycle, it polls and waits until NAPI is not
running, sleeping for 1ms between the periodic checks. Under heavy
workloads, this loop will never end, which may even lead to a kernel
panic if the kernel detects the hangup. Such workloads include XSK TX
and possibly also heavy RX (XSK or normal).

The fix is inspired by commit 326fe02d1ed6 ("net/mlx4_en: protect
ring->xdp_prog with rcu_read_lock"). As mlx5e_xdp_handle is already
protected by rcu_read_lock, and bpf_prog_put uses call_rcu to free the
program, there is no need for additional synchronization if proper RCU
functions are used to access the pointer. This patch converts all
accesses to rq->xdp_prog to use RCU functions.

Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Fix FTE cleanup
Maor Gottlieb [Mon, 31 Aug 2020 17:50:42 +0000 (20:50 +0300)]
net/mlx5: Fix FTE cleanup

Currently, when an FTE is allocated, its refcount is decreased to 0
with the purpose it will not be a stand alone steering object and every
rule (destination) of the FTE would increase the refcount.
When mlx5_cleanup_fs is called while not all rules were deleted by the
steering users, it hit refcount underflow on the FTE once clean_tree
calls to tree_remove_node after the deleted rules already decreased
the refcount to 0.

FTE is no longer destroyed implicitly when the last rule (destination)
is deleted. mlx5_del_flow_rules avoids it by increasing the refcount on
the FTE and destroy it explicitly after all rules were deleted. So we
can avoid the refcount underflow by making FTE as stand alone object.
In addition need to set del_hw_func to FTE so the HW object will be
destroyed when the FTE is deleted from the cleanup_tree flow.

refcount_t: underflow; use-after-free.
WARNING: CPU: 2 PID: 15715 at lib/refcount.c:28 refcount_warn_saturate+0xd9/0xe0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
 tree_put_node+0xf2/0x140 [mlx5_core]
 clean_tree+0x4e/0xf0 [mlx5_core]
 clean_tree+0x4e/0xf0 [mlx5_core]
 clean_tree+0x4e/0xf0 [mlx5_core]
 clean_tree+0x5f/0xf0 [mlx5_core]
 clean_tree+0x4e/0xf0 [mlx5_core]
 clean_tree+0x5f/0xf0 [mlx5_core]
 mlx5_cleanup_fs+0x26/0x270 [mlx5_core]
 mlx5_unload+0x2e/0xa0 [mlx5_core]
 mlx5_unload_one+0x51/0x120 [mlx5_core]
 mlx5_devlink_reload_down+0x51/0x90 [mlx5_core]
 devlink_reload+0x39/0x120
 ? devlink_nl_cmd_reload+0x43/0x220
 genl_rcv_msg+0x1e4/0x420
 ? genl_family_rcv_msg_attrs_parse+0x100/0x100
 netlink_rcv_skb+0x47/0x110
 genl_rcv+0x24/0x40
 netlink_unicast+0x217/0x2f0
 netlink_sendmsg+0x30f/0x430
 sock_sendmsg+0x30/0x40
 __sys_sendto+0x10e/0x140
 ? handle_mm_fault+0xc4/0x1f0
 ? do_page_fault+0x33f/0x630
 __x64_sys_sendto+0x24/0x30
 do_syscall_64+0x48/0x130
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 718ce4d601db ("net/mlx5: Consolidate update FTE for all removal changes")
Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet: phy: bcm7xxx: Add an entry for BCM72113
Florian Fainelli [Mon, 21 Sep 2020 22:10:53 +0000 (15:10 -0700)]
net: phy: bcm7xxx: Add an entry for BCM72113

BCM72113 features a 28nm integrated EPHY, add an entry to the driver for
it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'linux-can-next-for-5.10-20200921' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Mon, 21 Sep 2020 21:57:05 +0000 (14:57 -0700)]
Merge tag 'linux-can-next-for-5.10-20200921' of git://git./linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2020-09-21

this is a pull request of 38 patches for net-next.

the first 5 patches are by Colin Ian King, Alexandre Belloni and me and they
fix various spelling mistakes.

The next patch is by me and fixes the indention in the CAN raw protocol
according to the kernel coding style.

Diego Elio Pettenò contributes two patches to fix dead links in CAN's Kconfig.

Masahiro Yamada's patch removes the "WITH Linux-syscall-note" from SPDX tag of
C files.

AThe next 4 patches are by me and target the CAN device infrastructure and add
error propagation and improve the output of various messages to ease driver
development and debugging.

YueHaibing's patch for the c_can driver removes an unused inline function.

Next follows another patch by Colin Ian King, which removes the unneeded
initialization of a variable in the mcba_usb driver.

A patch by me annotates a fallthrough in the mscan driver.

The ti_hecc driver is converted to use devm_platform_ioremap_resource_byname()
in a patch by Dejin Zheng.

Liu Shixin's patch converts the pcan_usb_pro driver to make use of
le32_add_cpu() instead of open coding it.

Wang Hai's patch for the peak_pciefd_main driver removes an unused makro.

Vaibhav Gupta's patch converts the pch_can driver to generic power management.

Stephane Grosjean improves the pcan_usb usb driver by first documenting the
commands sent to the device and by adding support of rxerr/txerr counters.

The next patch is by me and cleans up the Kconfig of the CAN SPI drivers.

The next 6 patches all target the mcp251x driver, they are by Timo Schlüßler,
Andy Shevchenko, Tim Harvey and me. They update the DT bindings documentation,
sort the include files alphabetically, add GPIO support, make use of the
readx_poll_timeout() helper, and add support for half duplex SPI-controllers.

Wolfram Sang contributes a patch to update the contact email address in the
mscan driver, while Zhang Changzhong updates the clock handling.

The next patch is by and updates the rx-offload infrastructure to support
callback less usage.

The last 6 patches add support for the mcp25xxfd CAN SPI driver. First the
dt-bindings are added by Oleksij Rempel, the regmap infrastructure and the main
driver is contributed by me. Kurt Van Dijck adds listen-only support,
Manivannan Sadhasivam adds himself as maintainer, and Thomas Kopp himself as a
reviewer.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'mac80211-next-for-net-next-2020-09-21' of git://git.kernel.org/pub/scm...
David S. Miller [Mon, 21 Sep 2020 21:55:09 +0000 (14:55 -0700)]
Merge tag 'mac80211-next-for-net-next-2020-09-21' of git://git./linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
This time we have:
 * some AP-side infrastructure for FILS discovery and
   unsolicited probe resonses
 * a major rework of the encapsulation/header conversion
   offload from Felix, to fit better with e.g. AP_VLAN
   interfaces
 * performance fix for VHT A-MPDU size, don't limit to HT
 * some initial patches for S1G (sub 1 GHz) support, more
   will come in this area
 * minor cleanups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'mac80211-for-net-2020-09-21' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Mon, 21 Sep 2020 21:54:35 +0000 (14:54 -0700)]
Merge tag 'mac80211-for-net-2020-09-21' of git://git./linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Just a few fixes:
 * fix using HE on 2.4 GHz
 * AQL (airtime queue limit) estimation & VHT160 fix
 * do not oversize A-MPDUs if local capability is smaller than peer's
 * fix radiotap on 6 GHz to not put 2.4 GHz flag
 * fix Kconfig for lib80211
 * little fixlet for 6 GHz channel number / frequency conversion
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: natsemi: Remove set but not used variable
Zheng Yongjun [Mon, 21 Sep 2020 12:18:41 +0000 (20:18 +0800)]
net: natsemi: Remove set but not used variable

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/natsemi/ns83820.c: In function ns83820_get_link_ksettings:
drivers/net/ethernet/natsemi/ns83820.c:1210:11: warning: variable ‘tanar’ set but not used [-Wunused-but-set-variable]

`tanar` is never used, so remove it.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoipv6: route: convert comma to semicolon
Xu Wang [Mon, 21 Sep 2020 06:38:56 +0000 (06:38 +0000)]
ipv6: route: convert comma to semicolon

Replace a comma between expression statements by a semicolon.

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: unix: remove redundant assignment to variable 'err'
Jing Xiangfeng [Mon, 21 Sep 2020 03:29:52 +0000 (11:29 +0800)]
net: unix: remove redundant assignment to variable 'err'

After commit 37ab4fa7844a ("net: unix: allow bind to fail on mutex lock"),
the assignment to err is redundant. So remove it.

Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: realtek: enable ALDPS to save power for RTL8211F
Jisheng Zhang [Mon, 21 Sep 2020 01:13:54 +0000 (09:13 +0800)]
net: phy: realtek: enable ALDPS to save power for RTL8211F

Enable ALDPS(Advanced Link Down Power Saving) to save power when
link down.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: rtl8366rb: Support all 4096 VLANs
Linus Walleij [Sun, 20 Sep 2020 20:37:33 +0000 (22:37 +0200)]
net: dsa: rtl8366rb: Support all 4096 VLANs

There is an off-by-one error in rtl8366rb_is_vlan_valid()
making VLANs 0..4094 valid while it should be 1..4095.
Fix it.

Fixes: d8652956cf37 ("net: dsa: realtek-smi: Add Realtek SMI driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: mt7530: Add some return-value checks
Alex Dewar [Sat, 19 Sep 2020 19:28:10 +0000 (20:28 +0100)]
net: dsa: mt7530: Add some return-value checks

In mt7531_cpu_port_config(), if the variable port is neither 5 nor 6,
then variable interface will be used uninitialised. Change the function
to return -EINVAL in this case.

As the return value of mt7531_cpu_port_config() is never checked
(even though it returns an int) add a check in the correct place so that
the error can be passed up the call stack. Now that we correctly handle
errors thrown in this function, also check the return value of
mt7531_mac_config() in case an error occurs here. Also add misisng
checks to mt7530_setup() and mt7531_setup(), which are another level
further up the call stack.

Fixes: c288575f7810 ("net: dsa: mt7530: Add the support of MT7531 switch")
Addresses-Coverity: 1496993 ("Uninitialized variables")
Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet-sysfs: add backlog len and CPU id to softnet data
Paolo Abeni [Fri, 18 Sep 2020 16:00:46 +0000 (18:00 +0200)]
net-sysfs: add backlog len and CPU id to softnet data

Currently the backlog status in not exposed to user-space.
Since long backlogs (or simply not empty ones) can be a
source of noticeable latency, -RT deployments need some way
to inspect it.

Additionally, there isn't a direct match between 'softnet_stat'
lines and the related CPU - sd for offline CPUs are not dumped -
so this patch also includes the CPU id into such entry.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosfc: Fix error code in probe
Dan Carpenter [Fri, 18 Sep 2020 14:33:11 +0000 (17:33 +0300)]
sfc: Fix error code in probe

This failure path should return a negative error code but it currently
returns success.

Fixes: 51b35a454efd ("sfc: skeleton EF100 PF driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'Update-license-and-polish-ENA-driver-code'
David S. Miller [Mon, 21 Sep 2020 20:54:23 +0000 (13:54 -0700)]
Merge branch 'Update-license-and-polish-ENA-driver-code'

Shay Agroskin says:

====================
Update license and polish ENA driver code

This series adds the following:
- Change driver's license into SPDX format
- Capitalize all log prints in ENA driver
- Fix issues raised by static checkers
- Improve code readability by adding functions, fix spelling
  mistakes etc.
- Update driver's documentation

Changed from previous version:
v1->v2: dropped patch that transforms pr_* log prints into dev_* prints
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ena: update ena documentation
Shay Agroskin [Mon, 21 Sep 2020 08:37:42 +0000 (11:37 +0300)]
net: ena: update ena documentation

The PCI vendor IDs in the documentation inaccurately describe the ENA
devices. For example, the 1d0f:ec20 can have LLQ support. The driver
loads in LLQ mode by default, and a message is printed to the kernel
ring if the mode isn't supported by the device, so the device table
isn't needed.

Also, LLQ can support various entry sizes, so the documentation is
updated to reflect that.

Interrupt moderation description is also updated to be more accurate.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ena: Fix all static chekers' warnings
Shay Agroskin [Mon, 21 Sep 2020 08:37:41 +0000 (11:37 +0300)]
net: ena: Fix all static chekers' warnings

After running Sparse checker on the driver using
    make C=1 M=drivers/net/ethernet/amazon/ena

the only error that is thrown is:
    sparse: sparse: Using plain integer as NULL pointer
about the line
    struct ena_calc_queue_size_ctx calc_queue_ctx = { 0 };

This patch fixes this warning, thus making our driver free (for now) of
Sparse errors/warnings.

To make a more complete work, this patch also fixes all static warnings
that were found using an internal static checker.

Signed-off-by: Ido Segev <idose@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ena: Change RSS related macros and variables names
Shay Agroskin [Mon, 21 Sep 2020 08:37:40 +0000 (11:37 +0300)]
net: ena: Change RSS related macros and variables names

The formal name changes to "ENA_ADMIN_RSS_INDIRECTION_TABLE_CONFIG".
Indirection is the ability to reference "something" using "something else"
instead of the value itself.
Indirection table, as the name implies, is the ability to reference
CPU/Queue value using hash-to-CPU table instead of CPU/Queue itself.

This patch renames the variable keys_num, which describes the number of
words in the RSS hash key, to key_parts which makes its purpose clearer
in RSS context.

Signed-off-by: Amit Bernstein <amitbern@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ena: Remove redundant print of placement policy
Shay Agroskin [Mon, 21 Sep 2020 08:37:39 +0000 (11:37 +0300)]
net: ena: Remove redundant print of placement policy

The placement policy is printed in the process of queue creation in
ena_up(). No need to print it in ena_probe().

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ena: Capitalize all log strings and improve code readability
Shay Agroskin [Mon, 21 Sep 2020 08:37:38 +0000 (11:37 +0300)]
net: ena: Capitalize all log strings and improve code readability

Capitalize all log strings printed by the ena driver to make their
format uniform across it.

Also fix indentation, spelling mistakes and comments to improve code
readability. This also includes adding comments to macros/enums whose
purpose might be difficult to understand.
Separate some code into functions to make it easier to understand the
purpose of these lines.

Signed-off-by: Amit Bernstein <amitbern@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ena: Change log message to netif/dev function
Shay Agroskin [Mon, 21 Sep 2020 08:37:37 +0000 (11:37 +0300)]
net: ena: Change log message to netif/dev function

Make log prints in ena_netdev use the same log functions as the rest of
the driver.

For the sake of consistency, all prints in ena_netdev file were
converted into netif_* format except where netdev struct isn't yet
defined. For these places, dev_* log functions are used (similar to
the patch for ena_com files).

This commit leaves some corner cases which would be changed in a
future patch.

Signed-off-by: Amit Bernstein <amitbern@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ena: Change license into format to SPDX in all files
Shay Agroskin [Mon, 21 Sep 2020 08:37:36 +0000 (11:37 +0300)]
net: ena: Change license into format to SPDX in all files

All ena files should now use SPDX format in their license string. This
doesn't change the license of the files, but rather states the same
license in fewer words.

Also update the license years in some of the files.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agochelsio: simplify the return expression of t3_ael2020_phy_prep()
Qinglang Miao [Mon, 21 Sep 2020 13:10:05 +0000 (21:10 +0800)]
chelsio: simplify the return expression of t3_ael2020_phy_prep()

Simplify the return expression.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoconnector: simplify the return expression of cn_add_callback()
Qinglang Miao [Mon, 21 Sep 2020 13:10:06 +0000 (21:10 +0800)]
connector: simplify the return expression of cn_add_callback()

Simplify the return expression.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoenetc: simplify the return expression of enetc_vf_set_mac_addr()
Qinglang Miao [Mon, 21 Sep 2020 13:10:26 +0000 (21:10 +0800)]
enetc: simplify the return expression of enetc_vf_set_mac_addr()

Simplify the return expression.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoice: simplify the return expression of ice_finalize_update()
Qinglang Miao [Mon, 21 Sep 2020 13:10:34 +0000 (21:10 +0800)]
ice: simplify the return expression of ice_finalize_update()

Simplify the return expression.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: spectrum_router: simplify the return expression of __mlxsw_sp_router_init()
Qinglang Miao [Mon, 21 Sep 2020 13:10:41 +0000 (21:10 +0800)]
mlxsw: spectrum_router: simplify the return expression of __mlxsw_sp_router_init()

Simplify the return expression.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: simplify the return expression of hclgevf_client_start()
Qinglang Miao [Mon, 21 Sep 2020 13:10:43 +0000 (21:10 +0800)]
net: hns3: simplify the return expression of hclgevf_client_start()

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: qlcnic: simplify the return expression of qlcnic_83xx_shutdown
Qinglang Miao [Mon, 21 Sep 2020 13:10:46 +0000 (21:10 +0800)]
net: qlcnic: simplify the return expression of qlcnic_83xx_shutdown

Simplify the return expression.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck...
Linus Torvalds [Mon, 21 Sep 2020 19:42:31 +0000 (12:42 -0700)]
Merge branch 'rcu/urgent' of git://git./linux/kernel/git/paulmck/linux-rcu

Pull RCU fix from Paul McKenney:
 "This contains a single commit that fixes a bug that was introduced in
  the last merge window. This bug causes a compiler warning complaining
  about show_rcu_tasks_classic_gp_kthread() being an unused static
  function in !SMP kernels.

  The fix is straightforward, just adding an 'inline' to make this a
  static inline function, thus avoiding the warning.

  This bug was reported by Laurent Pinchart, who would like it fixed
  sooner rather than later"

* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  rcu-tasks: Prevent complaints of unused show_rcu_tasks_classic_gp_kthread()

4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Mon, 21 Sep 2020 15:53:48 +0000 (08:53 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:
   - fix fault on page table writes during instruction fetch

  s390:
   - doc improvement

  x86:
   - The obvious patches are always the ones that turn out to be
     completely broken. /me hangs his head in shame"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  Revert "KVM: Check the allocation of pv cpu mask"
  KVM: arm64: Remove S1PTW check from kvm_vcpu_dabt_iswrite()
  KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch
  docs: kvm: add documentation for KVM_CAP_S390_DIAG318

4 years agoMerge tag 'libnvdimm-fixes-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 21 Sep 2020 15:46:20 +0000 (08:46 -0700)]
Merge tag 'libnvdimm-fixes-5.9-rc7' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fix from Dan Williams:
 "Fix compilation for the new dax_supported() exported helper"

* tag 'libnvdimm-fixes-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAX

4 years agodax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAX
Jan Kara [Mon, 21 Sep 2020 09:33:23 +0000 (11:33 +0200)]
dax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAX

dax_supported() is defined whenever CONFIG_DAX is enabled. So dummy
implementation should be defined only in !CONFIG_DAX case, not in
!CONFIG_FS_DAX case.

Fixes: e2ec51282545 ("dm: Call proper helper to determine dax support")
Cc: <stable@vger.kernel.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
4 years agoio_uring: fix openat/openat2 unified prep handling
Jens Axboe [Sat, 19 Sep 2020 01:36:24 +0000 (19:36 -0600)]
io_uring: fix openat/openat2 unified prep handling

A previous commit unified how we handle prep for these two functions,
but this means that we check the allowed context (SQPOLL, specifically)
later than we should. Move the ring type checking into the two parent
functions, instead of doing it after we've done some setup work.

Fixes: ec65fea5a8d7 ("io_uring: deduplicate io_openat{,2}_prep()")
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoio_uring: mark statx/files_update/epoll_ctl as non-SQPOLL
Jens Axboe [Fri, 18 Sep 2020 22:51:19 +0000 (16:51 -0600)]
io_uring: mark statx/files_update/epoll_ctl as non-SQPOLL

These will naturally fail when attempted through SQPOLL, but either
with -EFAULT or -EBADF. Make it explicit that these are not workable
through SQPOLL and return -EINVAL, just like other ops that need to
use ->files.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agotools/io_uring: fix compile breakage
Douglas Gilbert [Mon, 14 Sep 2020 21:36:09 +0000 (17:36 -0400)]
tools/io_uring: fix compile breakage

It would seem none of the kernel continuous integration does this:
    $ cd tools/io_uring
    $ make

Otherwise it may have noticed:
   cc -Wall -Wextra -g -D_GNU_SOURCE   -c -o io_uring-bench.o
 io_uring-bench.c
io_uring-bench.c:133:12: error: static declaration of ‘gettid’
 follows non-static declaration
  133 | static int gettid(void)
      |            ^~~~~~
In file included from /usr/include/unistd.h:1170,
                 from io_uring-bench.c:27:
/usr/include/x86_64-linux-gnu/bits/unistd_ext.h:34:16: note:
 previous declaration of ‘gettid’ was here
   34 | extern __pid_t gettid (void) __THROW;
      |                ^~~~~~
make: *** [<builtin>: io_uring-bench.o] Error 1

The problem on Ubuntu 20.04 (with lk 5.9.0-rc5) is that unistd.h
already defines gettid(). So prefix the local definition with
"lk_".

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoio_uring: don't use retry based buffered reads for non-async bdev
Jens Axboe [Mon, 14 Sep 2020 15:30:38 +0000 (09:30 -0600)]
io_uring: don't use retry based buffered reads for non-async bdev

Some block devices, like dm, bubble back -EAGAIN through the completion
handler. We check for this in io_read(), but don't honor it for when
we have copied the iov. Return -EAGAIN for this case before retrying,
to force punt to io-wq.

Fixes: bcf5a06304d6 ("io_uring: support true async buffered reads, if file provides it")
Reported-by: Zorro Lang <zlang@redhat.com>
Tested-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoio_uring: don't re-setup vecs/iter in io_resumit_prep() is already there
Jens Axboe [Mon, 14 Sep 2020 15:28:14 +0000 (09:28 -0600)]
io_uring: don't re-setup vecs/iter in io_resumit_prep() is already there

If we already have mapped the necessary data for retry, then don't set
it up again. It's a pointless operation, and we leak the iovec if it's
a large (non-stack) vec.

Fixes: b63534c41e20 ("io_uring: re-issue block requests that failed because of resources")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoMAINTAINERS: Add reviewer entry for microchip mcp25xxfd SPI-CAN network driver
Thomas Kopp [Fri, 18 Sep 2020 17:25:36 +0000 (19:25 +0200)]
MAINTAINERS: Add reviewer entry for microchip mcp25xxfd SPI-CAN network driver

This patch adds Thomas Kopp as a reviewer for the mcp25xxfd CAN driver.

Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com>
Link: https://lore.kernel.org/r/20200916101334.1277-1-thomas.kopp@microchip.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agoMAINTAINERS: Add entry for Microchip MCP25XXFD SPI-CAN network driver
Manivannan Sadhasivam [Fri, 18 Sep 2020 17:25:35 +0000 (19:25 +0200)]
MAINTAINERS: Add entry for Microchip MCP25XXFD SPI-CAN network driver

Add MAINTAINERS entry for Microchip MCP25XXFD SPI-CAN network driver.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20200910133806.25077-7-manivannan.sadhasivam@linaro.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp25xxfd: add listen-only mode
Kurt Van Dijck [Fri, 18 Sep 2020 17:25:34 +0000 (19:25 +0200)]
can: mcp25xxfd: add listen-only mode

This commit enables listen-only mode, which works internally like CANFD mode.

Signed-off-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20200918172536.2074504-5-mkl@pengutronix.de
4 years agocan: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN
Marc Kleine-Budde [Fri, 18 Sep 2020 17:25:33 +0000 (19:25 +0200)]
can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN

This patch adds support for the Microchip MCP25xxFD SPI CAN controller family.

Tested-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20200918172536.2074504-4-mkl@pengutronix.de
4 years agocan: mcp25xxfd: add regmap infrastructure
Marc Kleine-Budde [Fri, 18 Sep 2020 17:25:32 +0000 (19:25 +0200)]
can: mcp25xxfd: add regmap infrastructure

This patch adds the regmap infrastructure for the Microchip MCP25xxFD SPI CAN
controller family. The actual driver is added in the next commit.

Tested-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20200918172536.2074504-3-mkl@pengutronix.de
4 years agodt-binding: can: mcp25xxfd: document device tree bindings
Oleksij Rempel [Fri, 18 Sep 2020 17:25:31 +0000 (19:25 +0200)]
dt-binding: can: mcp25xxfd: document device tree bindings

This patch adds the device-tree binding documentation for the Microchip
MCP25xxFD SPI CAN controller family.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20200918172536.2074504-2-mkl@pengutronix.de
4 years agocan: rx-offload: can_rx_offload_add_manual(): add new initialization function
Marc Kleine-Budde [Tue, 15 Sep 2020 22:35:22 +0000 (00:35 +0200)]
can: rx-offload: can_rx_offload_add_manual(): add new initialization function

This patch adds a new initialization function:
can_rx_offload_add_manual()

It should be used to add support rx-offload to a driver, if the callback
mechanism should not be used. Use e.g. can_rx_offload_queue_sorted() to queue
skbs into rx-offload.

Link: https://lore.kernel.org/r/20200915223527.1417033-33-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mscan: simplify clock enable/disable
Zhang Changzhong [Fri, 17 Jul 2020 08:01:15 +0000 (16:01 +0800)]
can: mscan: simplify clock enable/disable

All the NULL checks are pointless, clk_*() routines already deal with
NULL just fine.

Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Link: https://lore.kernel.org/r/1594972875-27631-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mscan: mpc5xxx_can: update contact email
Wolfram Sang [Sat, 2 May 2020 14:26:56 +0000 (16:26 +0200)]
can: mscan: mpc5xxx_can: update contact email

The 'pengutronix' address is defunct for years. Use the proper contact
address.

Signed-off-by: Wolfram Sang <wsa@kernel.org>
Link: https://lore.kernel.org/r/20200502142657.19199-1-wsa@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp251x: add support for half duplex controllers
Tim Harvey [Thu, 23 Jul 2020 14:57:55 +0000 (07:57 -0700)]
can: mcp251x: add support for half duplex controllers

Some SPI host controllers do not support full-duplex SPI and are
marked as such via the SPI_CONTROLLER_HALF_DUPLEX controller flag.

For such controllers use half duplex transactions but retain full
duplex transactions for the controllers that can handle those.

Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Link: https://lore.kernel.org/r/1595516275-1179-1-git-send-email-tharvey@gateworks.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp251x: Use readx_poll_timeout() helper
Andy Shevchenko [Mon, 17 Feb 2020 16:10:38 +0000 (18:10 +0200)]
can: mcp251x: Use readx_poll_timeout() helper

We may use special helper macro to poll IO till condition or timeout occurs.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200217161038.25009-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp251x: add GPIO support
Timo Schlüßler [Fri, 11 Oct 2019 13:38:21 +0000 (15:38 +0200)]
can: mcp251x: add GPIO support

The mcp251x variants feature 3 general purpose digital inputs and 2
outputs. With this patch they are accessible through the gpio framework.

Signed-off-by: Timo Schlüßler <schluessler@krause.de>
Tested-by: Timo Schlüßler <schluessler@krause.de>
Link: https://lore.kernel.org/r/20200915223527.1417033-28-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp251x: sort include files alphabetically
Marc Kleine-Budde [Tue, 15 Sep 2020 22:35:16 +0000 (00:35 +0200)]
can: mcp251x: sort include files alphabetically

This patch sorts the include files alphabetically.

Link: https://lore.kernel.org/r/20200915223527.1417033-27-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agodt-bindings: can: mcp251x: document GPIO support
Marc Kleine-Budde [Tue, 15 Sep 2020 22:35:15 +0000 (00:35 +0200)]
dt-bindings: can: mcp251x: document GPIO support

The next patch adds gpio controller support to the mcp251x driver. This
patch updates the binding accordingly.

Cc: Timo Schlüßler <schluessler@krause.de>
Link: https://lore.kernel.org/r/20200915223527.1417033-26-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agodt-bindings: can: mcp251x: change example interrupt type to IRQ_TYPE_LEVEL_LOW
Marc Kleine-Budde [Tue, 15 Sep 2020 22:35:14 +0000 (00:35 +0200)]
dt-bindings: can: mcp251x: change example interrupt type to IRQ_TYPE_LEVEL_LOW

The MCP2515 datasheet clearly describes a level-triggered interrupt pin.
Change example bindings accordingly.

Link: https://lore.kernel.org/r/20200915223527.1417033-25-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: spi: Kconfig: remove unneeded dependencies form Kconfig symbols
Marc Kleine-Budde [Tue, 15 Sep 2020 22:35:13 +0000 (00:35 +0200)]
can: spi: Kconfig: remove unneeded dependencies form Kconfig symbols

Since commits

    653ee35ce6d5 ("can: hi311x: remove custom DMA mapped buffer")
Fixes: df58525df395 ("can: mcp251x: remove custom DMA mapped buffer")
both the hi3111x and the mcp251x driver don't make use of the DMA API
any more. So we can safely remove the HAS_DMA dependency.

While we're here, remove the unneeded CAN_DEV and SPI dependencies from
the CAN_HI311X symbol, as the parent menus already have these
dependencies.

Link: https://lore.kernel.org/r/20200915223527.1417033-24-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>