platform/kernel/linux-rpi.git
5 years agoiavf: finish renaming files to iavf
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:56 +0000 (17:37 -0700)]
iavf: finish renaming files to iavf

This finishes the process of renaming the files that
make sense to rename (skipping adminq related files that
talk to i40e), and fixes up the build and the #includes
so that everything builds nicely.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: rename most of i40e strings
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:55 +0000 (17:37 -0700)]
iavf: rename most of i40e strings

This is the big rename patch, it takes most of the i40e_
and I40E_ strings and renames them to iavf_ and IAVF_.

Some of the adminq code, as well as most of the client
interface code used by RDMA is left unchanged in order
to indicate that the driver is talking to non-internal to
iavf code.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: tracing infrastructure rename
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:54 +0000 (17:37 -0700)]
iavf: tracing infrastructure rename

Rename the i40e_trace file and fix up all the callers
to the new names inside the iavf_trace.h file.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: replace i40e_debug with iavf version
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:53 +0000 (17:37 -0700)]
iavf: replace i40e_debug with iavf version

Change another string (i40e_debug)

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: rename i40e_hw to iavf_hw
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:52 +0000 (17:37 -0700)]
iavf: rename i40e_hw to iavf_hw

Fix up the i40e_hw names to new name, including versions
inside other strings.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: rename I40E_ADMINQ_DESC
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:51 +0000 (17:37 -0700)]
iavf: rename I40E_ADMINQ_DESC

Take care of some renames containing I40E_ADMINQ_DESC.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: rename device ID defines
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:50 +0000 (17:37 -0700)]
iavf: rename device ID defines

Rename the device ID defines to have IAVF in them
and remove all the unused defines.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: remove references to old names
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:49 +0000 (17:37 -0700)]
iavf: remove references to old names

Remove the register name references to I40E_VF* and change to
IAVF_VF. Update the descriptor names and defines to the IAVF
name.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: move i40evf files to new name
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:48 +0000 (17:37 -0700)]
iavf: move i40evf files to new name

Simply move the i40evf files to the new name, updating the #includes
to track the new names, and updating the Makefile as well.

A future patch will remove the i40e references (after the code
removal patches later in this series).

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: rename i40e_status to iavf_status
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:47 +0000 (17:37 -0700)]
iavf: rename i40e_status to iavf_status

This is just a rename of an internal variable i40e_status, but
it was a pretty big change and so deserved it's own patch.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: rename functions and structs to new name
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:46 +0000 (17:37 -0700)]
iavf: rename functions and structs to new name

This basically begins the internal portion of the rename of i40evf to iavf,
by renaming many of the functions, structs, variables and defines.

Most of the changes were made mechanically, which introduces some
alignment issues.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: diet and reformat
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:45 +0000 (17:37 -0700)]
iavf: diet and reformat

Remove a bunch of unused code and reformat a few lines. Also
remove some now un-necessary files.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agointel-ethernet: rename i40evf to iavf
Jesse Brandeburg [Sat, 15 Sep 2018 00:37:44 +0000 (17:37 -0700)]
intel-ethernet: rename i40evf to iavf

Rename the Intel Ethernet Adaptive Virtual Function driver
(i40evf) to a new name (iavf) that is more consistent with
the ongoing maintenance of the driver as the universal VF driver
for multiple product lines.

This first patch fixes up the directory names and the .ko name,
intentionally ignoring the function names inside the driver
for now.  Basically this is the simplest patch that gets
the rename done and will be followed by other patches that
rename the internal functions.

This patch also addresses a couple of string/name issues
and updates the Copyright year.

Also, made sure to add a MODULE_ALIAS to the old name.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agonet: mdio: remove duplicated include from mdio_bus.c
YueHaibing [Tue, 18 Sep 2018 02:48:41 +0000 (10:48 +0800)]
net: mdio: remove duplicated include from mdio_bus.c

Remove duplicated include linux/gpio/consumer.h

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: remove duplicated include from qed_cxt.c
YueHaibing [Tue, 18 Sep 2018 02:46:27 +0000 (10:46 +0800)]
qed: remove duplicated include from qed_cxt.c

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoliquidio: remove duplicated include from lio_vf_rep.c
YueHaibing [Tue, 18 Sep 2018 02:43:43 +0000 (10:43 +0800)]
liquidio: remove duplicated include from lio_vf_rep.c

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: remove duplicated include from cxgb4_main.c
YueHaibing [Tue, 18 Sep 2018 02:41:28 +0000 (10:41 +0800)]
cxgb4: remove duplicated include from cxgb4_main.c

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agogianfar: remove duplicated include from gianfar.c
YueHaibing [Tue, 18 Sep 2018 02:17:18 +0000 (10:17 +0800)]
gianfar: remove duplicated include from gianfar.c

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/ipv4: defensive cipso option parsing
Stefan Nuernberger [Mon, 17 Sep 2018 17:46:53 +0000 (19:46 +0200)]
net/ipv4: defensive cipso option parsing

commit 40413955ee26 ("Cipso: cipso_v4_optptr enter infinite loop") fixed
a possible infinite loop in the IP option parsing of CIPSO. The fix
assumes that ip_options_compile filtered out all zero length options and
that no other one-byte options beside IPOPT_END and IPOPT_NOOP exist.
While this assumption currently holds true, add explicit checks for zero
length and invalid length options to be safe for the future. Even though
ip_options_compile should have validated the options, the introduction of
new one-byte options can still confuse this code without the additional
checks.

Signed-off-by: Stefan Nuernberger <snu@amazon.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Simon Veith <sveith@amazon.de>
Cc: stable@vger.kernel.org
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: socionext: Fix two sleep-in-atomic-context bugs in ave_rxfifo_reset()
Jia-Ju Bai [Sat, 15 Sep 2018 04:02:46 +0000 (12:02 +0800)]
net: socionext: Fix two sleep-in-atomic-context bugs in ave_rxfifo_reset()

The driver may sleep with holding a spinlock.
The function call paths (from bottom to top) in Linux-4.17 are:

[FUNC] usleep_range
drivers/net/ethernet/socionext/sni_ave.c, 892:
usleep_range in ave_rxfifo_reset
drivers/net/ethernet/socionext/sni_ave.c, 932:
ave_rxfifo_reset in ave_irq_handler

[FUNC] usleep_range
drivers/net/ethernet/socionext/sni_ave.c, 888:
usleep_range in ave_rxfifo_reset
drivers/net/ethernet/socionext/sni_ave.c, 932:
ave_rxfifo_reset in ave_irq_handler

To fix these bugs, usleep_range() is replaced with udelay().

These bugs are found by my static analysis tool DSAC.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: caif: remove redundant null check on frontpkt
Colin Ian King [Fri, 14 Sep 2018 17:19:16 +0000 (18:19 +0100)]
net: caif: remove redundant null check on frontpkt

It is impossible for frontpkt to be null at the point of the null
check because it has been assigned from rearpkt and there is no
way rearpkt can be null at the point of the assignment because
of the sanity checking and exit paths taken previously. Remove
the redundant null check.

Detected by CoverityScan, CID#114434 ("Logically dead code")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 's390-qeth-next'
David S. Miller [Mon, 17 Sep 2018 16:10:26 +0000 (09:10 -0700)]
Merge branch 's390-qeth-next'

Julian Wiedmann says:

====================
s390/qeth: updates 2018-09-17

please apply the following patchset to net-next. This brings more restructuring
of qeth's transmit code (eliminating its last usage of skb_realloc_headroom()),
and the usual mix of minor improvements & cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: reduce 0-initializing when building IPA cmds
Julian Wiedmann [Mon, 17 Sep 2018 15:36:09 +0000 (17:36 +0200)]
s390/qeth: reduce 0-initializing when building IPA cmds

qeth_get_ipacmd_buffer() obtains its buffers for building IPA cmds from
__qeth_get_buffer(), where they are fully cleared. So get rid of all the
additional zero-ing in various other places.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: fine-tune spinlocks
Julian Wiedmann [Mon, 17 Sep 2018 15:36:08 +0000 (17:36 +0200)]
s390/qeth: fine-tune spinlocks

For quite a lot of code paths it's obvious that they will never run in
IRQ context. So replace their spin_lock_irqsave() calls with
spin_lock_irq().

While at it, get rid of the redundant card pointer in struct qeth_reply
that was used by qeth_send_control_data() to access the card's lock.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: fix typo in return value
Julian Wiedmann [Mon, 17 Sep 2018 15:36:07 +0000 (17:36 +0200)]
s390/qeth: fix typo in return value

Assuming this was just a typo, as returning an actual negative value
from a cmd callback would make no sense either.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: invoke softirqs after napi_schedule()
Julian Wiedmann [Mon, 17 Sep 2018 15:36:06 +0000 (17:36 +0200)]
s390/qeth: invoke softirqs after napi_schedule()

Calling napi_schedule() from process context does not ensure that the
NET_RX softirq is run in a timely fashion. So trigger it manually.

This is no big issue with current code. A call to ndo_open() is usually
followed by a ndo_set_rx_mode() call, and for qeth this contains a
spin_unlock_bh(). Except for OSN, where qeth_l2_set_rx_mode() bails out
early.
Nevertheless it's best to not depend on this behaviour, and just fix
the issue at its source like all other drivers do. For instance see
commit 83a0c6e58901 ("i40e: Invoke softirqs after napi_reschedule").

Fixes: a1c3ed4c9ca0 ("qeth: NAPI support for l2 and l3 discipline")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: uninstall IRQ handler on device removal
Julian Wiedmann [Mon, 17 Sep 2018 15:36:05 +0000 (17:36 +0200)]
s390/qeth: uninstall IRQ handler on device removal

When setting up, qeth installs its IRQ handler on the ccw devices. But
the IRQ handler is not cleared on removal - so even after qeth yields
control of the ccw devices, spurious interrupts would still be presented
to us.

Make (de-)installation of the IRQ handler part of the ccw channel
setup/removal helpers, and while at it also add the appropriate locking.
Shift around qeth_setup_channel() to avoid a forward declaration for
qeth_irq().

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: remove qeth_hdr_chk_and_bounce()
Julian Wiedmann [Mon, 17 Sep 2018 15:36:04 +0000 (17:36 +0200)]
s390/qeth: remove qeth_hdr_chk_and_bounce()

Restructure the OSN xmit path to handle misaligned HW headers properly,
without shifting the packet data around.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: speed up TSO transmission
Julian Wiedmann [Mon, 17 Sep 2018 15:36:03 +0000 (17:36 +0200)]
s390/qeth: speed up TSO transmission

Switch TSO over to the faster transmit path, and remove all the unused
old TSO code.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: prepare for copy-free TSO transmission
Julian Wiedmann [Mon, 17 Sep 2018 15:36:02 +0000 (17:36 +0200)]
s390/qeth: prepare for copy-free TSO transmission

Add all the necessary TSO plumbing to the copy-less transmit path.
This includes calculating the right length of required protocol headers,
and always building a separate buffer element for the TSO headers.

A follow-up patch will then switch TSO traffic over to this path.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: check size of required HW header cache object
Julian Wiedmann [Mon, 17 Sep 2018 15:36:01 +0000 (17:36 +0200)]
s390/qeth: check size of required HW header cache object

When qeth_add_hw_header() falls back to the header cache, ensure that
the requested length doesn't exceed the object size.

For current usage this is a no-brainer, but TSO transmission will
introduce protocol headers of varying length.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: fix up protocol headers early
Julian Wiedmann [Mon, 17 Sep 2018 15:36:00 +0000 (17:36 +0200)]
s390/qeth: fix up protocol headers early

When qeth_add_hw_header() falls back to the HW header cache, it also
copies over the necessary protocol headers. Thus any manipulation to
the protocol headers needs to happen before adding the HW header.

For current usage this doesn't matter, but it becomes relevant when
moving TSO transmission over to the faster code path.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: limit csum offload erratum to L3 devices
Julian Wiedmann [Mon, 17 Sep 2018 15:35:59 +0000 (17:35 +0200)]
s390/qeth: limit csum offload erratum to L3 devices

Combined L3+L4 csum offload is only required for some L3 HW. So for
L2 devices, don't offload the IP header csum calculation.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reference-ID: JUP 394553
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: remove qeth_get_elements_no()
Julian Wiedmann [Mon, 17 Sep 2018 15:35:58 +0000 (17:35 +0200)]
s390/qeth: remove qeth_get_elements_no()

Convert the last remaining user of qeth_get_elements_no() to
qeth_count_elements(), so this helper can be removed.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: remove unused L3 xmit code
Julian Wiedmann [Mon, 17 Sep 2018 15:35:57 +0000 (17:35 +0200)]
s390/qeth: remove unused L3 xmit code

qeth_l3_xmit() is now only used for TSOv4 traffic, shrink it down.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: run non-offload L3 traffic over common xmit path
Julian Wiedmann [Mon, 17 Sep 2018 15:35:56 +0000 (17:35 +0200)]
s390/qeth: run non-offload L3 traffic over common xmit path

L3 OSAs can only offload IPv4 traffic, use the common L2 transmit path
for all other traffic.
In particular there's no support for TX VLAN offload, so any such packet
needs to be manually de-accelerated via ndo_features_check().

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: move L2 xmit code to core module
Julian Wiedmann [Mon, 17 Sep 2018 15:35:55 +0000 (17:35 +0200)]
s390/qeth: move L2 xmit code to core module

We need the exact same transmit path for non-offload-eligible traffic on
L3 OSAs. So make it accessible from both sub-drivers.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoliquidio: Add the features to show FEC settings and set FEC settings
Weilin Chang [Mon, 17 Sep 2018 05:43:32 +0000 (22:43 -0700)]
liquidio: Add the features to show FEC settings and set FEC settings

1. Add functions for get_fecparam and set_fecparam.
2. Modify lio_get_link_ksettings to display FEC setting.

Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: remove redundant null pointer check before of_node_put
zhong jiang [Sun, 16 Sep 2018 13:13:42 +0000 (21:13 +0800)]
net: ethernet: remove redundant null pointer check before of_node_put

of_node_put has taken the null pointer check into account. So it is
safe to remove the duplicated check before of_node_put.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: remove redundant null pointer check before put_device
zhong jiang [Sun, 16 Sep 2018 13:45:02 +0000 (21:45 +0800)]
net: dsa: remove redundant null pointer check before put_device

put_device has taken the null pinter check into account. So it is
safe to remove the duplicated check before put_device.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: remove redundant null pointer check before of_node_put
zhong jiang [Sun, 16 Sep 2018 13:22:31 +0000 (21:22 +0800)]
net: dsa: remove redundant null pointer check before of_node_put

of_node_put has taken the null pointer check into account. So it is
safe to remove the duplicated check before of_node_put.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: usb: remove redundant null pointer check before of_node_put
zhong jiang [Sun, 16 Sep 2018 13:20:17 +0000 (21:20 +0800)]
net: usb: remove redundant null pointer check before of_node_put

of_node_put has taken the null pointer check into account. So it is
safe to remove the duplicated check before of_node_put.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: rds: use memset to optimize the recv
Zhu Yanjun [Mon, 17 Sep 2018 02:49:30 +0000 (22:49 -0400)]
net: rds: use memset to optimize the recv

The function rds_inc_init is in recv process. To use memset can optimize
the function rds_inc_init.
The test result:

     Before:
     1) + 24.950 us   |        rds_inc_init [rds]();
     After:
     1) + 10.990 us   |        rds_inc_init [rds]();

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests/tls: Add MSG_WAITALL in recv() syscall
Vakul Garg [Sun, 16 Sep 2018 04:34:28 +0000 (10:04 +0530)]
selftests/tls: Add MSG_WAITALL in recv() syscall

A number of tls selftests rely upon recv() to return an exact number of
data bytes. When tls record crypto is done using an async accelerator,
it is possible that recv() returns lesser than expected number bytes.
This leads to failure of many test cases. To fix it, MSG_WAITALL has
been used in flags passed to recv() syscall.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: aquantia: memory corruption on jumbo frames
Friedemann Gerold [Sat, 15 Sep 2018 15:03:39 +0000 (18:03 +0300)]
net: aquantia: memory corruption on jumbo frames

This patch fixes skb_shared area, which will be corrupted
upon reception of 4K jumbo packets.

Originally build_skb usage purpose was to reuse page for skb to eliminate
needs of extra fragments. But that logic does not take into account that
skb_shared_info should be reserved at the end of skb data area.

In case packet data consumes all the page (4K), skb_shinfo location
overflows the page. As a consequence, __build_skb zeroed shinfo data above
the allocated page, corrupting next page.

The issue is rarely seen in real life because jumbo are normally larger
than 4K and that causes another code path to trigger.
But it 100% reproducible with simple scapy packet, like:

    sendp(IP(dst="192.168.100.3") / TCP(dport=443) \
          / Raw(RandString(size=(4096-40))), iface="enp1s0")

Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code")
Reported-by: Friedemann Gerold <f.gerold@b-c-s.de>
Reported-by: Michael Rauch <michael@rauch.be>
Signed-off-by: Friedemann Gerold <f.gerold@b-c-s.de>
Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'lantiq-Minor-fixes-for-vrx200-and-gswip'
David S. Miller [Mon, 17 Sep 2018 15:12:12 +0000 (08:12 -0700)]
Merge branch 'lantiq-Minor-fixes-for-vrx200-and-gswip'

Hauke Mehrtens says:

====================
net: lantiq: Minor fixes for vrx200 and gswip

These are mostly minor fixes to problems addresses in the latests round
of the review of the original series adding these driver, which were not
applied before the patches got merged into net-next.
In addition it fixes a data bus error on poweroff.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_gswip: Add gswip to dsa_tag_protocol_to_str()
Hauke Mehrtens [Sat, 15 Sep 2018 12:08:49 +0000 (14:08 +0200)]
net: dsa: tag_gswip: Add gswip to dsa_tag_protocol_to_str()

The gswip tag was missing in the dsa_tag_protocol_to_str() function, add it.

Fixes: 7969119293f5 ("net: dsa: Add Lantiq / Intel GSWIP tag support")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: lantiq_gswip: Minor code style improvements
Hauke Mehrtens [Sat, 15 Sep 2018 12:08:48 +0000 (14:08 +0200)]
net: dsa: lantiq_gswip: Minor code style improvements

Use one code block when returning because the interface type is
unsupported and also check if some unsupported port gets configured.
In addition fix a double the and use dsa_is_cpu_port() instated of
manually getting the CPU port.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: lantiq: lantiq_xrx200: Move clock prepare to probe function
Hauke Mehrtens [Sat, 15 Sep 2018 12:08:47 +0000 (14:08 +0200)]
net: lantiq: lantiq_xrx200: Move clock prepare to probe function

The switch and the MAC are in one IP core and they use the same clock
signal from the clock generation unit.
Currently the clock architecture in the lantiq SoC code does not allow
to easily share the same clocks, this has to be fixed by switching to
the common clock framework.
As a workaround the clock of the switch and MAC should be activated when
the MAC gets probed and only disabled when the MAC gets removed. This
way it is ensured that the clock is always enabled when the switch or
MAC is used. The switch can not be used without the MAC.

This fixes a data bus error when rebooting the system and deactivating
the switch and mac and later accessing some registers in the cleanup
while the clocks are disabled.

Fixes: fe1a56420cf2 ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: net: dsa: lantiq, xrx200-gswip: Fix minor style fixes
Hauke Mehrtens [Sat, 15 Sep 2018 12:08:46 +0000 (14:08 +0200)]
dt-bindings: net: dsa: lantiq, xrx200-gswip: Fix minor style fixes

* Use one compatible line per line in the documentation
* Remove SoC revision depended compatible lines, we can detect that in
  the driver
* Use lower case letters in hex addresses
* Fix the size of the address ranges in the example, this now matches
  the sizes used by the SoC. The old ones will also work, this just adds
  some empty address space.
* Change the reg size of the gphy-fw node

Fixes: 86ce2bc73c7a ("dt-bindings: net: dsa: Add lantiq, xrx200-gswip DT bindings")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: devicetree@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: net: lantiq, xrx200-net: Use lower case in hex
Hauke Mehrtens [Sat, 15 Sep 2018 12:08:45 +0000 (14:08 +0200)]
dt-bindings: net: lantiq, xrx200-net: Use lower case in hex

Use lower case letters in the addresses of the device tree binding.
In addition replace eth with ethernet and fix the size of the reg
element in the example. The additional range does not contain any
registers but is used for the IP block on the this SoC.

Fixes: 839790e88a3c ("dt-bindings: net: Add lantiq, xrx200-net DT bindings")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: devicetree@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns: make function hns_gmac_wait_fifo_clean() static
Wei Yongjun [Sat, 15 Sep 2018 01:42:09 +0000 (01:42 +0000)]
net: hns: make function hns_gmac_wait_fifo_clean() static

Fixes the following sparse warning:

drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c:322:5: warning:
 symbol 'hns_gmac_wait_fifo_clean' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: lantiq: Fix return value check in xrx200_probe()
Wei Yongjun [Sat, 15 Sep 2018 01:33:50 +0000 (01:33 +0000)]
net: lantiq: Fix return value check in xrx200_probe()

In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().

Fixes: fe1a56420cf2 ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: gswip: Fix copy-paste error in gswip_gphy_fw_probe()
Wei Yongjun [Sat, 15 Sep 2018 01:33:38 +0000 (01:33 +0000)]
net: dsa: gswip: Fix copy-paste error in gswip_gphy_fw_probe()

The return value from of_reset_control_array_get_exclusive() is not
checked correctly. The test is done against a wrong variable. This
patch fix it.

Fixes: 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: gswip: Fix return value check in gswip_probe()
Wei Yongjun [Sat, 15 Sep 2018 01:33:21 +0000 (01:33 +0000)]
net: dsa: gswip: Fix return value check in gswip_probe()

In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().

Fixes: 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotls: async support causes out-of-bounds access in crypto APIs
John Fastabend [Fri, 14 Sep 2018 20:01:46 +0000 (13:01 -0700)]
tls: async support causes out-of-bounds access in crypto APIs

When async support was added it needed to access the sk from the async
callback to report errors up the stack. The patch tried to use space
after the aead request struct by directly setting the reqsize field in
aead_request. This is an internal field that should not be used
outside the crypto APIs. It is used by the crypto code to define extra
space for private structures used in the crypto context. Users of the
API then use crypto_aead_reqsize() and add the returned amount of
bytes to the end of the request memory allocation before posting the
request to encrypt/decrypt APIs.

So this breaks (with general protection fault and KASAN error, if
enabled) because the request sent to decrypt is shorter than required
causing the crypto API out-of-bounds errors. Also it seems unlikely the
sk is even valid by the time it gets to the callback because of memset
in crypto layer.

Anyways, fix this by holding the sk in the skb->sk field when the
callback is set up and because the skb is already passed through to
the callback handler via void* we can access it in the handler. Then
in the handler we need to be careful to NULL the pointer again before
kfree_skb. I added comments on both the setup (in tls_do_decryption)
and when we clear it from the crypto callback handler
tls_decrypt_done(). After this selftests pass again and fixes KASAN
errors/warnings.

Fixes: 94524d8fc965 ("net/tls: Add support for async decryption of tls records")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Vakul Garg <Vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoip6_gre: simplify gre header parsing in ip6gre_err
Haishuang Yan [Fri, 14 Sep 2018 04:26:48 +0000 (12:26 +0800)]
ip6_gre: simplify gre header parsing in ip6gre_err

Same as ip_gre, use gre_parse_header to parse gre header in gre error
handler code.

Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoip_gre: fix parsing gre header in ipgre_err
Haishuang Yan [Fri, 14 Sep 2018 04:26:47 +0000 (12:26 +0800)]
ip_gre: fix parsing gre header in ipgre_err

gre_parse_header stops parsing when csum_err is encountered, which means
tpi->key is undefined and ip_tunnel_lookup will return NULL improperly.

This patch introduce a NULL pointer as csum_err parameter. Even when
csum_err is encountered, it won't return error and continue parsing gre
header as expected.

Fixes: 9f57c67c379d ("gre: Remove support for sharing GRE protocol hook.")
Reported-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: et011c: Remove incorrect PHY_POLL flags
Florian Fainelli [Thu, 13 Sep 2018 18:36:30 +0000 (11:36 -0700)]
net: phy: et011c: Remove incorrect PHY_POLL flags

PHY_POLL is defined as -1 which means that we would be setting all flags of the
PHY driver, this is also not a valid flag to tell PHYLIB about, just remove it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'act_police-lockless-data-path'
David S. Miller [Sun, 16 Sep 2018 22:30:23 +0000 (15:30 -0700)]
Merge branch 'act_police-lockless-data-path'

Davide Caratti says:

====================
net/sched: act_police: lockless data path

the data path of 'police' action can be faster if we avoid using spinlocks:
 - patch 1 converts act_police to use per-cpu counters
 - patch 2 lets act_police use RCU to access its configuration data.

test procedure (using pktgen from https://github.com/netoptimizer):
 # ip link add name eth1 type dummy
 # ip link set dev eth1 up
 # tc qdisc add dev eth1 clsact
 # tc filter add dev eth1 egress matchall action police \
 > rate 2gbit burst 100k conform-exceed pass/pass index 100
 # for c in 1 2 4; do
 > ./pktgen_bench_xmit_mode_queue_xmit.sh -v -s 64 -t $c -n 5000000 -i eth1
 > done

test results (avg. pps/thread):

  $c | before patch |  after patch | improvement
 ----+--------------+--------------+-------------
   1 |      3518448 |      3591240 |  irrelevant
   2 |      3070065 |      3383393 |         10%
   4 |      1540969 |      3238385 |        110%
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: act_police: don't use spinlock in the data path
Davide Caratti [Thu, 13 Sep 2018 17:29:13 +0000 (19:29 +0200)]
net/sched: act_police: don't use spinlock in the data path

use RCU instead of spinlocks, to protect concurrent read/write on
act_police configuration. This reduces the effects of contention in the
data path, in case multiple readers are present.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: act_police: use per-cpu counters
Davide Caratti [Thu, 13 Sep 2018 17:29:12 +0000 (19:29 +0200)]
net/sched: act_police: use per-cpu counters

use per-CPU counters, instead of sharing a single set of stats with all
cores. This removes the need of using spinlock when statistics are read
or updated.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: update supported DCB version
Ganesh Goudar [Fri, 14 Sep 2018 12:05:55 +0000 (17:35 +0530)]
cxgb4: update supported DCB version

- In CXGB4_DCB_STATE_FW_INCOMPLETE state check if the dcb
  version is changed and update the dcb supported version.

- Also, fill the priority code point value for priority
  based flow control.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: add per rx-queue counter for packet errors
Ganesh Goudar [Fri, 14 Sep 2018 09:16:04 +0000 (14:46 +0530)]
cxgb4: add per rx-queue counter for packet errors

print per rx-queue packet errors in sge_qinfo

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: Fix endianness issue in t4_fwcache()
Ganesh Goudar [Fri, 14 Sep 2018 09:06:27 +0000 (14:36 +0530)]
cxgb4: Fix endianness issue in t4_fwcache()

Do not put host-endian 0 or 1 into big endian feild.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: move definition of pcpu_lstats to header file
Li RongQing [Fri, 14 Sep 2018 08:00:51 +0000 (16:00 +0800)]
net: move definition of pcpu_lstats to header file

pcpu_lstats is defined in several files, so unify them as one
and move to header file

Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/ibm/emac: Remove VLA usage
Kees Cook [Thu, 13 Sep 2018 21:23:56 +0000 (14:23 -0700)]
net/ibm/emac: Remove VLA usage

In the quest to remove all stack VLA usage from the kernel[1], this
removes the VLA used for the emac xaht registers size. Since the size
of registers can only ever be 4 or 8, as detected in emac_init_config(),
the max can be hardcoded and a runtime test added for robustness.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christian Lamparter <chunkeey@gmail.com>
Cc: Ivan Mikhaylov <ivan@de.ibm.com>
Cc: netdev@vger.kernel.org
Co-developed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agopktgen: Fix fall-through annotation
Gustavo A. R. Silva [Thu, 13 Sep 2018 19:03:20 +0000 (14:03 -0500)]
pktgen: Fix fall-through annotation

Replace "fallthru" with a proper "fall through" annotation.

This fix is part of the ongoing efforts to enabling
-Wimplicit-fallthrough

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotg3: Fix fall-through annotations
Gustavo A. R. Silva [Thu, 13 Sep 2018 18:39:22 +0000 (13:39 -0500)]
tg3: Fix fall-through annotations

Replace "fallthru" with a proper "fall through" annotation.

This fix is part of the ongoing efforts to enabling
-Wimplicit-fallthrough

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agogso_segment: Reset skb->mac_len after modifying network header
Toke Høiland-Jørgensen [Thu, 13 Sep 2018 14:43:07 +0000 (16:43 +0200)]
gso_segment: Reset skb->mac_len after modifying network header

When splitting a GSO segment that consists of encapsulated packets, the
skb->mac_len of the segments can end up being set wrong, causing packet
drops in particular when using act_mirred and ifb interfaces in
combination with a qdisc that splits GSO packets.

This happens because at the time skb_segment() is called, network_header
will point to the inner header, throwing off the calculation in
skb_reset_mac_len(). The network_header is subsequently adjust by the
outer IP gso_segment handlers, but they don't set the mac_len.

Fix this by adding skb_reset_mac_len() calls to both the IPv4 and IPv6
gso_segment handlers, after they modify the network_header.

Many thanks to Eric Dumazet for his help in identifying the cause of
the bug.

Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovxlan: Remove duplicated include from vxlan.h
YueHaibing [Thu, 13 Sep 2018 13:32:23 +0000 (21:32 +0800)]
vxlan: Remove duplicated include from vxlan.h

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: b53: Do not fail when IRQ are not initialized
Florian Fainelli [Thu, 13 Sep 2018 16:50:45 +0000 (09:50 -0700)]
net: dsa: b53: Do not fail when IRQ are not initialized

When the Device Tree is not providing the per-port interrupts, do not fail
during b53_srab_irq_enable() but instead bail out gracefully. The SRAB driver
is used on the BCM5301X (Northstar) platforms which do not yet have the SRAB
interrupts wired up.

Fixes: 16994374a6fc ("net: dsa: b53: Make SRAB driver manage port interrupts")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'vhost_net-TX-batching'
David S. Miller [Thu, 13 Sep 2018 16:25:41 +0000 (09:25 -0700)]
Merge branch 'vhost_net-TX-batching'

Jason Wang says:

====================
vhost_net TX batching

This series tries to batch submitting packets to underlayer socket
through msg_control during sendmsg(). This is done by:

1) Doing userspace copy inside vhost_net
2) Build XDP buff
3) Batch at most 64 (VHOST_NET_BATCH) XDP buffs and submit them once
   through msg_control during sendmsg().
4) Underlayer sockets can use XDP buffs directly when XDP is enalbed,
   or build skb based on XDP buff.

For the packet that can not be built easily with XDP or for the case
that batch submission is hard (e.g sndbuf is limited). We will go for
the previous slow path, passing iov iterator to underlayer socket
through sendmsg() once per packet.

This can help to improve cache utilization and avoid lots of indirect
calls with sendmsg(). It can also co-operate with the batching support
of the underlayer sockets (e.g the case of XDP redirection through
maps).

Testpmd(txonly) in guest shows obvious improvements:

Test                /+pps%
XDP_DROP on TAP     /+44.8%
XDP_REDIRECT on TAP /+29%
macvtap (skb)       /+26%

Netperf TCP_STREAM TX from guest shows obvious improvements on small
packet:

    size/session/+thu%/+normalize%
       64/     1/   +2%/    0%
       64/     2/   +3%/   +1%
       64/     4/   +7%/   +5%
       64/     8/   +8%/   +6%
      256/     1/   +3%/    0%
      256/     2/  +10%/   +7%
      256/     4/  +26%/  +22%
      256/     8/  +27%/  +23%
      512/     1/   +3%/   +2%
      512/     2/  +19%/  +14%
      512/     4/  +43%/  +40%
      512/     8/  +45%/  +41%
     1024/     1/   +4%/    0%
     1024/     2/  +27%/  +21%
     1024/     4/  +38%/  +73%
     1024/     8/  +15%/  +24%
     2048/     1/  +10%/   +7%
     2048/     2/  +16%/  +12%
     2048/     4/    0%/   +2%
     2048/     8/    0%/   +2%
     4096/     1/  +36%/  +60%
     4096/     2/  -11%/  -26%
     4096/     4/    0%/  +14%
     4096/     8/    0%/   +4%
    16384/     1/   -1%/   +5%
    16384/     2/    0%/   +2%
    16384/     4/    0%/   -3%
    16384/     8/    0%/   +4%
    65535/     1/    0%/  +10%
    65535/     2/    0%/   +8%
    65535/     4/    0%/   +1%
    65535/     8/    0%/   +3%

Please review.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovhost_net: batch submitting XDP buffers to underlayer sockets
Jason Wang [Wed, 12 Sep 2018 03:17:09 +0000 (11:17 +0800)]
vhost_net: batch submitting XDP buffers to underlayer sockets

This patch implements XDP batching for vhost_net. The idea is first to
try to do userspace copy and build XDP buff directly in vhost. Instead
of submitting the packet immediately, vhost_net will batch them in an
array and submit every 64 (VHOST_NET_BATCH) packets to the under layer
sockets through msg_control of sendmsg().

When XDP is enabled on the TUN/TAP, TUN/TAP can process XDP inside a
loop without caring GUP thus it can do batch map flushing. When XDP is
not enabled or not supported, the underlayer socket need to build skb
and pass it to network core. The batched packet submission allows us
to do batching like netif_receive_skb_list() in the future.

This saves lots of indirect calls for better cache utilization. For
the case that we can't so batching e.g when sndbuf is limited or
packet size is too large, we will go for usual one packet per
sendmsg() way.

Doing testpmd on various setups gives us:

Test                /+pps%
XDP_DROP on TAP     /+44.8%
XDP_REDIRECT on TAP /+29%
macvtap (skb)       /+26%

Netperf tests shows obvious improvements for small packet transmission:

size/session/+thu%/+normalize%
   64/     1/   +2%/    0%
   64/     2/   +3%/   +1%
   64/     4/   +7%/   +5%
   64/     8/   +8%/   +6%
  256/     1/   +3%/    0%
  256/     2/  +10%/   +7%
  256/     4/  +26%/  +22%
  256/     8/  +27%/  +23%
  512/     1/   +3%/   +2%
  512/     2/  +19%/  +14%
  512/     4/  +43%/  +40%
  512/     8/  +45%/  +41%
 1024/     1/   +4%/    0%
 1024/     2/  +27%/  +21%
 1024/     4/  +38%/  +73%
 1024/     8/  +15%/  +24%
 2048/     1/  +10%/   +7%
 2048/     2/  +16%/  +12%
 2048/     4/    0%/   +2%
 2048/     8/    0%/   +2%
 4096/     1/  +36%/  +60%
 4096/     2/  -11%/  -26%
 4096/     4/    0%/  +14%
 4096/     8/    0%/   +4%
16384/     1/   -1%/   +5%
16384/     2/    0%/   +2%
16384/     4/    0%/   -3%
16384/     8/    0%/   +4%
65535/     1/    0%/  +10%
65535/     2/    0%/   +8%
65535/     4/    0%/   +1%
65535/     8/    0%/   +3%

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotap: accept an array of XDP buffs through sendmsg()
Jason Wang [Wed, 12 Sep 2018 03:17:08 +0000 (11:17 +0800)]
tap: accept an array of XDP buffs through sendmsg()

This patch implement TUN_MSG_PTR msg_control type. This type allows
the caller to pass an array of XDP buffs to tuntap through ptr field
of the tun_msg_control. Tap will build skb through those XDP buffers.

This will avoid lots of indirect calls thus improves the icache
utilization and allows to do XDP batched flushing when doing XDP
redirection.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotuntap: accept an array of XDP buffs through sendmsg()
Jason Wang [Wed, 12 Sep 2018 03:17:07 +0000 (11:17 +0800)]
tuntap: accept an array of XDP buffs through sendmsg()

This patch implement TUN_MSG_PTR msg_control type. This type allows
the caller to pass an array of XDP buffs to tuntap through ptr field
of the tun_msg_control. If an XDP program is attached, tuntap can run
XDP program directly. If not, tuntap will build skb and do a fast
receiving since part of the work has been done by vhost_net.

This will avoid lots of indirect calls thus improves the icache
utilization and allows to do XDP batched flushing when doing XDP
redirection.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotun: switch to new type of msg_control
Jason Wang [Wed, 12 Sep 2018 03:17:06 +0000 (11:17 +0800)]
tun: switch to new type of msg_control

This patch introduces to a new tun/tap specific msg_control:

#define TUN_MSG_UBUF 1
#define TUN_MSG_PTR  2
struct tun_msg_ctl {
       int type;
       void *ptr;
};

This allows us to pass different kinds of msg_control through
sendmsg(). The first supported type is ubuf (TUN_MSG_UBUF) which will
be used by the existed vhost_net zerocopy code. The second is XDP
buff, which allows vhost_net to pass XDP buff to TUN. This could be
used to implement accepting an array of XDP buffs from vhost_net in
the following patches.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotuntap: move XDP flushing out of tun_do_xdp()
Jason Wang [Wed, 12 Sep 2018 03:17:05 +0000 (11:17 +0800)]
tuntap: move XDP flushing out of tun_do_xdp()

This will allow adding batch flushing on top.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotuntap: split out XDP logic
Jason Wang [Wed, 12 Sep 2018 03:17:04 +0000 (11:17 +0800)]
tuntap: split out XDP logic

This patch split out XDP logic into a single function. This make it to
be reused by XDP batching path in the following patch.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotuntap: tweak on the path of skb XDP case in tun_build_skb()
Jason Wang [Wed, 12 Sep 2018 03:17:03 +0000 (11:17 +0800)]
tuntap: tweak on the path of skb XDP case in tun_build_skb()

If we're sure not to go native XDP, there's no need for several things
like bh and rcu stuffs. So this patch introduces a helper to build skb
and hold page refcnt. When we found we will go through skb path, build
skb directly.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotuntap: simplify error handling in tun_build_skb()
Jason Wang [Wed, 12 Sep 2018 03:17:02 +0000 (11:17 +0800)]
tuntap: simplify error handling in tun_build_skb()

There's no need to duplicate page get logic in each action. So this
patch tries to get page and calculate the offset before processing XDP
actions (except for XDP_DROP), and undo them when meet errors (we
don't care the performance on errors). This will be used for factoring
out XDP logic.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotuntap: enable bh early during processing XDP
Jason Wang [Wed, 12 Sep 2018 03:17:01 +0000 (11:17 +0800)]
tuntap: enable bh early during processing XDP

This patch move the bh enabling a little bit earlier, this will be
used for factoring out the core XDP logic of tuntap.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotuntap: switch to use XDP_PACKET_HEADROOM
Jason Wang [Wed, 12 Sep 2018 03:17:00 +0000 (11:17 +0800)]
tuntap: switch to use XDP_PACKET_HEADROOM

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sock: introduce SOCK_XDP
Jason Wang [Wed, 12 Sep 2018 03:16:59 +0000 (11:16 +0800)]
net: sock: introduce SOCK_XDP

This patch introduces a new sock flag - SOCK_XDP. This will be used
for notifying the upper layer that XDP program is attached on the
lower socket, and requires for extra headroom.

TUN will be the first user.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agollc: avoid blocking in llc_sap_close()
Cong Wang [Tue, 11 Sep 2018 18:42:06 +0000 (11:42 -0700)]
llc: avoid blocking in llc_sap_close()

llc_sap_close() is called by llc_sap_put() which
could be called in BH context in llc_rcv(). We can't
block in BH.

There is no reason to block it here, kfree_rcu() should
be sufficient.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Add sockopt IPV6_MULTICAST_ALL analogue to IP_MULTICAST_ALL
Andre Naujoks [Mon, 10 Sep 2018 08:27:15 +0000 (10:27 +0200)]
ipv6: Add sockopt IPV6_MULTICAST_ALL analogue to IP_MULTICAST_ALL

The socket option will be enabled by default to ensure current behaviour
is not changed. This is the same for the IPv4 version.

A socket bound to in6addr_any and a specific port will receive all traffic
on that port. Analogue to IP_MULTICAST_ALL, disable this behaviour, if
one or more multicast groups were joined (using said socket) and only
pass on multicast traffic from groups, which were explicitly joined via
this socket.

Without this option disabled a socket (system even) joined to multiple
multicast groups is very hard to get right. Filtering by destination
address has to take place in user space to avoid receiving multicast
traffic from other multicast groups, which might have traffic on the same
port.

The extension of the IP_MULTICAST_ALL socketoption to just apply to ipv6,
too, is not done to avoid changing the behaviour of current applications.

Signed-off-by: Andre Naujoks <nautsch2@gmail.com>
Acked-By: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'Lantiq-Intel-vrx200-support'
David S. Miller [Thu, 13 Sep 2018 15:14:34 +0000 (08:14 -0700)]
Merge branch 'Lantiq-Intel-vrx200-support'

Hauke Mehrtens says:

====================
Add support for Lantiq / Intel vrx200 network

This adds basic support for the GSWIP (Gigabit Switch) found in the
VRX200 SoC.
There are different versions of this IP core used in different SoCs, but
this driver was currently only tested on the VRX200 SoC line, for other
SoCs this driver probably need some adoptions to work.

I also plan to add Layer 2 offloading to the DSA driver and later also
layer 3 offloading which is supported by the PPE HW block.

All these patches should go through the net-next tree.

This depends on the patch "MIPS: lantiq: dma: add dev pointer" which
should go into 4.19.

Changes since:
v2:
 * Send patch "MIPS: lantiq: dma: add dev pointer" separately
 * all: removed return in register write functions
 * switch: uses phylink
 * switch: uses hardware MDIO auto polling
 * switch: use usleep_range() in MDIO busy check
 * switch: configure MDIO bus to 2.5 MHz
 * switch: disable xMII link when it is not used
 * Ethernet: use NAPI for TX cleanups
 * Ethernet: enable clock in open callback
 * Ethernet: improve skb allocation
 * Ethernet: use net_dev->stats

v1:
 * Add "MIPS: lantiq: dma: add dev pointer"
 * checkpatch fixes a all patches
 * Added binding documentation
 * use readx_poll_timeout function and ETIMEOUT error code
 * integrate GPHY firmware loading into DSA driver
 * renamed to NET_DSA_LANTIQ_GSWIP
 * removed some needed casts
 * added of_device_id.data information about the detected switch
 * fixed John's email address
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Add Lantiq / Intel DSA driver for vrx200
Hauke Mehrtens [Sun, 9 Sep 2018 20:20:39 +0000 (22:20 +0200)]
net: dsa: Add Lantiq / Intel DSA driver for vrx200

This adds the DSA driver for the GSWIP Switch found in the VRX200 SoC.
This switch is integrated in the DSL SoC, this SoC uses a GSWIP version
2.1, there are other SoCs using different versions of this IP block, but
this driver was only tested with the version found in the VRX200.
Currently only the basic features are implemented which will forward all
packages to the CPU and let the CPU do the forwarding. The hardware also
support Layer 2 offloading which is not yet implemented in this driver.

The GPHY FW loaded is now done by this driver and not any more by the
separate driver in drivers/soc/lantiq/gphy.c, I will remove this driver
is a separate patch. to make use of the GPHY this switch driver is
needed anyway. Other SoCs have more embedded GPHYs so this driver should
support a variable number of GPHYs. After the firmware was loaded the
GPHY can be probed on the MDIO bus and it behaves like an external GPHY,
without the firmware it can not be probed on the MDIO bus.

The clock names in the sysctrl.c file have to be changed because the
clocks are now used by a different driver. This should be cleaned up and
a real common clock driver should provide the clocks instead.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: net: dsa: Add lantiq, xrx200-gswip DT bindings
Hauke Mehrtens [Sun, 9 Sep 2018 20:20:27 +0000 (22:20 +0200)]
dt-bindings: net: dsa: Add lantiq, xrx200-gswip DT bindings

This adds the binding for the GSWIP (Gigabit switch) core found in the
xrx200 / VR9 Lantiq / Intel SoC.

This part takes care of the switch, MDIO bus, and loading the FW into
the embedded GPHYs.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: devicetree@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: lantiq: Add Lantiq / Intel VRX200 Ethernet driver
Hauke Mehrtens [Sun, 9 Sep 2018 20:16:45 +0000 (22:16 +0200)]
net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver

This drives the PMAC between the GSWIP Switch and the CPU in the VRX200
SoC. This is currently only the very basic version of the Ethernet
driver.

When the DMA channel is activated we receive some packets which were
send to the SoC while it was still in U-Boot, these packets have the
wrong header. Resetting the IP cores did not work so we read out the
extra packets at the beginning and discard them.

This also adapts the clock code in sysctrl.c to use the default name of
the device node so that the driver gets the correct clock. sysctrl.c
should be replaced with a proper common clock driver later.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: net: Add lantiq, xrx200-net DT bindings
Hauke Mehrtens [Sun, 9 Sep 2018 20:16:44 +0000 (22:16 +0200)]
dt-bindings: net: Add lantiq, xrx200-net DT bindings

This adds the binding for the PMAC core between the CPU and the GSWIP
switch found on the xrx200 / VR9 Lantiq / Intel SoC.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: devicetree@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Add Lantiq / Intel GSWIP tag support
Hauke Mehrtens [Sun, 9 Sep 2018 20:16:43 +0000 (22:16 +0200)]
net: dsa: Add Lantiq / Intel GSWIP tag support

This handles the tag added by the PMAC on the VRX200 SoC line.

The GSWIP uses internally a GSWIP special tag which is located after the
Ethernet header. The PMAC which connects the GSWIP to the CPU converts
this special tag used by the GSWIP into the PMAC special tag which is
added in front of the Ethernet header.

This was tested with GSWIP 2.1 found in the VRX200 SoCs, other GSWIP
versions use slightly different PMAC special tags.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMIPS: lantiq: Do not enable IRQs in dma open
Hauke Mehrtens [Sun, 9 Sep 2018 20:16:42 +0000 (22:16 +0200)]
MIPS: lantiq: Do not enable IRQs in dma open

When a DMA channel is opened the IRQ should not get activated
automatically, this allows it to pull data out manually without the help
of interrupts. This is needed for a workaround in the vrx200 Ethernet
driver.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Thu, 13 Sep 2018 05:22:42 +0000 (22:22 -0700)]
Merge git://git./linux/kernel/git/davem/net

5 years agodocs: net: Remove TCP congestion document
Tobin C. Harding [Wed, 12 Sep 2018 08:14:44 +0000 (18:14 +1000)]
docs: net: Remove TCP congestion document

Document is stale, let's remove it.

Remove TCP congestion document.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agogeneve: add ttl inherit support
Hangbin Liu [Wed, 12 Sep 2018 02:04:21 +0000 (10:04 +0800)]
geneve: add ttl inherit support

Similar with commit 72f6d71e491e6 ("vxlan: add ttl inherit support"),
currently ttl == 0 means "use whatever default value" on geneve instead
of inherit inner ttl. To respect compatibility with old behavior, let's
add a new IFLA_GENEVE_TTL_INHERIT for geneve ttl inherit support.

Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'drm-fixes-2018-09-12' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Thu, 13 Sep 2018 03:36:47 +0000 (17:36 -1000)]
Merge tag 'drm-fixes-2018-09-12' of git://anongit.freedesktop.org/drm/drm

Pull drm nouveau fixes from Dave Airlie:
 "I'm sending this separately as it's a bit larger than I generally like
  for one driver, but it does contain a bunch of make my nvidia laptop
  not die (runpm) and a bunch to make my docking station and monitor
  display stuff (mst) fixes.

  Lyude has spent a lot of time on these, and we are putting the fixes
  into distro kernels as well asap, as it helps a bunch of standard
  Lenovo laptops, so I'm fairly happy things are better than they were
  before these patches, but I decided to split them out just for
  clarification"

* tag 'drm-fixes-2018-09-12' of git://anongit.freedesktop.org/drm/drm:
  drm/nouveau/disp/gm200-: enforce identity-mapped SOR assignment for LVDS/eDP panels
  drm/nouveau/disp: fix DP disable race
  drm/nouveau/disp: move eDP panel power handling
  drm/nouveau/disp: remove unused struct member
  drm/nouveau/TBDdevinit: don't fail when PMU/PRE_OS is missing from VBIOS
  drm/nouveau/mmu: don't attempt to dereference vmm without valid instance pointer
  drm/nouveau: fix oops in client init failure path
  drm/nouveau: Fix nouveau_connector_ddc_detect()
  drm/nouveau/drm/nouveau: Don't forget to cancel hpd_work on suspend/unload
  drm/nouveau/drm/nouveau: Prevent handling ACPI HPD events too early
  drm/nouveau: Reset MST branching unit before enabling
  drm/nouveau: Only write DP_MSTM_CTRL when needed
  drm/nouveau: Remove useless poll_enable() call in drm_load()
  drm/nouveau: Remove useless poll_disable() call in switcheroo_set_state()
  drm/nouveau: Remove useless poll_enable() call in switcheroo_set_state()
  drm/nouveau: Fix deadlocks in nouveau_connector_detect()
  drm/nouveau/drm/nouveau: Use pm_runtime_get_noresume() in connector_detect()
  drm/nouveau/drm/nouveau: Fix deadlock with fb_helper with async RPM requests
  drm/nouveau: Remove duplicate poll_enable() in pmops_runtime_suspend()
  drm/nouveau/drm/nouveau: Fix bogus drm_kms_helper_poll_enable() placement

5 years agonet: dsa: mv88e6xxx: Make sure to configure ports with external PHYs
Marek Vasut [Tue, 11 Sep 2018 22:15:24 +0000 (00:15 +0200)]
net: dsa: mv88e6xxx: Make sure to configure ports with external PHYs

The MV88E6xxx can have external PHYs attached to certain ports and those
PHYs could even be on different MDIO bus than the one within the switch.
This patch makes sure that ports with such PHYs are configured correctly
according to the information provided by the PHY.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: Use DIV_ROUND_UP instead of reimplementing its function
zhong jiang [Tue, 11 Sep 2018 13:08:15 +0000 (21:08 +0800)]
net: ethernet: Use DIV_ROUND_UP instead of reimplementing its function

DIV_ROUND_UP has implemented the code-opened function. Therefore, just
replace the implementation with DIV_ROUND_UP.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqlcnic: Remove set but not used variables 'fw_mbx' and 'hdr_size'
Yue Haibing [Tue, 11 Sep 2018 11:51:29 +0000 (11:51 +0000)]
qlcnic: Remove set but not used variables 'fw_mbx' and 'hdr_size'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c: In function 'qlcnic_sriov_pull_bc_msg':
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c:907:6: warning:
 variable 'fw_mbx' set but not used [-Wunused-but-set-variable]

drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c: In function 'qlcnic_sriov_issue_bc_post':
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c:939:16: warning:
 variable 'hdr_size' set but not used [-Wunused-but-set-variable]

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>