Tariq Toukan [Sun, 4 Mar 2018 08:35:00 +0000 (10:35 +0200)]
net/mlx5e: Add XDP_TX completions statistics
Add per-ring and global ethtool counters for XDP_TX completions.
This helps us monitor and analyze XDP_TX flow performance.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Tariq Toukan [Wed, 18 Apr 2018 10:33:15 +0000 (13:33 +0300)]
net/mlx5e: Add TX completions statistics
Add per-ring and global ethtool counters for TX completions.
This helps us monitor and analyze TX flow performance.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Tariq Toukan [Sun, 3 Jun 2018 14:41:48 +0000 (17:41 +0300)]
net/mlx5e: RX, Use existing WQ local variable
Local variable 'wq' already points to &sq->wq, use it.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Tariq Toukan [Tue, 5 Jun 2018 08:47:04 +0000 (11:47 +0300)]
net/mlx5e: Convert large order kzalloc allocations to kvzalloc
Replace calls to kzalloc_node with kvzalloc_node, as it fallsback
to lower-order pages if the higher-order trials fail.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Boris Pismenny [Mon, 11 Jun 2018 14:24:58 +0000 (17:24 +0300)]
net/mlx5e: Add UDP GSO remaining counter
This patch adds a counter for tx UDP GSO packets that contain a segment
that is not aligned to MSS - remaining segment.
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Boris Pismenny [Thu, 31 May 2018 12:29:42 +0000 (15:29 +0300)]
net/mlx5e: Add UDP GSO support
This patch enables UDP GSO support. We enable this by using two WQEs
the first is a UDP LSO WQE for all segments with equal length, and the
second is for the last segment in case it has different length.
Due to HW limitation, before sending, we must adjust the packet length fields.
We measure performance between two Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz
machines connected back-to-back with Connectx4-Lx (40Gbps) NICs.
We compare single stream UDP, UDP GSO and UDP GSO with offload.
Performance:
| MSS (bytes) | Throughput (Gbps) | CPU utilization (%)
UDP GSO offload | 1472 | 35.6 | 8%
UDP GSO | 1472 | 25.5 | 17%
UDP | 1472 | 10.2 | 17%
UDP GSO offload | 1024 | 35.6 | 8%
UDP GSO | 1024 | 19.2 | 17%
UDP | 1024 | 5.7 | 17%
UDP GSO offload | 512 | 33.8 | 16%
UDP GSO | 512 | 10.4 | 17%
UDP | 512 | 3.5 | 17%
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
David Ahern [Tue, 26 Jun 2018 19:39:18 +0000 (12:39 -0700)]
netlink: Return extack message if attribute validation fails
Have one extack message for parsing and validating.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Brandon Maier [Tue, 26 Jun 2018 17:50:50 +0000 (12:50 -0500)]
net: phy: xgmiitorgmii: Check read_status results
We're ignoring the result of the attached phy device's read_status().
Return it so we can detect errors.
Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Brandon Maier [Tue, 26 Jun 2018 17:50:49 +0000 (12:50 -0500)]
net: phy: xgmiitorgmii: Use correct mdio bus
The xgmiitorgmii is using the mii_bus of the device it's attached to,
instead of the bus it was given during probe.
Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Brandon Maier [Tue, 26 Jun 2018 17:50:48 +0000 (12:50 -0500)]
net: phy: xgmiitorgmii: Check phy_driver ready before accessing
Since a phy_device is added to the global mdio_bus list during
phy_device_register(), but a phy_device's phy_driver doesn't get
attached until phy_probe(). It's possible of_phy_find_device() in
xgmiitorgmii will return a valid phy with a NULL phy_driver. Leading to
a NULL pointer access during the memcpy().
Fixes this Oops:
Unable to handle kernel NULL pointer dereference at virtual address
00000000
pgd =
c0004000
[
00000000] *pgd=
00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.40 #1
Hardware name: Xilinx Zynq Platform
task:
ce4c8d00 task.stack:
ce4ca000
PC is at memcpy+0x48/0x330
LR is at xgmiitorgmii_probe+0x90/0xe8
pc : [<
c074bc68>] lr : [<
c0529548>] psr:
20000013
sp :
ce4cbb54 ip :
00000000 fp :
ce4cbb8c
r10:
00000000 r9 :
00000000 r8 :
c0c49178
r7 :
00000000 r6 :
cdc14718 r5 :
ce762800 r4 :
cdc14710
r3 :
00000000 r2 :
00000054 r1 :
00000000 r0 :
cdc14718
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control:
18c5387d Table:
0000404a DAC:
00000051
Process swapper/0 (pid: 1, stack limit = 0xce4ca210)
...
[<
c074bc68>] (memcpy) from [<
c0529548>] (xgmiitorgmii_probe+0x90/0xe8)
[<
c0529548>] (xgmiitorgmii_probe) from [<
c0526a94>] (mdio_probe+0x28/0x34)
[<
c0526a94>] (mdio_probe) from [<
c04db98c>] (driver_probe_device+0x254/0x414)
[<
c04db98c>] (driver_probe_device) from [<
c04dbd58>] (__device_attach_driver+0xac/0x10c)
[<
c04dbd58>] (__device_attach_driver) from [<
c04d96f4>] (bus_for_each_drv+0x84/0xc8)
[<
c04d96f4>] (bus_for_each_drv) from [<
c04db5bc>] (__device_attach+0xd0/0x134)
[<
c04db5bc>] (__device_attach) from [<
c04dbdd4>] (device_initial_probe+0x1c/0x20)
[<
c04dbdd4>] (device_initial_probe) from [<
c04da8fc>] (bus_probe_device+0x98/0xa0)
[<
c04da8fc>] (bus_probe_device) from [<
c04d8660>] (device_add+0x43c/0x5d0)
[<
c04d8660>] (device_add) from [<
c0526cb8>] (mdio_device_register+0x34/0x80)
[<
c0526cb8>] (mdio_device_register) from [<
c0580b48>] (of_mdiobus_register+0x170/0x30c)
[<
c0580b48>] (of_mdiobus_register) from [<
c05349c4>] (macb_probe+0x710/0xc00)
[<
c05349c4>] (macb_probe) from [<
c04dd700>] (platform_drv_probe+0x44/0x80)
[<
c04dd700>] (platform_drv_probe) from [<
c04db98c>] (driver_probe_device+0x254/0x414)
[<
c04db98c>] (driver_probe_device) from [<
c04dbc58>] (__driver_attach+0x10c/0x118)
[<
c04dbc58>] (__driver_attach) from [<
c04d9600>] (bus_for_each_dev+0x8c/0xd0)
[<
c04d9600>] (bus_for_each_dev) from [<
c04db1fc>] (driver_attach+0x2c/0x30)
[<
c04db1fc>] (driver_attach) from [<
c04daa98>] (bus_add_driver+0x50/0x260)
[<
c04daa98>] (bus_add_driver) from [<
c04dc440>] (driver_register+0x88/0x108)
[<
c04dc440>] (driver_register) from [<
c04dd6b4>] (__platform_driver_register+0x50/0x58)
[<
c04dd6b4>] (__platform_driver_register) from [<
c0b31248>] (macb_driver_init+0x24/0x28)
[<
c0b31248>] (macb_driver_init) from [<
c010203c>] (do_one_initcall+0x60/0x1a4)
[<
c010203c>] (do_one_initcall) from [<
c0b00f78>] (kernel_init_freeable+0x15c/0x1f8)
[<
c0b00f78>] (kernel_init_freeable) from [<
c0763d10>] (kernel_init+0x18/0x124)
[<
c0763d10>] (kernel_init) from [<
c0112d74>] (ret_from_fork+0x14/0x20)
Code:
ba000002 f5d1f03c f5d1f05c f5d1f07c (
e8b151f8)
---[ end trace
3e4ec21905820a1f ]---
Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 28 Jun 2018 07:10:08 +0000 (16:10 +0900)]
Merge branch 'ipsec-selftests-updates'
Shannon Nelson says:
====================
Updates for ipsec selftests
Fix up the existing ipsec selftest and add tests for
the ipsec offload driver API.
v2: addressed formatting nits in netdevsim from Jakub Kicinski
v3: a couple more nits from Jakub
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 26 Jun 2018 17:07:55 +0000 (10:07 -0700)]
selftests: rtnetlink: add ipsec offload API test
Using the netdevsim as a device for testing, try out the XFRM commands
for setting up IPsec hardware offloads.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 26 Jun 2018 17:07:54 +0000 (10:07 -0700)]
netdevsim: add ipsec offload testing
Implement the IPsec/XFRM offload API for testing.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 26 Jun 2018 17:07:53 +0000 (10:07 -0700)]
selftests: rtnetlink: use dummydev as a test device
We really shouldn't mess with local system settings, so let's
use the already created dummy device instead for ipsec testing.
Oh, and let's put the temp file into a proper directory.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 26 Jun 2018 17:07:52 +0000 (10:07 -0700)]
selftests: rtnetlink: clear the return code at start of ipsec test
Following the custom from the other functions, clear the global
ret code before starting the test so as to not have previously
failed tests cause us to thing this test has failed.
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Tue, 26 Jun 2018 16:41:36 +0000 (18:41 +0200)]
l2tp: define helper for parsing struct sockaddr_pppol2tp*
'sockaddr_len' is checked against various values when entering
pppol2tp_connect(), to verify its validity. It is used again later, to
find out which sockaddr structure was passed from user space. This
patch combines these two operations into one new function in order to
simplify pppol2tp_connect().
A new structure, l2tp_connect_info, is used to pass sockaddr data back
to pppol2tp_connect(), to avoid passing too many parameters to
l2tp_sockaddr_get_info(). Also, the first parameter is void* in order
to avoid casting between all sockaddr_* structures manually.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 26 Jun 2018 15:45:49 +0000 (08:45 -0700)]
tcp: remove one indentation level in tcp_create_openreq_child
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Tue, 26 Jun 2018 15:42:33 +0000 (18:42 +0300)]
sh_eth: fix *enum* {A|M}PR_BIT
The *enum* {A|M}PR_BIT were declared in the commit
86a74ff21a7a ("net:
sh_eth: add support for Renesas SuperH Ethernet") adding SH771x support,
however the SH771x manual doesn't have the APR/MPR registers described
and the code writing to them for SH7710 was later removed by the commit
380af9e390ec ("net: sh_eth: CPU dependency code collect to "struct
sh_eth_cpu_data""). All the newer SoC manuals have these registers
documented as having a 16-bit TIME parameter of the PAUSE frame, not
1-bit -- update the *enum* accordingly, fixing up the APR/MPR writes...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Keara Leibovitz [Tue, 26 Jun 2018 14:16:28 +0000 (10:16 -0400)]
tc-tests: add an extreme-case csum action test
Added an extreme-case test for all 7 csum action headers.
Signed-off-by: Keara Leibovitz <kleib@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 28 Jun 2018 05:18:49 +0000 (14:18 +0900)]
Merge branch 'mscc-ocelot-add-more-features'
Alexandre Belloni says:
====================
net: mscc: ocelot: add more features
This series adds link aggregation and VLAN filtering hardware offload
support to the ocelot driver.
PTP support will be sent later.
changes in v2:
- rebased on v4.18-rc1
- check for aggregation type and only offload it when type is hash (balance-xor
or 802.3ad)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Tue, 26 Jun 2018 12:28:49 +0000 (14:28 +0200)]
net: mscc: ocelot: add VLAN filtering
Add hardware VLAN filtering offloading on ocelot.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandre Belloni [Tue, 26 Jun 2018 12:28:48 +0000 (14:28 +0200)]
net: mscc: ocelot: add bonding support
Add link aggregation hardware offload support for Ocelot.
ocelot_get_link_ksettings() is not great but it does work until the driver
is reworked to switch to phylink.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Tue, 26 Jun 2018 09:21:13 +0000 (14:51 +0530)]
cxgb4: Add new T5 PCI device id 0x50ae
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Casey Leedom [Tue, 26 Jun 2018 09:18:48 +0000 (14:48 +0530)]
cxgb4: Add flag tc_flower_initialized
Add flag tc_flower_initialized to indicate the
completion if tc flower initialization.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Tue, 26 Jun 2018 03:32:53 +0000 (20:32 -0700)]
neighbour: force neigh_invalidate when NUD_FAILED update is from admin
In systems where neigh gc thresh holds are set to high values,
admin deleted neigh entries (eg ip neigh flush or ip neigh del) can
linger around in NUD_FAILED state for a long time until periodic gc kicks
in. This patch forces neigh_invalidate when NUD_FAILED neigh_update is
from an admin.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 27 Jun 2018 01:42:13 +0000 (10:42 +0900)]
Merge branch 'Multipath-tests-for-tunnel-devices'
Petr Machata says:
====================
Multipath tests for tunnel devices
This patchset adds a test for ECMP and weighted ECMP between two GRE
tunnels.
In patches #1 and #2, the function multipath_eval() is first moved from
router_multipath.sh to lib.sh for ease of reuse, and then fixed up.
In patch #3, the function tc_rule_stats_get() is parameterized to be
useful for egress rules as well.
In patch #4, a new function __simple_if_init() is extracted from
simple_if_init(). This covers the logic that needs to be done for the
usual interface: VRF migration, upping and installation of IP addresses.
Patch #5 then adds the test itself.
Additionally in patch #6, a requirement to add diagrams to selftests is
documented.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 26 Jun 2018 00:08:17 +0000 (02:08 +0200)]
selftests: forwarding: README: Require diagrams
ASCII art diagrams are well suited for presenting the topology that a
test uses while being easy to embed directly in the test file iteslf.
They make the information very easy to grasp even for simple topologies,
and for more complex ones they are almost essential, as figuring out the
interconnects from the script itself proves to be difficult.
Therefore state the requirement for topology ASCII art in README.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 26 Jun 2018 00:08:05 +0000 (02:08 +0200)]
selftests: forwarding: Test multipath tunneling
Add a GRE-tunneling test such that there are two tunnels involved, with
a multipath route listing both as next hops. Similarly to
router_multipath.sh, test that the distribution of traffic to the
tunnels honors the configured weights.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 26 Jun 2018 00:08:00 +0000 (02:08 +0200)]
selftests: forwarding: lib: Extract interface-init functions
The function simple_if_init() does two things: it creates a VRF, then
moves an interface into this VRF and configures addresses. The latter
comes in handy when adding more interfaces into a VRF later on. The
situation is similar for simple_if_fini().
Therefore split the interface remastering and address de/initialization
logic to a new pair of helpers __simple_if_init() / __simple_if_fini(),
and defer to these helpers from simple_if_init() and simple_if_fini().
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 26 Jun 2018 00:07:45 +0000 (02:07 +0200)]
selftests: forwarding: tc_rule_stats_get: Parameterize direction
The GRE multipath tests need stats on an egress counter. Change
tc_rule_stats_get() to take direction as an optional argument, with
default of ingress.
Take the opportunity to change line continuation character from | to \.
Move the | to the next line, which indent.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 26 Jun 2018 00:07:08 +0000 (02:07 +0200)]
selftests: forwarding: multipath_eval(): Improve style
- Change the indentation of the function body from 7 spaces to one tab.
- Move initialization of weights_ratio up so that it can be referenced
from the error message about packet difference being zero.
- Move |'s consistently to continuation line, which reindent.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 26 Jun 2018 00:06:06 +0000 (02:06 +0200)]
selftests: forwarding: Move multipath_eval() to lib.sh
This function will be useful for the GRE multipath test that is coming
later.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Mon, 25 Jun 2018 23:55:05 +0000 (16:55 -0700)]
net/tls: Remove VLA usage on nonce
It looks like the prior VLA removal, commit
b16520f7493d ("net/tls: Remove
VLA usage"), and a new VLA addition, commit
c46234ebb4d1e ("tls: RX path
for ktls"), passed in the night. This removes the newly added VLA, which
happens to have its bounds based on the same max value.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 25 Jun 2018 23:20:32 +0000 (01:20 +0200)]
selftests: forwarding: mirror_gre_vlan_bridge_1q: Unset rp_filter
The IP addresses of tunnel endpoint at H3 are set at the VLAN device
$h3.555. Therefore when test_gretap_untagged_egress() sets vlan 555 to
egress untagged at $swp3, $h3's rp_filter rejects these packets. The
test then spuriously fails.
Therefore turn off net.ipv4.conf.{all, $h3}.rp_filter.
Fixes:
9c7c8a82442c ("selftests: forwarding: mirror_gre_vlan_bridge_1q: Add more tests")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Mon, 25 Jun 2018 22:49:49 +0000 (15:49 -0700)]
mdio-mux-gpio: Remove VLA usage
In the quest to remove all stack VLA usage from the kernel[1], this
allocates the values buffer during the callback instead of putting it
on the stack.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Jun 2018 14:21:33 +0000 (23:21 +0900)]
Merge branch 'net-sched-support-replay-of-filter-offload-when-binding-to-block'
Jakub Kicinski says:
====================
net: sched: support replay of filter offload when binding to block
This series from John adds the ability to replay filter offload requests
when new offload callback is being registered on a TC block. This is most
likely to take place for shared blocks today, when a block which already
has rules is bound to another interface. Prior to this patch set if any
of the rules were offloaded the block bind would fail.
A new tcf_proto_op is added to generate a filter-specific offload request.
The new 'offload' op is supporting extack from day 0, hence we need to
propagate extack to .ndo_setup_tc TC_BLOCK_BIND/TC_BLOCK_UNBIND and
through tcf_block_cb_register() to tcf_block_playback_offloads().
The immediate use of this patch set is to simplify life of drivers which
require duplicating rules when sharing blocks. Switch drivers (mlxsw)
can bind ports to rule lists dynamically, NIC drivers generally don't
have that ability and need the rules to be duplicated for each ingress
they match on. In code terms this means that switch drivers don't
register multiple callbacks for each port. NIC drivers do, and get a
separate request and hance rule per-port, as if the block was not shared.
The registration fails today, however, if some rules were already present.
As John notes in description of patch 7, drivers which register multiple
callbacks to shared blocks will likely need to flush the rules on block
unbind. This set makes the core not only replay the the offload add
requests but also offload remove requests when callback is unregistered.
v2:
- name parameters in patch 2;
- use unsigned int instead of u32 for in_hw_coun;
- improve extack message in patch 7.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Mon, 25 Jun 2018 21:30:10 +0000 (14:30 -0700)]
net: sched: call reoffload op on block callback reg
Call the reoffload tcf_proto_op on all tcf_proto nodes in all chains of a
block when a callback tries to register to a block that already has
offloaded rules. If all existing rules cannot be offloaded then the
registration is rejected. This replaces the previous policy of rejecting
such callback registration outright.
On unregistration of a callback, the rules are flushed for that given cb.
The implementation of block sharing in the NFP driver, for example,
duplicates shared rules to all devs bound to a block. This meant that
rules could still exist in hw even after a device is unbound from a block
(assuming the block still remains active).
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Mon, 25 Jun 2018 21:30:09 +0000 (14:30 -0700)]
net: sched: cls_bpf: implement offload tcf_proto_op
Add the offload tcf_proto_op in cls_bpf to generate an offload message for
each bpf prog in the given tcf_proto. Call the specified callback with
this new offload message. The function only returns an error if the
callback rejects adding a 'hardware only' prog.
A prog contains a flag to indicate if it is in hardware or not. To
ensure the offload function properly maintains this flag, keep a reference
counter for the number of instances of the prog that are in hardware. Only
update the flag when this counter changes from or to 0.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Mon, 25 Jun 2018 21:30:08 +0000 (14:30 -0700)]
net: sched: cls_u32: implement offload tcf_proto_op
Add the offload tcf_proto_op in cls_u32 to generate an offload message for
each filter and the hashtable in the given tcf_proto. Call the specified
callback with this new offload message. The function only returns an error
if the callback rejects adding a 'hardware only' rule.
A filter contains a flag to indicate if it is in hardware or not. To
ensure the offload function properly maintains this flag, keep a reference
counter for the number of instances of the filter that are in hardware.
Only update the flag when this counter changes from or to 0.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Mon, 25 Jun 2018 21:30:07 +0000 (14:30 -0700)]
net: sched: cls_matchall: implement offload tcf_proto_op
Add the reoffload tcf_proto_op in matchall to generate an offload message
for each filter in the given tcf_proto. Call the specified callback with
this new offload message. The function only returns an error if the
callback rejects adding a 'hardware only' rule.
Ensure matchall flags correctly report if the rule is in hw by keeping a
reference counter for the number of instances of the rule offloaded. Only
update the flag when this counter changes from or to 0.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Mon, 25 Jun 2018 21:30:06 +0000 (14:30 -0700)]
net: sched: cls_flower: implement offload tcf_proto_op
Add the reoffload tcf_proto_op in flower to generate an offload message
for each filter in the given tcf_proto. Call the specified callback with
this new offload message. The function only returns an error if the
callback rejects adding a 'hardware only' rule.
A filter contains a flag to indicate if it is in hardware or not. To
ensure the reoffload function properly maintains this flag, keep a
reference counter for the number of instances of the filter that are in
hardware. Only update the flag when this counter changes from or to 0. Add
a generic helper function to implement this behaviour.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Mon, 25 Jun 2018 21:30:05 +0000 (14:30 -0700)]
net: sched: add tcf_proto_op to offload a rule
Create a new tcf_proto_op called 'reoffload' that generates a new offload
message for each node in a tcf_proto. Pointers to the tcf_proto and
whether the offload request is to add or delete the node are included.
Also included is a callback function to send the offload message to and
the option of priv data to go with the cb.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Mon, 25 Jun 2018 21:30:04 +0000 (14:30 -0700)]
net: sched: pass extack pointer to block binds and cb registration
Pass the extact struct from a tc qdisc add to the block bind function and,
in turn, to the setup_tc ndo of binding device via the tc_block_offload
struct. Pass this back to any block callback registrations to allow
netlink logging of fails in the bind process.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Jun 2018 14:15:43 +0000 (23:15 +0900)]
Merge branch 'sh_eth-RPADIR-related-clean-ups'
Sergei Shtylyov says:
====================
sh_eth: RPADIR related clean-ups
Here's a set of 2 patches against DaveM's 'net-next.git' repo. They are
clean-ups related to RPADIR (DMA padding to NET_IP_ALIGN)...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Mon, 25 Jun 2018 20:37:06 +0000 (23:37 +0300)]
sh_eth: remove sh_eth_cpu_data::rpadir_value
If RPADIR exists, the value written to it is always the same for all SoCs
(and derived from NET_IP_ALIGN), so there has not been any need to store
it in the *struct* sh_eth_cpu_data...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Mon, 25 Jun 2018 20:36:21 +0000 (23:36 +0300)]
sh_eth: fix *enum* RPADIR_BIT
The *enum* RPADIR_BIT was declared in the commit
86a74ff21a7a ("net:
sh_eth: add support for Renesas SuperH Ethernet") adding SH771x support,
however the SH771x manual doesn't have the RPADIR register described and,
moreover, tells why the padding insertion must not be used. The newer SoC
manuals do have RPADIR documented, though with somewhat different layout --
update the *enum* according to these manuals...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Mon, 25 Jun 2018 18:34:41 +0000 (20:34 +0200)]
r8169: reject unsupported WoL options
So far unsupported WoL options are silently ignored. Change this and
reject attempts to set unsupported options. This prevents situations
where a user tries to set an unsupported WoL option and is under the
impression it was successful because ethtool doesn't complain.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 25 Jun 2018 14:43:55 +0000 (16:43 +0200)]
selftests: net: Test headroom handling of ip6_gre devices
Commit
5691484df961 ("net: ip6_gre: Fix headroom request in
ip6erspan_tunnel_xmit()") and commit
01b8d064d58b ("net: ip6_gre:
Request headroom in __gre6_xmit()") fix problems in reserving headroom
in the packets tunneled through ip6gre/tap and ip6erspan netdevices.
These two patches included snippets that reproduced the issues. This
patch elevates the snippets to a full-fledged test case.
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Jun 2018 13:55:52 +0000 (22:55 +0900)]
Merge branch 'l2tp-trivial-cleanups'
Guillaume Nault says:
====================
l2tp: trivial cleanups
Just a set of unrelated trivial cleanups (remove unused code, make
local functions static, etc.).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Mon, 25 Jun 2018 14:07:25 +0000 (16:07 +0200)]
l2tp: make l2tp_xmit_core() return void
It always returns 0, and nobody reads the return value anyway.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Mon, 25 Jun 2018 14:07:24 +0000 (16:07 +0200)]
l2tp: avoid duplicate l2tp_pernet() calls
Replace 'l2tp_pernet(tunnel->l2tp_net)' with 'pn', which has been set
on the preceding line.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Mon, 25 Jun 2018 14:07:23 +0000 (16:07 +0200)]
l2tp: don't export l2tp_tunnel_closeall()
This function is only used in l2tp_core.c.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Mon, 25 Jun 2018 14:07:22 +0000 (16:07 +0200)]
l2tp: don't export l2tp_session_queue_purge()
This function is only used in l2tp_core.c.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Mon, 25 Jun 2018 14:07:20 +0000 (16:07 +0200)]
l2tp: remove l2tp_tunnel_priv()
This function, and the associated .priv field, are unused.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Mon, 25 Jun 2018 14:07:19 +0000 (16:07 +0200)]
l2tp: remove .show from struct l2tp_tunnel
This callback has never been implemented.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Mon, 25 Jun 2018 14:07:18 +0000 (16:07 +0200)]
l2tp: remove pppol2tp_session_close()
l2tp_core.c verifies that ->session_close() is defined before calling
it. There's no need for a stub.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Jun 2018 13:15:15 +0000 (22:15 +0900)]
Merge branch 'DPAA-PTP-clock-and-timestamping'
Yangbo Lu says:
====================
Support DPAA PTP clock and timestamping
This patchset is to support DPAA FMAN PTP clock and HW timestamping.
It had been verified on both ARM platform and PPC platform.
- The patch #1 to patch #5 are to support DPAA FMAN 1588 timer in
ptp_qoriq driver.
- The patch #6 to patch #10 are to add HW timestamping support in
DPAA ethernet driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Mon, 25 Jun 2018 12:37:16 +0000 (20:37 +0800)]
dpaa_eth: add the get_ts_info interface for ethtool
Added the get_ts_info interface for ethtool to check
the timestamping capability.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Mon, 25 Jun 2018 12:37:15 +0000 (20:37 +0800)]
dpaa_eth: add support for hardware timestamping
This patch is to add hardware timestamping support
for dpaa_eth. On Rx, timestamping is enabled for
all frames. On Tx, we only instruct the hardware
to timestamp the frames marked accordingly by the
stack.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Mon, 25 Jun 2018 12:37:14 +0000 (20:37 +0800)]
fsl/fman: define frame description command UPD
Defined frame description command FM_FD_CMD_UPD for
prepended data updating.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Mon, 25 Jun 2018 12:37:13 +0000 (20:37 +0800)]
fsl/fman_port: support getting timestamp
This patch is to add fman_port_get_tstamp() interface
to get timestamp.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Mon, 25 Jun 2018 12:37:12 +0000 (20:37 +0800)]
fsl/fman: add set_tstamp interface
This patch is to add set_tstamp interface for memac,
dtsec, and 10GEC controllers to configure HW timestamping.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Mon, 25 Jun 2018 12:37:11 +0000 (20:37 +0800)]
arm64: dts: fsl: move ptp timer out of fman
This patch is to move ptp timer node out of fman.
Because ptp timer will be probed by ptp_qoriq driver,
it should be an independent device in case of conflict
memory mapping.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Mon, 25 Jun 2018 12:37:10 +0000 (20:37 +0800)]
powerpc/mpc85xx: move ptp timer out of fman in dts
This patch is to move ptp timer node out of fman.
Because ptp timer will be probed by ptp_qoriq driver,
it should be an independent device in case of conflict
memory mapping.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Mon, 25 Jun 2018 12:37:09 +0000 (20:37 +0800)]
dt-binding: ptp_qoriq: add DPAA FMan support
This patch is to add bindings description for DPAA
FMan 1588 timer, and also remove its description in
fsl-fman dt-bindings document.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Mon, 25 Jun 2018 12:37:08 +0000 (20:37 +0800)]
ptp: support DPAA FMan 1588 timer in ptp_qoriq
This patch is to support DPAA (Data Path Acceleration Architecture)
1588 timer by adding "fsl,fman-ptp-timer" compatible, sharing
interrupt with FMan, adding FSL_DPAA_ETH dependency, and fixing
up register offset.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Mon, 25 Jun 2018 12:37:07 +0000 (20:37 +0800)]
fsl/fman: share the event interrupt
This patch is to share fman event interrupt because
the 1588 timer driver will also use this interrupt.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Jun 2018 09:05:32 +0000 (18:05 +0900)]
Merge branch 'mlxsw-Support-bridge-router-interfaces-with-non-default-VLAN'
Ido Schimmel says:
====================
mlxsw: Support bridge router interfaces with non-default VLAN
Petr says:
When traffic is inserted on a router interface associated with an 802.1q
bridge, the VLAN that the traffic appears on is determined by PVID of
the bridge device itself. However currently mlxsw always configures such
traffic to be forwarded to VLAN 1, regardless of the bridge PVID.
Fix the problem by modifying the FID-handling code to assign such
traffic not to FID that corresponds to VLAN 1, but to a FID that
corresponds to the configured PVID. Bail out if there is no PVID. This
is implemented in patches #1 and #2.
From that point on, also forbid any changes to bridge device PVID,
because such changes would not be reflected. This is implemented in
patches #3, #4 and #5.
Finally in patch #6, introduce tests that use bridge as a routed
interface, and test mlxsw in both the currently-supported scenario of
using PVID 1, and the newly-supported one of using a custom PVID.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 25 Jun 2018 07:48:18 +0000 (10:48 +0300)]
selftests: forwarding: Test routed bridge interface
Add test for cases where bridge itself acts as a router interface, with
front panel port attached to the bridge in question.
In the first test (router_bridge.sh), VLAN memberships are not
configured in any way, and everything uses default PVID of 1. Thus
traffic in $h1 and $h2 is untagged. This test ensures that the previous
patches didn't break a currently working scenario.
In the second test (router_bridge_vlan.sh), a VLAN 555 pvid untagged is
added to the bridge CPU port, with that VLAN leaving the bridge tagged
through its sole member port. The traffic is therefore expected to come
out tagged at $h1. This tests the fix introduced in the previous
patches.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 25 Jun 2018 07:48:17 +0000 (10:48 +0300)]
mlxsw: spectrum_switchdev: Ban PVID change if bridge has a RIF
When traffic passes through a router port, it needs to be assigned a FID
for ASIC to forward correctly. For bridges, this FID used to be the one
corresponding to VLAN 1. In a previous patch, this was changed to
instead use the PVID at the time that the RIF is created. This patch
guards PVID changes after the RIF was introduced.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 25 Jun 2018 07:48:16 +0000 (10:48 +0300)]
mlxsw: spectrum_router: Add mlxsw_sp_rif_fid()
In order to allow querying of the VID for which a RIF was created, add
a new function that returns a FID for a given RIF.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 25 Jun 2018 07:48:15 +0000 (10:48 +0300)]
mlxsw: spectrum_router: Publish mlxsw_sp_rif_find_by_dev()
In order to guard against removal of a PVID for which a FID was
allocated, spectrum_switchdev needs to first determine whether there is
a RIF associated with a given bridge. To that end, publish a preexisting
function mlxsw_sp_rif_find_by_dev().
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 25 Jun 2018 07:48:14 +0000 (10:48 +0300)]
mlxsw: spectrum_router: Allocate FID according to PVID
For bridge netdevices, instead of assuming that the router traffic is on
VLAN 1, look at the bridge PVID.
This patch assumes that the PVID doesn't change after the router
interface is created (i.e. after the IP address is assigned).
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 25 Jun 2018 07:48:13 +0000 (10:48 +0300)]
mlxsw: spectrum_router: Propagate extack to .fid_get()
In the follow-up patch, mlxsw_sp_rif_vlan_fid_get() will be changed in a
way that could fail. Give that function a possibility to explain the
failure through extack.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yafang Shao [Sun, 24 Jun 2018 14:02:54 +0000 (10:02 -0400)]
tcp: add SNMP counter for zero-window drops
It will be helpful if we could display the drops due to zero window or no
enough window space.
So a new SNMP MIB entry is added to track this behavior.
This entry is named LINUX_MIB_TCPZEROWINDOWDROP and published in
/proc/net/netstat in TcpExt line as TCPZeroWindowDrop.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Jun 2018 02:33:04 +0000 (11:33 +0900)]
Merge branch 'NAPI-gro-hash'
Convert GRO receive over to hash table.
When many parallel flows are present and being received on the same
RX queue, GRO processing can become expensive because each incoming
frame must traverse the per-NAPI GRO list at each protocol layer
of GRO receive (eth --> ipv{4,6} --> tcp).
Use the already computed hash to chain these SKBs in a hash table
instead of a simple list.
The first patch makes the GRO list a true list_head.
The second patch implements the hash table.
This series patches basic testing and I added some diagnostics
to make sure we really were aggregating GRO frames :-)
Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller [Sun, 24 Jun 2018 05:14:02 +0000 (14:14 +0900)]
net: Convert NAPI gro list into a small hash table.
Improve the performance of GRO receive by splitting flows into
multiple hash chains.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller [Sun, 24 Jun 2018 05:13:49 +0000 (14:13 +0900)]
net: Convert GRO SKB handling to list_head.
Manage pending per-NAPI GRO packets via list_head.
Return an SKB pointer from the GRO receive handlers. When GRO receive
handlers return non-NULL, it means that this SKB needs to be completed
at this time and removed from the NAPI queue.
Several operations are greatly simplified by this transformation,
especially timing out the oldest SKB in the list when gro_count
exceeds MAX_GRO_SKBS, and napi_gro_flush() which walks the queue
in reverse order.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 25 Jun 2018 23:07:17 +0000 (08:07 +0900)]
Merge ra./pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Mon, 25 Jun 2018 07:58:17 +0000 (15:58 +0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix netpoll OOPS in r8169, from Ville Syrjälä.
2) Fix bpf instruction alignment on powerpc et al., from Eric Dumazet.
3) Don't ignore IFLA_MTU attribute when creating new ipvlan links. From
Xin Long.
4) Fix use after free in AF_PACKET, from Eric Dumazet.
5) Mis-matched RTNL unlock in xen-netfront, from Ross Lagerwall.
6) Fix VSOCK loopback on big-endian, from Claudio Imbrenda.
7) Missing RX buffer offset correction when computing DMA addresses in
mvneta driver, from Antoine Tenart.
8) Fix crashes in DCCP's ccid3_hc_rx_send_feedback, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
sfc: make function efx_rps_hash_bucket static
strparser: Corrected typo in documentation.
qmi_wwan: add support for the Dell Wireless 5821e module
cxgb4: when disabling dcb set txq dcb priority to 0
net_sched: remove a bogus warning in hfsc
net: dccp: switch rx_tstamp_last_feedback to monotonic clock
net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
net: Remove depends on HAS_DMA in case of platform dependency
MAINTAINERS: Add file patterns for dsa device tree bindings
net: mscc: make sparse happy
net: mvneta: fix the Rx desc DMA address in the Rx path
Documentation: e1000: Fix docs build error
Documentation: e100: Fix docs build error
Documentation: e1000: Use correct heading adornment
Documentation: e100: Use correct heading adornment
ipv6: mcast: fix unsolicited report interval after receiving querys
vhost_net: validate sock before trying to put its fd
VSOCK: fix loopback on big-endian systems
net: ethernet: ti: davinci_cpdma: make function cpdma_desc_pool_create static
xen-netfront: Update features after registering netdev
...
David S. Miller [Mon, 25 Jun 2018 07:21:52 +0000 (16:21 +0900)]
Merge branch 'r8169-improve-PHY-initialization-and-WoL-handling'
Heiner Kallweit says:
====================
r8169: improve PHY initialization and WoL handling
Series with smaller improvements regarding PHY initialization and
WoL handling.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 24 Jun 2018 16:40:23 +0000 (18:40 +0200)]
r8169: don't check WoL when powering down PHY and interface is down
We can power down the PHY irregardless of WOL settings if interface
is down. So far we would have left the PHY enabled if WOL options
are set and the interface is brought down.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 24 Jun 2018 16:39:06 +0000 (18:39 +0200)]
r8169: improve saved_wolopts handling
Let's make saved_wolopts a shadow copy of the WoL options. This allows
to simplify the code and get rid of calls to now unneeded function
__rtl8169_get_wol(). However don't remove __rtl8169_get_wol()
completely to be prepared for the case that we can respect BIOS WOL
settings again.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 24 Jun 2018 16:37:36 +0000 (18:37 +0200)]
r8169: improve phy initialization when resuming
Let's move calling rtl8169_init_phy() to __rtl8169_resume().
It simplifies the code and avoids rtl8169_init_phy() being called
when resuming whilst interface is down. rtl_open() will initialize
the PHY when the interface is brought up.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 25 Jun 2018 07:14:04 +0000 (16:14 +0900)]
Merge branch 'sched-couple-of-ndo_setup_tc-fixes-and-adjustments'
Jiri Pirko says:
====================
net: sched: couple of ndo_setup_tc fixes and adjustments
This patchset includes couple of patches that fix or adjust default
cases and return values in ndo_setup_tc implementations in drivers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sun, 24 Jun 2018 08:38:39 +0000 (10:38 +0200)]
cls_flower: fix error values for commands not supported by drivers
-EOPNOTSUPP is the error value that should be reported if a flower
command is not supported by a driver. Fix it in couple of Intel drivers.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sun, 24 Jun 2018 08:38:38 +0000 (10:38 +0200)]
nfp: handle cls_flower command default case
Currently the default case is not handled, which with future command
introductions would introduce a warning. So handle it.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sun, 24 Jun 2018 08:38:37 +0000 (10:38 +0200)]
bnxt: simplify cls_flower command switch and handle default case
Currently the default case is not handled, which with future command
introductions would introduce a warning. So handle it and make the
switch a bit simplier removing unneeded "rc" variable.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vakul Garg [Sun, 24 Jun 2018 20:07:50 +0000 (01:37 +0530)]
tls: Removed unused variable
Removed unused variable 'rxm' from tls_queue().
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Sun, 24 Jun 2018 10:57:31 +0000 (11:57 +0100)]
sfc: make function efx_rps_hash_bucket static
The function efx_rps_hash_bucket is local to the source and
does not need to be in global scope, so make it static.
Cleans up sparse warning:
symbol 'efx_rps_hash_bucket' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 24 Jun 2018 12:54:29 +0000 (20:54 +0800)]
Linux 4.18-rc2
Linus Torvalds [Sun, 24 Jun 2018 12:29:15 +0000 (20:29 +0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"A pile of perf updates:
Kernel side:
- Remove an incorrect warning in uprobe_init_insn() when
insn_get_length() fails. The error return code is handled at the
call site.
- Move the inline keyword to the right place in the perf ringbuffer
code to address a W=1 build warning.
Tooling:
perf stat:
- Fix metric column header display alignment
- Improve error messages for default attributes, providing better
output for error in command line.
- Add --interval-clear option, to provide a 'watch' like printing
perf script:
- Show hw-cache events too
perf c2c:
- Fix data dependency problem in layout of 'struct c2c_hist_entry'
Core:
- Do not blindly assume that 'struct perf_evsel' can be obtained via
a straight forward container_of() as there are call sites which
hand in a plain 'struct hist' which is not part of a container.
- Fix error index in the PMU event parser, so that error messages can
point to the problematic token"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Move the inline keyword at the beginning of the function declaration
uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
perf script: Show hw-cache events
perf c2c: Keep struct hist_entry at the end of struct c2c_hist_entry
perf stat: Add event parsing error handling to add_default_attributes
perf stat: Allow to specify specific metric column len
perf stat: Fix metric column header display alignment
perf stat: Use only color_fprintf call in print_metric_only
perf stat: Add --interval-clear option
perf tools: Fix error index for pmu event parser
perf hists: Reimplement hists__has_callchains()
perf hists browser gtk: Use hist_entry__has_callchains()
perf hists: Make hist_entry__has_callchains() work with 'perf c2c'
perf hists: Save the callchain_size in struct hist_entry
Linus Torvalds [Sun, 24 Jun 2018 12:18:19 +0000 (20:18 +0800)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull rseq fixes from Thomas Gleixer:
"A pile of rseq related fixups:
- Prevent infinite recursion when delivering SIGSEGV
- Remove the abort of rseq critical section on fork() as syscalls
inside rseq critical sections are explicitely forbidden. So no
point in doing the abort on the child.
- Align the rseq structure on 32 bytes in the ARM selftest code.
- Fix file permissions of the test script"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Avoid infinite recursion when delivering SIGSEGV
rseq/cleanup: Do not abort rseq c.s. in child on fork()
rseq/selftests/arm: Align 'struct rseq_cs' on 32 bytes
rseq/selftests: Make run_param_test.sh executable
Linus Torvalds [Sun, 24 Jun 2018 12:16:17 +0000 (20:16 +0800)]
Merge branch 'efi-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull EFI fixes from Thomas Gleixner:
"Two fixlets for the EFI maze:
- Properly zero variables to prevent an early boot hang on EFI mixed
mode systems
- Fix the fallout of merging the 32bit and 64bit variants of EFI PCI
related code which ended up chosing the 32bit variant of the actual
EFi call invocation which leads to failures on 64bit"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/x86: Fix incorrect invocation of PciIo->Attributes()
efi/libstub/tpm: Initialize efi_physical_addr_t vars to zero for mixed mode
Linus Torvalds [Sun, 24 Jun 2018 12:06:42 +0000 (20:06 +0800)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
"Two tiny fixes:
- Add the missing machine_real_restart() to objtools noreturn list so
it stops complaining
- Fix a trivial comment typo"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kernel.h: Fix a typo in comment
objtool: Add machine_real_restart() to the noreturn list
Linus Torvalds [Sun, 24 Jun 2018 11:59:52 +0000 (19:59 +0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Make Xen PV guest deal with speculative store bypass correctly
- Address more fallout from the 5-Level pagetable handling. Undo an
__initdata annotation to avoid section mismatch and malfunction
when post init code would touch the freed variable.
- Handle exception fixup in math_error() before calling notify_die().
The reverse call order incorrectly triggers notify_die() listeners
for soemthing which is handled correctly at the site which issues
the floating point instruction.
- Fix an off by one in the LLC topology calculation on AMD
- Handle non standard memory block sizes gracefully un UV platforms
- Plug a memory leak in the microcode loader
- Sanitize the purgatory build magic
- Add the x86 specific device tree bindings directory to the x86
MAINTAINER file patterns"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix 'no5lvl' handling
Revert "x86/mm: Mark __pgtable_l5_enabled __initdata"
x86/CPU/AMD: Fix LLC ID bit-shift calculation
MAINTAINERS: Add file patterns for x86 device tree bindings
x86/microcode/intel: Fix memleak in save_microcode_patch()
x86/platform/UV: Add kernel parameter to set memory block size
x86/platform/UV: Use new set memory block size function
x86/platform/UV: Add adjustable set memory block size function
x86/build: Remove unnecessary preparation for purgatory
Revert "kexec/purgatory: Add clean-up for purgatory directory"
x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
x86: Call fixup_exception() before notify_die() in math_error()
Linus Torvalds [Sun, 24 Jun 2018 11:48:30 +0000 (19:48 +0800)]
Merge branch 'x86-pti-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 pti fixes from Thomas Gleixner:
"Two small updates for the speculative distractions:
- Make it more clear to the compiler that array_index_mask_nospec()
is not subject for optimizations. It's not perfect, but ...
- Don't report XEN PV guests as vulnerable because their mitigation
state depends on the hypervisor. Report unknown and refer to the
hypervisor requirement"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()
x86/pti: Don't report XenPV as vulnerable
Linus Torvalds [Sun, 24 Jun 2018 11:36:16 +0000 (19:36 +0800)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"A set of fixes and updates for the locking code:
- Prevent lockdep from updating irq state within its own code and
thereby confusing itself.
- Buid fix for older GCCs which mistreat anonymous unions
- Add a missing lockdep annotation in down_read_non_onwer() which
causes up_read_non_owner() to emit a lockdep splat
- Remove the custom alpha dec_and_lock() implementation which is
incorrect in terms of ordering and use the generic one.
The remaining two commits are not strictly fixes. They provide irqsave
variants of atomic_dec_and_lock() and refcount_dec_and_lock(). These
are required to merge the relevant updates and cleanups into different
maintainer trees for 4.19, so routing them into mainline without
actual users is the sanest approach.
They should have been in -rc1, but last weekend I took the liberty to
just avoid computers in order to regain some mental sanity"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/qspinlock: Fix build for anonymous union in older GCC compilers
locking/lockdep: Do not record IRQ state within lockdep code
locking/rwsem: Fix up_read_non_owner() warning with DEBUG_RWSEMS
locking/refcounts: Implement refcount_dec_and_lock_irqsave()
atomic: Add irqsave variant of atomic_dec_and_lock()
alpha: Remove custom dec_and_lock() implementation
Linus Torvalds [Sun, 24 Jun 2018 11:22:19 +0000 (19:22 +0800)]
Merge branch 'ras-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull ras fixes from Thomas Gleixner:
"A set of fixes for RAS/MCE:
- Improve the error message when the kernel cannot recover from a MCE
so the maximum amount of information gets provided.
- Individually check MCE recovery features on SkyLake CPUs instead of
assuming none when the CAPID0 register does not advertise the
general ability for recovery.
- Prevent MCE to output inconsistent messages which first show an
error location and then claim that the source is unknown.
- Prevent overwriting MCi_STATUS in the attempt to gather more
information when a fatal MCE has alreay been detected. This leads
to empty status values in the printout and failing to react
promptly on the fatal event"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Fix incorrect "Machine check from unknown source" message
x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
x86/mce: Check for alternate indication of machine check recovery on Skylake
x86/mce: Improve error message when kernel cannot recover
Linus Torvalds [Sun, 24 Jun 2018 11:16:42 +0000 (19:16 +0800)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A small set of fixes for time(r) related issues:
- Fix a long standing conversion issue in jiffies_to_msecs() for odd
HZ values like 1024 or 1200 which resulted in returning 0 for small
jiffies values due to rounding down.
- Use the proper CONFIG symbol in the new Y2038 safe compat code for
posix-timers. Not yet a visible breakage, but this will immediately
trigger when the architecture support for the new interfaces is
merged.
- Return an error code in the STM32 clocksource driver on failure
instead of success.
- Remove the redundant and stale irq disabled check in the posix cpu
timer code. The check is at the wrong place anyway and lockdep
already covers it via the sighand lock locking coverage"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Make sure jiffies_to_msecs() preserves non-zero time periods
posix-timers: Fix nanosleep_copyout() for CONFIG_COMPAT_32BIT_TIME
clocksource/drivers/stm32: Fix error return code
posix-cpu-timers: Remove lockdep_assert_irqs_disabled()