platform/kernel/linux-starfive.git
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Wed, 8 May 2019 00:22:09 +0000 (17:22 -0700)]
Merge git://git./linux/kernel/git/davem/net

Minor conflict with the DSA legacy code removal.

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: Fix error path in cxgb4_init_module
YueHaibing [Mon, 6 May 2019 15:57:54 +0000 (23:57 +0800)]
cxgb4: Fix error path in cxgb4_init_module

BUG: unable to handle kernel paging request at ffffffffa016a270
PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bbd067 PTE 0
Oops: 0000 [#1
CPU: 0 PID: 6134 Comm: modprobe Not tainted 5.1.0+ #33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:atomic_notifier_chain_register+0x24/0x60
Code: 1f 80 00 00 00 00 55 48 89 e5 41 54 49 89 f4 53 48 89 fb e8 ae b4 38 01 48 8b 53 38 48 8d 4b 38 48 85 d2 74 20 45 8b 44 24 10 <44> 3b 42 10 7e 08 eb 13 44 39 42 10 7c 0d 48 8d 4a 08 48 8b 52 08
RSP: 0018:ffffc90000e2bc60 EFLAGS: 00010086
RAX: 0000000000000292 RBX: ffffffff83467240 RCX: ffffffff83467278
RDX: ffffffffa016a260 RSI: ffffffff83752140 RDI: ffffffff83467240
RBP: ffffc90000e2bc70 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 00000000014fa61f R12: ffffffffa01c8260
R13: ffff888231091e00 R14: 0000000000000000 R15: ffffc90000e2be78
FS:  00007fbd8d7cd540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa016a270 CR3: 000000022c7e3000 CR4: 00000000000006f0
Call Trace:
 register_inet6addr_notifier+0x13/0x20
 cxgb4_init_module+0x6c/0x1000 [cxgb4
 ? 0xffffffffa01d7000
 do_one_initcall+0x6c/0x3cc
 ? do_init_module+0x22/0x1f1
 ? rcu_read_lock_sched_held+0x97/0xb0
 ? kmem_cache_alloc_trace+0x325/0x3b0
 do_init_module+0x5b/0x1f1
 load_module+0x1db1/0x2690
 ? m_show+0x1d0/0x1d0
 __do_sys_finit_module+0xc5/0xd0
 __x64_sys_finit_module+0x15/0x20
 do_syscall_64+0x6b/0x1d0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

If pci_register_driver fails, register inet6addr_notifier is
pointless. This patch fix the error path in cxgb4_init_module.

Fixes: b5a02f503caa ("cxgb4 : Update ipv6 address handling api")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: improve pause mode reporting in phy_print_status
Heiner Kallweit [Sun, 5 May 2019 17:03:51 +0000 (19:03 +0200)]
net: phy: improve pause mode reporting in phy_print_status

So far we report symmetric pause only, and we don't consider the local
pause capabilities. Let's properly consider local and remote
capabilities, and report also asymmetric pause.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: net: Fix a typo in the phy-mode list for ethernet bindings
Maxime Chevallier [Tue, 7 May 2019 15:35:55 +0000 (17:35 +0200)]
dt-bindings: net: Fix a typo in the phy-mode list for ethernet bindings

The phy_mode "2000base-x" is actually supposed to be "1000base-x", even
though the commit title of the original patch says otherwise.

Fixes: 55601a880690 ("net: phy: Add 2000base-x, 2500base-x and rxaui modes")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: macb: Change interrupt and napi enable order in open
Harini Katakam [Tue, 7 May 2019 14:29:10 +0000 (19:59 +0530)]
net: macb: Change interrupt and napi enable order in open

Current order in open:
-> Enable interrupts (macb_init_hw)
-> Enable NAPI
-> Start PHY

Sequence of RX handling:
-> RX interrupt occurs
-> Interrupt is cleared and interrupt bits disabled in handler
-> NAPI is scheduled
-> In NAPI, RX budget is processed and RX interrupts are re-enabled

With the above, on QEMU or fixed link setups (where PHY state doesn't
matter), there's a chance macb RX interrupt occurs before NAPI is
enabled. This will result in NAPI being scheduled before it is enabled.
Fix this macb open by changing the order.

Fixes: ae1f2a56d273 ("net: macb: Added support for many RX queues")
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ll_temac: Improve error message on error IRQ
Esben Haabendal [Tue, 7 May 2019 06:42:57 +0000 (08:42 +0200)]
net: ll_temac: Improve error message on error IRQ

The channel status register value can be very helpful when debugging
SDMA problems.

Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: remove block pointer from common offload structure
Pieter Jansen van Vuuren [Tue, 7 May 2019 00:24:21 +0000 (17:24 -0700)]
net/sched: remove block pointer from common offload structure

Based on feedback from Jiri avoid carrying a pointer to the tcf_block
structure in the tc_cls_common_offload structure. Instead store
a flag in driver private data which indicates if offloads apply
to a shared block at block binding time.

Suggested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'of_get_mac_address-ERR_PTR-fixes'
David S. Miller [Tue, 7 May 2019 19:22:47 +0000 (12:22 -0700)]
Merge branch 'of_get_mac_address-ERR_PTR-fixes'

Petr Štetiar says:

====================
of_get_mac_address ERR_PTR fixes

this patch series is an attempt to fix the mess, I've somehow managed to
introduce.

First patch in this series is defacto v5 of the previous 05/10 patch in the
series, but since the v4 of this 05/10 patch wasn't picked up by the
patchwork for some unknown reason, this patch wasn't applied with the other
9 patches in the series, so I'm resending it as a separate patch of this
fixup series again.

Second patch is a result of this rebase against net-next tree, where I was
checking again all current users of of_get_mac_address and found out, that
there's new one in DSA, so I've converted this user to the new ERR_PTR
encoded error value as well.

Third patch which was sent as v5 wasn't considered for merge, but I still
think, that we need to check for possible NULL value, thus current IS_ERR
check isn't sufficient and we need to use IS_ERR_OR_NULL instead.

Fourth patch fixes warning reported by kbuild test robot.
====================

Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: support of_get_mac_address new ERR_PTR error
Petr Štetiar [Mon, 6 May 2019 21:27:04 +0000 (23:27 +0200)]
net: ethernet: support of_get_mac_address new ERR_PTR error

There was NVMEM support added to of_get_mac_address, so it could now
return ERR_PTR encoded error values, so we need to adjust all current
users of of_get_mac_address to this new fact.

While at it, remove superfluous is_valid_ether_addr as the MAC address
returned from of_get_mac_address is always valid and checked by
is_valid_ether_addr anyway.

Fixes: d01f449c008a ("of_net: add NVMEM support to of_get_mac_address")
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: usb: smsc: fix warning reported by kbuild test robot
Petr Štetiar [Mon, 6 May 2019 21:24:47 +0000 (23:24 +0200)]
net: usb: smsc: fix warning reported by kbuild test robot

This patch fixes following warning reported by kbuild test robot:

 In function ‘memcpy’,
     inlined from ‘smsc75xx_init_mac_address’ at drivers/net/usb/smsc75xx.c:778:3,
     inlined from ‘smsc75xx_bind’ at drivers/net/usb/smsc75xx.c:1501:2:
 ./include/linux/string.h:355:9: warning: argument 2 null where non-null expected [-Wnonnull]
   return __builtin_memcpy(p, q, size);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
 drivers/net/usb/smsc75xx.c: In function ‘smsc75xx_bind’:
 ./include/linux/string.h:355:9: note: in a call to built-in function ‘__builtin_memcpy’

I've replaced the offending memcpy with ether_addr_copy, because I'm
100% sure, that of_get_mac_address can't return NULL as it returns valid
pointer or ERR_PTR encoded value, nothing else.

I'm hesitant to just change IS_ERR into IS_ERR_OR_NULL check, as this
would make the warning disappear also, but it would be confusing to
check for impossible return value just to make a compiler happy.

Fixes: adfb3cb2c52e ("net: usb: support of_get_mac_address new ERR_PTR error")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Reviewed-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agostaging: octeon-ethernet: Fix of_get_mac_address ERR_PTR check
Petr Štetiar [Mon, 6 May 2019 21:24:46 +0000 (23:24 +0200)]
staging: octeon-ethernet: Fix of_get_mac_address ERR_PTR check

Commit 284eb160681c ("staging: octeon-ethernet: support
of_get_mac_address new ERR_PTR error") has introduced checking for
ERR_PTR encoded error value from of_get_mac_address with IS_ERR macro,
which is not sufficient in this case, as the mac variable is set to NULL
initialy and if the kernel is compiled without DT support this NULL
would get passed to IS_ERR, which would lead to the wrong decision and
would pass that NULL pointer and invalid MAC address further.

Fixes: 284eb160681c ("staging: octeon-ethernet: support of_get_mac_address new ERR_PTR error")
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: support of_get_mac_address new ERR_PTR error
Petr Štetiar [Mon, 6 May 2019 21:24:45 +0000 (23:24 +0200)]
net: dsa: support of_get_mac_address new ERR_PTR error

There was NVMEM support added to of_get_mac_address, so it could now
return ERR_PTR encoded error values, so we need to adjust all current
users of of_get_mac_address to this new fact.

While at it, remove superfluous is_valid_ether_addr as the MAC address
returned from of_get_mac_address is always valid and checked by
is_valid_ether_addr anyway.

Fixes: d01f449c008a ("of_net: add NVMEM support to of_get_mac_address")
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: sja1105: Fix status initialization in sja1105_get_ethtool_stats
Nathan Chancellor [Mon, 6 May 2019 20:24:47 +0000 (13:24 -0700)]
net: dsa: sja1105: Fix status initialization in sja1105_get_ethtool_stats

Clang warns:

drivers/net/dsa/sja1105/sja1105_ethtool.c:316:39: warning: suggest
braces around initialization of subobject [-Wmissing-braces]
        struct sja1105_port_status status = {0};
                                             ^
                                             {}
1 warning generated.

One way to fix these warnings is to add additional braces like Clang
suggests; however, there has been a bit of push back from some
maintainers[1][2], who just prefer memset as it is unambiguous, doesn't
depend on a particular compiler version[3], and properly initializes all
subobjects. Do that here so there are no more warnings.

[1]: https://lore.kernel.org/lkml/022e41c0-8465-dc7a-a45c-64187ecd9684@amd.com/
[2]: https://lore.kernel.org/lkml/20181128.215241.702406654469517539.davem@davemloft.net/
[3]: https://lore.kernel.org/lkml/20181116150432.2408a075@redhat.com/

Fixes: 52c34e6e125c ("net: dsa: sja1105: Add support for ethtool port counters")
Link: https://github.com/ClangBuiltLinux/linux/issues/471
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovrf: sit mtu should not be updated when vrf netdev is the link
Stephen Suryaputra [Mon, 6 May 2019 19:00:01 +0000 (15:00 -0400)]
vrf: sit mtu should not be updated when vrf netdev is the link

VRF netdev mtu isn't typically set and have an mtu of 65536. When the
link of a tunnel is set, the tunnel mtu is changed from 1480 to the link
mtu minus tunnel header. In the case of VRF netdev is the link, then the
tunnel mtu becomes 65516. So, fix it by not setting the tunnel mtu in
this case.

Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Fix error cleanup path in dsa_init_module
YueHaibing [Mon, 6 May 2019 15:25:29 +0000 (23:25 +0800)]
net: dsa: Fix error cleanup path in dsa_init_module

BUG: unable to handle kernel paging request at ffffffffa01c5430
PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bc5067 PTE 0
Oops: 0000 [#1
CPU: 0 PID: 6159 Comm: modprobe Not tainted 5.1.0+ #33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:raw_notifier_chain_register+0x16/0x40
Code: 63 f8 66 90 e9 5d ff ff ff 90 90 90 90 90 90 90 90 90 90 90 55 48 8b 07 48 89 e5 48 85 c0 74 1c 8b 56 10 3b 50 10 7e 07 eb 12 <39> 50 10 7c 0d 48 8d 78 08 48 8b 40 08 48 85 c0 75 ee 48 89 46 08
RSP: 0018:ffffc90001c33c08 EFLAGS: 00010282
RAX: ffffffffa01c5420 RBX: ffffffffa01db420 RCX: 4fcef45928070a8b
RDX: 0000000000000000 RSI: ffffffffa01db420 RDI: ffffffffa01b0068
RBP: ffffc90001c33c08 R08: 000000003e0a33d0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000094443661 R12: ffff88822c320700
R13: ffff88823109be80 R14: 0000000000000000 R15: ffffc90001c33e78
FS:  00007fab8bd08540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa01c5430 CR3: 00000002297ea000 CR4: 00000000000006f0
Call Trace:
 register_netdevice_notifier+0x43/0x250
 ? 0xffffffffa01e0000
 dsa_slave_register_notifier+0x13/0x70 [dsa_core
 ? 0xffffffffa01e0000
 dsa_init_module+0x2e/0x1000 [dsa_core
 do_one_initcall+0x6c/0x3cc
 ? do_init_module+0x22/0x1f1
 ? rcu_read_lock_sched_held+0x97/0xb0
 ? kmem_cache_alloc_trace+0x325/0x3b0
 do_init_module+0x5b/0x1f1
 load_module+0x1db1/0x2690
 ? m_show+0x1d0/0x1d0
 __do_sys_finit_module+0xc5/0xd0
 __x64_sys_finit_module+0x15/0x20
 do_syscall_64+0x6b/0x1d0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Cleanup allocated resourses if there are errors,
otherwise it will trgger memleak.

Fixes: c9eb3e0f8701 ("net: dsa: Add support for learning FDB through notification")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agol2tp: Fix possible NULL pointer dereference
YueHaibing [Mon, 6 May 2019 14:44:04 +0000 (22:44 +0800)]
l2tp: Fix possible NULL pointer dereference

BUG: unable to handle kernel NULL pointer dereference at 0000000000000128
PGD 0 P4D 0
Oops: 0000 [#1
CPU: 0 PID: 5697 Comm: modprobe Tainted: G        W         5.1.0-rc7+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:__lock_acquire+0x53/0x10b0
Code: 8b 1c 25 40 5e 01 00 4c 8b 6d 10 45 85 e4 0f 84 bd 06 00 00 44 8b 1d 7c d2 09 02 49 89 fe 41 89 d2 45 85 db 0f 84 47 02 00 00 <48> 81 3f a0 05 70 83 b8 00 00 00 00 44 0f 44 c0 83 fe 01 0f 86 3a
RSP: 0018:ffffc90001c07a28 EFLAGS: 00010002
RAX: 0000000000000000 RBX: ffff88822f038440 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000128
RBP: ffffc90001c07a88 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000128 R15: 0000000000000000
FS:  00007fead0811540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000128 CR3: 00000002310da000 CR4: 00000000000006f0
Call Trace:
 ? __lock_acquire+0x24e/0x10b0
 lock_acquire+0xdf/0x230
 ? flush_workqueue+0x71/0x530
 flush_workqueue+0x97/0x530
 ? flush_workqueue+0x71/0x530
 l2tp_exit_net+0x170/0x2b0 [l2tp_core
 ? l2tp_exit_net+0x93/0x2b0 [l2tp_core
 ops_exit_list.isra.6+0x36/0x60
 unregister_pernet_operations+0xb8/0x110
 unregister_pernet_device+0x25/0x40
 l2tp_init+0x55/0x1000 [l2tp_core
 ? 0xffffffffa018d000
 do_one_initcall+0x6c/0x3cc
 ? do_init_module+0x22/0x1f1
 ? rcu_read_lock_sched_held+0x97/0xb0
 ? kmem_cache_alloc_trace+0x325/0x3b0
 do_init_module+0x5b/0x1f1
 load_module+0x1db1/0x2690
 ? m_show+0x1d0/0x1d0
 __do_sys_finit_module+0xc5/0xd0
 __x64_sys_finit_module+0x15/0x20
 do_syscall_64+0x6b/0x1d0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fead031a839
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffe8d9acca8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 0000560078398b80 RCX: 00007fead031a839
RDX: 0000000000000000 RSI: 000056007659dc2e RDI: 0000000000000003
RBP: 000056007659dc2e R08: 0000000000000000 R09: 0000560078398b80
R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
R13: 00005600783a04a0 R14: 0000000000040000 R15: 0000560078398b80
Modules linked in: l2tp_core(+) e1000 ip_tables ipv6 [last unloaded: l2tp_core
CR2: 0000000000000128
---[ end trace 8322b2b8bf83f8e1

If alloc_workqueue fails in l2tp_init, l2tp_net_ops
is unregistered on failure path. Then l2tp_exit_net
is called which will flush NULL workqueue, this patch
add a NULL check to fix it.

Fixes: 67e04c29ec0d ("l2tp: unregister l2tp_net_ops on failure path")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotaprio: add null check on sched_nest to avoid potential null pointer dereference
Colin Ian King [Sun, 5 May 2019 21:50:19 +0000 (22:50 +0100)]
taprio: add null check on sched_nest to avoid potential null pointer dereference

The call to nla_nest_start_noflag can return a null pointer and currently
this is not being checked and this can lead to a null pointer dereference
when the null pointer sched_nest is passed to function nla_nest_end. Fix
this by adding in a null pointer check.

Addresses-Coverity: ("Dereference null return value")
Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: fix less than zero check on a u32 variable
Colin Ian King [Sun, 5 May 2019 21:38:14 +0000 (22:38 +0100)]
net: mvpp2: cls: fix less than zero check on a u32 variable

The signed return from the call to mvpp2_cls_c2_port_flow_index is being
assigned to the u32 variable c2.index and then checked for a negative
error condition which is always going to be false. Fix this by assigning
the return to the int variable index and checking this instead.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 90b509b39ac9 ("net: mvpp2: cls: Add Classification offload support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'fc-quic-pacing'
David S. Miller [Tue, 7 May 2019 19:09:25 +0000 (12:09 -0700)]
Merge branch 'fc-quic-pacing'

Eric Dumazet says:

====================
net_sched: sch_fq: enable in-kernel pacing for QUIC servers

Willem added GSO support to UDP stack, greatly improving performance
of QUIC servers.

We also want to enable in-kernel pacing, which is possible thanks to EDT
model, since each sendmsg() can provide a timestamp for the skbs.

We have to change sch_fq to enable feeding packets in arbitrary EDT order,
and make sure that packet classification do not trust unconnected sockets.

Note that this patch series also is a prereq for a future TCP change
enabling per-flow delays/reorders/losses to implement high performance
TCP emulators.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet_sched: sch_fq: handle non connected flows
Eric Dumazet [Sat, 4 May 2019 23:48:54 +0000 (16:48 -0700)]
net_sched: sch_fq: handle non connected flows

FQ packet scheduler assumed that packets could be classified
based on their owning socket.

This means that if a UDP server uses one UDP socket to send
packets to different destinations, packets all land
in one FQ flow.

This is unfair, since each TCP flow has a unique bucket, meaning
that in case of pressure (fully utilised uplink), TCP flows
have more share of the bandwidth.

If we instead detect unconnected sockets, we can use a stochastic
hash based on the 4-tuple hash.

This also means a QUIC server using one UDP socket will properly
spread the outgoing packets to different buckets, and in-kernel
pacing based on EDT model will no longer risk having big rb-tree on
one flow.

Note that UDP application might provide the skb->hash in an
ancillary message at sendmsg() time to avoid the cost of a dissection
in fq packet scheduler.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet_sched: sch_fq: do not assume EDT packets are ordered
Eric Dumazet [Sat, 4 May 2019 23:48:53 +0000 (16:48 -0700)]
net_sched: sch_fq: do not assume EDT packets are ordered

TCP stack makes sure packets for a given flow are monotically
increasing, but we want to allow UDP packets to use EDT as
well, so that QUIC servers can use in-kernel pacing.

This patch adds a per-flow rb-tree on which packets might
be stored. We still try to use the linear list for the
typical cases where packets are queued with monotically
increasing skb->tstamp, since queue/dequeue packets on
a standard list is O(1).

Note that the ability to store packets in arbitrary EDT
order will allow us to implement later a per TCP socket
mechanism adding delays (with jitter eventually) and reorders,
to implement convenient network emulators.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'hns3-next'
David S. Miller [Tue, 7 May 2019 17:37:14 +0000 (10:37 -0700)]
Merge branch 'hns3-next'

Huazhong Tan says:

====================
cleanup & optimizations & bugfixes for HNS3 driver

This patchset contains some cleanup related to hns3_enet_ring
struct and tx bd filling process, optimizations related
to rx page reusing, barrier using and tso handling process,
bugfixes related to tunnel type handling and error handling for
desc filling.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: use devm_kcalloc when allocating desc_cb
Yunsheng Lin [Mon, 6 May 2019 02:48:52 +0000 (10:48 +0800)]
net: hns3: use devm_kcalloc when allocating desc_cb

This patch uses devm_kcalloc instead of kcalloc when allocating
ring->desc_cb, because devm_kcalloc not only ensure to free the
memory when the dev is deallocted, but also allocate the memory
from it's device memory node.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: some cleanup for struct hns3_enet_ring
Yunsheng Lin [Mon, 6 May 2019 02:48:51 +0000 (10:48 +0800)]
net: hns3: some cleanup for struct hns3_enet_ring

This patch removes some unused field in struct hns3_enet_ring,
use ring->dev for ring_to_dev macro, and use dev consistently
in hns3_fill_desc.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: unify the page reusing for page size 4K and 64K
Yunsheng Lin [Mon, 6 May 2019 02:48:50 +0000 (10:48 +0800)]
net: hns3: unify the page reusing for page size 4K and 64K

When page size is 64K, RX buffer is currently not reused when the
page_offset is moved to last buffer. This patch adds checking to
decide whether the buffer page can be reused when last_offset is
moved beyond last offset.

If the driver is the only user of page when page_offset is moved
to beyond last offset, then buffer can be reused and page_offset
is set to zero.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: optimize the barrier using when cleaning TX BD
Yunsheng Lin [Mon, 6 May 2019 02:48:49 +0000 (10:48 +0800)]
net: hns3: optimize the barrier using when cleaning TX BD

Currently, a barrier is used when cleaning each TX BD, which may
cause performance degradation.

This patch optimizes it to use one barrier when cleaning TX BD
each round.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: fix error handling for desc filling
Yunsheng Lin [Mon, 6 May 2019 02:48:48 +0000 (10:48 +0800)]
net: hns3: fix error handling for desc filling

When desc filling fails in hns3_nic_net_xmit, it will call
hns3_clear_desc to unmap the dma mapping. But currently the
ring->next_to_use points to the desc where the desc filling
or dma mapping return error, which means the desc that
ring->next_to_use points to has not done the dma mapping,
the desc that need unmapping is before the ring->next_to_use.

This patch fixes it by calling ring_ptr_move_bw(next_to_use)
before doing unmapping operation, and set desc_cb->dma to
zero to avoid freeing it again when unloading.

Also, when filling skb head or frag fails, both need to unmap
all the way back to next_to_use_head, so remove one desc filling
error handling.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: combine len and checksum handling for inner and outer header.
Yunsheng Lin [Mon, 6 May 2019 02:48:47 +0000 (10:48 +0800)]
net: hns3: combine len and checksum handling for inner and outer header.

When filling len and checksum info to description, there is some
similar checking or calculation.

So this patch adds hns3_set_l2l3l4 to fill the inner(/normal)
header's len and checksum info. If it is a encapsulation skb, it
calls hns3_set_outer_l2l3l4 to handle the outer header's len and
checksum info, in order to avoid some similar checking or
calculation.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: refactor BD filling for l2l3l4 info
Yunsheng Lin [Mon, 6 May 2019 02:48:46 +0000 (10:48 +0800)]
net: hns3: refactor BD filling for l2l3l4 info

This patch separates the inner and outer l2l3l4 len handling in
hns3_set_l2l3l4_len, this is a preparation to combine the l2l3l4
len and checksum handling for inner and outer header.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: fix for tunnel type handling in hns3_rx_checksum
Yunsheng Lin [Mon, 6 May 2019 02:48:45 +0000 (10:48 +0800)]
net: hns3: fix for tunnel type handling in hns3_rx_checksum

According to hardware user manual, the tunnel packet type is
available in the rx.ol_info field of struct hns3_desc. Currently
the tunnel packet type is decided by the rx.l234_info, which may
cause RX checksum handling error.

This patch fixes it by using the correct field in struct hns3_desc
to decide the tunnel packet type.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: add linearizing checking for TSO case
Yunsheng Lin [Mon, 6 May 2019 02:48:44 +0000 (10:48 +0800)]
net: hns3: add linearizing checking for TSO case

HW requires every continuous 8 buffer data to be larger than MSS,
we simplify it by ensuring skb_headlen + the first continuous
7 frags to to be larger than GSO header len + mss, and the
remaining continuous 7 frags to be larger than MSS except the
last 7 frags.

This patch adds hns3_skb_need_linearized to handle it for TSO
case.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: add counter for times RX pages gets allocated
Yunsheng Lin [Mon, 6 May 2019 02:48:43 +0000 (10:48 +0800)]
net: hns3: add counter for times RX pages gets allocated

Currently, using "ethtool --statistics" can show how many time RX
page have been reused, but there is no counter for RX page not
being reused.

This patch adds non_reuse_pg counter to better debug the performance
issue, because it is hard to determine when the RX page is reused
or not if there is no such counter.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: use napi_schedule_irqoff in hard interrupts handlers
Yunsheng Lin [Mon, 6 May 2019 02:48:42 +0000 (10:48 +0800)]
net: hns3: use napi_schedule_irqoff in hard interrupts handlers

napi_schedule_irqoff is introduced to be used from hard interrupts
handlers or when irqs are already masked, see:

https://lists.openwall.net/netdev/2014/10/29/2

So this patch replaces napi_schedule with napi_schedule_irqoff.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: unify maybe_stop_tx for TSO and non-TSO case
Yunsheng Lin [Mon, 6 May 2019 02:48:41 +0000 (10:48 +0800)]
net: hns3: unify maybe_stop_tx for TSO and non-TSO case

Currently, maybe_stop_tx ops for TSO and non-TSO case share some BD
calculation code, so this patch unifies the maybe_stop_tx by removing
the maybe_stop_tx ops. skb_is_gso() can be used to differentiate the
case between TSO and non-TSO case if there is need to handle special
case for TSO case.

This patch also add tx_copy field in "ethtool --statistics" to help
better debug the performance issue caused by calling skb_copy.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-dsa-lantiq-Add-bridge-offloading'
David S. Miller [Tue, 7 May 2019 17:34:45 +0000 (10:34 -0700)]
Merge branch 'net-dsa-lantiq-Add-bridge-offloading'

Hauke Mehrtens says:

====================
net: dsa: lantiq: Add bridge offloading

This adds bridge offloading for the Intel / Lantiq GSWIP 2.1 switch.

Changes since:
v2:
 - Added Fixes tag to patch 1
 - Fixed typo
 - added GSWIP_TABLE_MAC_BRIDGE_STATIC and made use of it
 - used GSWIP_TABLE_MAC_BRIDGE in more places

v1:
 - fix typo signle -> single
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: lantiq: Add Forwarding Database access
Hauke Mehrtens [Sun, 5 May 2019 22:25:10 +0000 (00:25 +0200)]
net: dsa: lantiq: Add Forwarding Database access

This adds functions to add and remove static entries to and from the
forwarding database and dump the full forwarding database.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: lantiq: Add fast age function
Hauke Mehrtens [Sun, 5 May 2019 22:25:09 +0000 (00:25 +0200)]
net: dsa: lantiq: Add fast age function

Fast aging per port is not supported directly by the hardware, it is
only possible to configure a global aging time.

Do the fast aging by iterating over the MAC forwarding table and remove
all dynamic entries for a given port.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: lantiq: Add VLAN aware bridge offloading
Hauke Mehrtens [Sun, 5 May 2019 22:25:08 +0000 (00:25 +0200)]
net: dsa: lantiq: Add VLAN aware bridge offloading

The VLAN aware bridge offloading is similar to the VLAN unaware
offloading, this makes it possible to offload the VLAN bridge
functionalities.

The hardware supports up to 64 VLAN bridge entries, we already use one
entry for each LAN port to prevent forwarding of packets between the
ports when the ports are not in a bridge, so in the end we have 57
possible VLANs.

The VLAN filtering is currently only active when the ports are in a
bridge, VLAN filtering for ports not in a bridge is not implemented.

It is currently not possible to change between VLAN filtering and not
filtering while the port is already in a bridge, this would make the
driver more complicated.

The VLANs are only defined on bridge entries, so we will not add
anything into the hardware when the port joins a bridge if it is doing
VLAN filtering, but only when an allowed VLAN is added.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: lantiq: Add VLAN unaware bridge offloading
Hauke Mehrtens [Sun, 5 May 2019 22:25:07 +0000 (00:25 +0200)]
net: dsa: lantiq: Add VLAN unaware bridge offloading

This allows to offload bridges with DSA to the switch hardware and do
the packet forwarding in hardware.

This implements generic functions to access the switch hardware tables,
which are used to control many features of the switch.

This patch activates the MAC learning by removing the MAC address table
lock, to prevent uncontrolled forwarding of packets between all the LAN
ports, they are added into individual bridge tables entries with
individual flow ids and the switch will do the MAC learning for each
port separately before they are added to a real bridge.

Each bridge consist of an entry in the active VLAN table and the VLAN
mapping table, table entries with the same index are matching. In the
VLAN unaware mode we configure everything with VLAN ID 0, but we use
different flow IDs, the switch should handle all VLANs as normal payload
and ignore them. When the hardware looks for the port of the destination
MAC address it only takes the entries which have the same flow ID of the
ingress packet.

The bridges are configured with 64 possible entries with these
information:
Table Index, 0...63
VLAN ID, 0...4095: VLAN ID 0 is untagged
flow ID, 0..63: Same flow IDs share entries in MAC learning table
port map, one bit for each port number
tagged port map, one bit for each port number

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: lantiq: Allow special tags only on CPU port
Hauke Mehrtens [Sun, 5 May 2019 22:25:06 +0000 (00:25 +0200)]
net: dsa: lantiq: Allow special tags only on CPU port

Allow the special tag in ingress only on the CPU port and not on all
ports. A packet with a special tag could circumvent the hardware
forwarding and should only be allowed on the CPU port where Linux
controls the port.

Fixes: 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200)"
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller [Tue, 7 May 2019 16:29:16 +0000 (09:29 -0700)]
Merge git://git./linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2019-05-06

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Two AF_XDP libbpf fixes for socket teardown; first one an invalid
   munmap and the other one an invalid skmap cleanup, both from Björn.

2) More graceful CONFIG_DEBUG_INFO_BTF handling when pahole is not
   present in the system to generate vmlinux btf info, from Andrii.

3) Fix libbpf and thus fix perf build error with uClibc on arc
   architecture, from Vineet.

4) Fix missing libbpf_util.h header install in libbpf, from William.

5) Exclude bash-completion/bpftool from .gitignore pattern, from Masahiro.

6) Fix up rlimit in test_libbpf_open kselftest test case, from Yonghong.

7) Minor misc cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
David S. Miller [Tue, 7 May 2019 16:25:43 +0000 (09:25 -0700)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2019-05-06

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Two x32 JIT fixes: one which has buggy signed comparisons in 64
   bit conditional jumps and another one for 64 bit negation, both
   from Wang.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agolibbpf: remove unnecessary cast-to-void
Björn Töpel [Mon, 6 May 2019 09:24:43 +0000 (11:24 +0200)]
libbpf: remove unnecessary cast-to-void

The patches with fixes tags added a cast-to-void in the places when
the return value of a function was ignored.

This is not common practice in the kernel, and is therefore removed in
this patch.

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 5750902a6e9b ("libbpf: proper XSKMAP cleanup")
Fixes: 0e6741f09297 ("libbpf: fix invalid munmap call")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agokbuild: tolerate missing pahole when generating BTF
Andrii Nakryiko [Mon, 6 May 2019 00:10:33 +0000 (17:10 -0700)]
kbuild: tolerate missing pahole when generating BTF

When BTF generation is enabled through CONFIG_DEBUG_INFO_BTF,
scripts/link-vmlinux.sh detects if pahole version is too old and
gracefully continues build process, skipping BTF generation build step.
But if pahole is not available, build will still fail. This patch adds
check for whether pahole exists at all and bails out gracefully, if not.

Cc: Alexei Starovoitov <ast@fb.com>
Reported-by: Yonghong Song <yhs@fb.com>
Fixes: e83b9f55448a ("kbuild: add ability to generate BTF type info for vmlinux")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoMerge branch 'r8169-replace-some-magic-with-more-speaking-functions'
David S. Miller [Mon, 6 May 2019 04:58:36 +0000 (21:58 -0700)]
Merge branch 'r8169-replace-some-magic-with-more-speaking-functions'

Heiner Kallweit says:

====================
r8169: replace some magic with more speaking functions

Based on info from Realtek replace some magic with speaking functions
even though the exact meaning of certain values isn't known.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: add rtl8168g_set_pause_thresholds
Heiner Kallweit [Sun, 5 May 2019 10:34:25 +0000 (12:34 +0200)]
r8169: add rtl8168g_set_pause_thresholds

Based on info from Realtek add a function for defining the thresholds
controlling ethernet flow control.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: add rtl_set_fifo_size
Heiner Kallweit [Sun, 5 May 2019 10:33:40 +0000 (12:33 +0200)]
r8169: add rtl_set_fifo_size

Based on info from Realtek replace FIFO size config magic with
a function.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'mlxsw-spectrum-Implement-loopback-ethtool-feature'
David S. Miller [Mon, 6 May 2019 04:56:57 +0000 (21:56 -0700)]
Merge branch 'mlxsw-spectrum-Implement-loopback-ethtool-feature'

Ido Schimmel says:

====================
mlxsw: spectrum: Implement loopback ethtool feature

This patchset from Jiri allows users to enable loopback feature for
individual ports using ethtool. The loopback feature is useful for
testing purposes and will also be used by upcoming patchsets to enable
the monitoring of buffer drops.

Patch #1 adds the relevant device register.

Patch #2 Implements support in the driver.

Patch #3 adds a selftest.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: Add loopback test
Jiri Pirko [Sun, 5 May 2019 06:48:07 +0000 (09:48 +0300)]
selftests: Add loopback test

Add selftest for loopback feature

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum: Implement loopback ethtool feature
Jiri Pirko [Sun, 5 May 2019 06:48:06 +0000 (09:48 +0300)]
mlxsw: spectrum: Implement loopback ethtool feature

Allow user to enable loopback feature for individual ports using ethtool.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: reg: Add Port Physical Loopback Register
Jiri Pirko [Sun, 5 May 2019 06:48:05 +0000 (09:48 +0300)]
mlxsw: reg: Add Port Physical Loopback Register

The PPLR register allows configuration of the port's loopback mode.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosch_htb: redefine htb qdisc overlimits
Cong Wang [Sat, 4 May 2019 18:43:42 +0000 (11:43 -0700)]
sch_htb: redefine htb qdisc overlimits

In commit 3c75f6ee139d ("net_sched: sch_htb: add per class overlimits counter")
we added an overlimits counter for each HTB class which could
properly reflect how many times we use up all the bandwidth
on each class. However, the overlimits counter in HTB qdisc
does not, it is way bigger than the sum of each HTB class.
In fact, this qdisc overlimits counter increases when we have
no skb to dequeue, which happens more often than we run out of
bandwidth.

It makes more sense to make this qdisc overlimits counter just
be a sum of each HTB class, in case people still get confused.

I have verified this patch with one single HTB class, where HTB
qdisc counters now always match HTB class counters as expected.

Eric suggested we could fold this field into 'direct_pkts' as
we only use its 32bit on 64bit CPU, this saves one cache line.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: move EEE LED config to rtl8168_config_eee_mac
Heiner Kallweit [Sat, 4 May 2019 15:13:09 +0000 (17:13 +0200)]
r8169: move EEE LED config to rtl8168_config_eee_mac

Move adjusting the EEE LED frequency to rtl8168_config_eee_mac.
Exclude RTL8411 (version 38) like in the existing code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: simplify rtl_writephy_batch and rtl_ephy_init
Heiner Kallweit [Sat, 4 May 2019 14:57:49 +0000 (16:57 +0200)]
r8169: simplify rtl_writephy_batch and rtl_ephy_init

Make both functions macros to allow omitting the ARRAY_SIZE(x) argument.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'Traffic-support-for-SJA1105-DSA-driver'
David S. Miller [Mon, 6 May 2019 04:52:42 +0000 (21:52 -0700)]
Merge branch 'Traffic-support-for-SJA1105-DSA-driver'

Vladimir Oltean says:

====================
Traffic support for SJA1105 DSA driver

This patch set is a continuation of the "NXP SJA1105 DSA driver" v3
series, which was split in multiple pieces for easier review.

Supporting a fully-featured (traffic-capable) driver for this switch
requires some rework in DSA and also leaves behind a more generic
infrastructure for other dumb switches that rely on 802.1Q pseudo-switch
tagging for port separation. Among the DSA changes required are:

* Generic xmit and rcv functions for pushing/popping 802.1Q tags on
  skb's. These are modeled as a tagging protocol of its own but which
  must be customized by drivers to fit their own hardware possibilities.

* Permitting the .setup callback to invoke switchdev operations that
  will loop back into the driver through the switchdev notifier chain.

The SJA1105 driver then proceeds to extend this 8021q switch tagging
protocol while adding its own (tag_sja1105). This is done because
the driver actually implements a "dual tagger":

* For normal traffic it uses 802.1Q tags

* For management (multicast DMAC) frames the switch has native support
  for recognizing and annotating these with source port and switch id
  information.

Because this is a "dual tagger", decoding of management frames should
still function when regular traffic can't (under a bridge with VLAN
filtering).
There was intervention in the DSA receive hotpath, where a new
filtering function called from eth_type_trans() is needed. This is
useful in the general sense for switches that might actually have some
limited means of source port decoding, such as only for management
traffic, but not for everything.
In order for the 802.1Q tagging protocol (which cannot be enabled under
all conditions, unlike the management traffic decoding) to not be an
all-or-nothing choice, the filtering function matches everything that
can be decoded, and everything else is left to pass to the master
netdevice.

Lastly, DSA core support was added for drivers to request skb deferral.
SJA1105 needs this for SPI intervention during transmission of link-local
traffic. This is not done from within the tagger.

Some patches were carried over unchanged from the previous patchset
(01/09). Others were slightly reworked while adapting to the recent
changes in "Make DSA tag drivers kernel modules" (02/09).

The introduction of some structures (DSA_SKB_CB, dp->priv) may seem a
little premature at this point and the new structures under-utilized.
The reason is that traffic support has been rewritten with PTP
timestamping in mind, and then I removed the timestamping code from the
current submission (1. it is a different topic, 2. it does not work very
well yet). On demand I can provide the timestamping patchset as a RFC
though.

"NXP SJA1105 DSA driver" v3 patchset can be found at:
https://lkml.org/lkml/2019/4/12/978

v1 patchset can be found at:
https://lkml.org/lkml/2019/5/3/877

Changes in v2:

* Made the deferred xmit workqueue also be drained on the netdev suspend
  callback, not just on ndo_stop.

* Added clarification about how other netdevices may be bridged with the
  switch ports.

v2 patchset can be found at:
https://www.spinics.net/lists/netdev/msg568818.html

Changes in v3:

* Exported the dsa_port_vid_add and dsa_port_vid_del symbols to fix an
  error reported by the kbuild test robot

* Fixed the following checkpatch warnings in 05/10:
  Macro argument reuse 'skb' - possible side-effects?
  Macro argument reuse 'clone' - possible side-effects?

* Added a commit description to the documentation patch (10/10)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoDocumentation: net: dsa: sja1105: Add info about supported traffic modes
Vladimir Oltean [Sun, 5 May 2019 10:19:29 +0000 (13:19 +0300)]
Documentation: net: dsa: sja1105: Add info about supported traffic modes

This adds a table which illustrates what combinations of management /
regular traffic work depending on the state the switch ports are in.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: sja1105: Add support for Spanning Tree Protocol
Vladimir Oltean [Sun, 5 May 2019 10:19:28 +0000 (13:19 +0300)]
net: dsa: sja1105: Add support for Spanning Tree Protocol

While not explicitly documented as supported in UM10944, compliance with
the STP states can be obtained by manipulating 3 settings at the
(per-port) MAC config level: dynamic learning, inhibiting reception of
regular traffic, and inhibiting transmission of regular traffic.

In all these modes, transmission and reception of special BPDU frames
from the stack is still enabled (not inhibited by the MAC-level
settings).

On ingress, BPDUs are classified by the MAC filter as link-local
(01-80-C2-00-00-00) and forwarded to the CPU port.  This mechanism works
under all conditions (even without the custom 802.1Q tagging) because
the switch hardware inserts the source port and switch ID into bytes 4
and 5 of the MAC-filtered frames. Then the DSA .rcv handler needs to put
back zeroes into the MAC address after decoding the source port
information.

On egress, BPDUs are transmitted using management routes from the xmit
worker thread. Again this does not require switch tagging, as the switch
port is programmed through SPI to hold a temporary (single-fire) route
for a frame with the programmed destination MAC (01-80-C2-00-00-00).

STP is activated using the following commands and was tested by
connecting two front-panel ports together and noticing that switching
loops were prevented (one port remains in the blocking state):

$ ip link add name br0 type bridge stp_state 1 && ip link set br0 up
$ for eth in $(ls /sys/devices/platform/soc/2100000.spi/spi_master/spi0/spi0.1/net/);
  do ip link set ${eth} master br0 && ip link set ${eth} up; done

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: sja1105: Add support for traffic through standalone ports
Vladimir Oltean [Sun, 5 May 2019 10:19:27 +0000 (13:19 +0300)]
net: dsa: sja1105: Add support for traffic through standalone ports

In order to support this, we are creating a make-shift switch tag out of
a VLAN trunk configured on the CPU port. Termination of normal traffic
on switch ports only works when not under a vlan_filtering bridge.
Termination of management (PTP, BPDU) traffic works under all
circumstances because it uses a different tagging mechanism
(incl_srcpt). We are making use of the generic CONFIG_NET_DSA_TAG_8021Q
code and leveraging it from our own CONFIG_NET_DSA_TAG_SJA1105.

There are two types of traffic: regular and link-local.

The link-local traffic received on the CPU port is trapped from the
switch's regular forwarding decisions because it matched one of the two
DMAC filters for management traffic.

On transmission, the switch requires special massaging for these
link-local frames. Due to a weird implementation of the switching IP, by
default it drops link-local frames that originate on the CPU port.
It needs to be told where to forward them to, through an SPI command
("management route") that is valid for only a single frame.
So when we're sending link-local traffic, we are using the
dsa_defer_xmit mechanism.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Add a private structure pointer to dsa_port
Vladimir Oltean [Sun, 5 May 2019 10:19:26 +0000 (13:19 +0300)]
net: dsa: Add a private structure pointer to dsa_port

This is supposed to share information between the driver and the tagger,
or used by the tagger to keep some state. Its use is optional.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Add support for deferred xmit
Vladimir Oltean [Sun, 5 May 2019 10:19:25 +0000 (13:19 +0300)]
net: dsa: Add support for deferred xmit

Some hardware needs to take work to get convinced to receive frames on
the CPU port (such as the sja1105 which takes temporary L2 forwarding
rules over SPI that last for a single frame). Such work needs a
sleepable context, and because the regular .ndo_start_xmit is atomic,
this cannot be done in the tagger. So introduce a generic DSA mechanism
that sets up a transmit skb queue and a workqueue for deferred
transmission.

The new driver callback (.port_deferred_xmit) is in dsa_switch and not
in the tagger because the operations that require sleeping typically
also involve interacting with the hardware, and not simply skb
manipulations. Therefore having it there simplifies the structure a bit
and makes it unnecessary to export functions from the driver to the
tagger.

The driver is responsible of calling dsa_enqueue_skb which transfers it
to the master netdevice. This is so that it has a chance of performing
some more work afterwards, such as cleanup or TX timestamping.

To tell DSA that skb xmit deferral is required, I have thought about
changing the return type of the tagger .xmit from struct sk_buff * into
a enum dsa_tx_t that could potentially encode a DSA_XMIT_DEFER value.

But the trailer tagger is reallocating every skb on xmit and therefore
making a valid use of the pointer return value. So instead of reworking
the API in complicated ways, right now a boolean property in the newly
introduced DSA_SKB_CB is set.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Keep private info in the skb->cb
Vladimir Oltean [Sun, 5 May 2019 10:19:24 +0000 (13:19 +0300)]
net: dsa: Keep private info in the skb->cb

Map a DSA structure over the 48-byte control block that will hold
skb info on transmit and receive. This is only for use within the DSA
processing layer (e.g. communicating between DSA core and tagger) and
not for passing info around with other layers such as the master net
device.

Also add a DSA_SKB_CB_PRIV() macro which retrieves a pointer to the
space up to 48 bytes that the DSA structure does not use. This space can
be used for drivers to add their own private info.

One use is for the PTP timestamping code path. When cloning a skb,
annotate the original with a pointer to the clone, which the driver can
then find easily and place the timestamp to. This avoids the need of a
separate queue to hold clones and a way to match an original to a cloned
skb.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Allow drivers to filter packets they can decode source port from
Vladimir Oltean [Sun, 5 May 2019 10:19:23 +0000 (13:19 +0300)]
net: dsa: Allow drivers to filter packets they can decode source port from

Frames get processed by DSA and redirected to switch port net devices
based on the ETH_P_XDSA multiplexed packet_type handler found by the
network stack when calling eth_type_trans().

The running assumption is that once the DSA .rcv function is called, DSA
is always able to decode the switch tag in order to change the skb->dev
from its master.

However there are tagging protocols (such as the new DSA_TAG_PROTO_SJA1105,
user of DSA_TAG_PROTO_8021Q) where this assumption is not completely
true, since switch tagging piggybacks on the absence of a vlan_filtering
bridge. Moreover, management traffic (BPDU, PTP) for this switch doesn't
rely on switch tagging, but on a different mechanism. So it would make
sense to at least be able to terminate that.

Having DSA receive traffic it can't decode would put it in an impossible
situation: the eth_type_trans() function would invoke the DSA .rcv(),
which could not change skb->dev, then eth_type_trans() would be invoked
again, which again would call the DSA .rcv, and the packet would never
be able to exit the DSA filter and would spiral in a loop until the
whole system dies.

This happens because eth_type_trans() doesn't actually look at the skb
(so as to identify a potential tag) when it deems it as being
ETH_P_XDSA. It just checks whether skb->dev has a DSA private pointer
installed (therefore it's a DSA master) and that there exists a .rcv
callback (everybody except DSA_TAG_PROTO_NONE has that). This is
understandable as there are many switch tags out there, and exhaustively
checking for all of them is far from ideal.

The solution lies in introducing a filtering function for each tagging
protocol. In the absence of a filtering function, all traffic is passed
to the .rcv DSA callback. The tagging protocol should see the filtering
function as a pre-validation that it can decode the incoming skb. The
traffic that doesn't match the filter will bypass the DSA .rcv callback
and be left on the master netdevice, which wasn't previously possible.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Optional VLAN-based port separation for switches without tagging
Vladimir Oltean [Sun, 5 May 2019 10:19:22 +0000 (13:19 +0300)]
net: dsa: Optional VLAN-based port separation for switches without tagging

This patch provides generic DSA code for using VLAN (802.1Q) tags for
the same purpose as a dedicated switch tag for injection/extraction.
It is based on the discussions and interest that has been so far
expressed in https://www.spinics.net/lists/netdev/msg556125.html.

Unlike all other DSA-supported tagging protocols, CONFIG_NET_DSA_TAG_8021Q
does not offer a complete solution for drivers (nor can it). Instead, it
provides generic code that driver can opt into calling:
- dsa_8021q_xmit: Inserts a VLAN header with the specified contents.
  Can be called from another tagging protocol's xmit function.
  Currently the LAN9303 driver is inserting headers that are simply
  802.1Q with custom fields, so this is an opportunity for code reuse.
- dsa_8021q_rcv: Retrieves the TPID and TCI from a VLAN-tagged skb.
  Removing the VLAN header is left as a decision for the caller to make.
- dsa_port_setup_8021q_tagging: For each user port, installs an Rx VID
  and a Tx VID, for proper untagged traffic identification on ingress
  and steering on egress. Also sets up the VLAN trunk on the upstream
  (CPU or DSA) port. Drivers are intentionally left to call this
  function explicitly, depending on the context and hardware support.
  The expected switch behavior and VLAN semantics should not be violated
  under any conditions. That is, after calling
  dsa_port_setup_8021q_tagging, the hardware should still pass all
  ingress traffic, be it tagged or untagged.

For uniformity with the other tagging protocols, a module for the
dsa_8021q_netdev_ops structure is registered, but the typical usage is
to set up another tagging protocol which selects CONFIG_NET_DSA_TAG_8021Q,
and calls the API from tag_8021q.h. Null function definitions are also
provided so that a "depends on" is not forced in the Kconfig.

This tagging protocol only works when switch ports are standalone, or
when they are added to a VLAN-unaware bridge. It will probably remain
this way for the reasons below.

When added to a bridge that has vlan_filtering 1, the bridge core will
install its own VLANs and reset the pvids through switchdev. For the
bridge core, switchdev is a write-only pipe. All VLAN-related state is
kept in the bridge core and nothing is read from DSA/switchdev or from
the driver. So the bridge core will break this port separation because
it will install the vlan_default_pvid into all switchdev ports.

Even if we could teach the bridge driver about switchdev preference of a
certain vlan_default_pvid (task difficult in itself since the current
setting is per-bridge but we would need it per-port), there would still
exist many other challenges.

Firstly, in the DSA rcv callback, a driver would have to perform an
iterative reverse lookup to find the correct switch port. That is
because the port is a bridge slave, so its Rx VID (port PVID) is subject
to user configuration. How would we ensure that the user doesn't reset
the pvid to a different value (which would make an O(1) translation
impossible), or to a non-unique value within this DSA switch tree (which
would make any translation impossible)?

Finally, not all switch ports are equal in DSA, and that makes it
difficult for the bridge to be completely aware of this anyway.
The CPU port needs to transmit tagged packets (VLAN trunk) in order for
the DSA rcv code to be able to decode source information.
But the bridge code has absolutely no idea which switch port is the CPU
port, if nothing else then just because there is no netdevice registered
by DSA for the CPU port.
Also DSA does not currently allow the user to specify that they want the
CPU port to do VLAN trunking anyway. VLANs are added to the CPU port
using the same flags as they were added on the user port.

So the VLANs installed by dsa_port_setup_8021q_tagging per driver
request should remain private from the bridge's and user's perspective,
and should not alter the VLAN semantics observed by the user.

In the current implementation a VLAN range ending at 4095 (VLAN_N_VID)
is reserved for this purpose. Each port receives a unique Rx VLAN and a
unique Tx VLAN. Separate VLANs are needed for Rx and Tx because they
serve different purposes: on Rx the switch must process traffic as
untagged and process it with a port-based VLAN, but with care not to
hinder bridging. On the other hand, the Tx VLAN is where the
reachability restrictions are imposed, since by tagging frames in the
xmit callback we are telling the switch onto which port to steer the
frame.

Some general guidance on how this support might be employed for
real-life hardware (some comments made by Florian Fainelli):

- If the hardware supports VLAN tag stacking, it should somehow back
  up its private VLAN settings when the bridge tries to override them.
  Then the driver could re-apply them as outer tags. Dedicating an outer
  tag per bridge device would allow identical inner tag VID numbers to
  co-exist, yet preserve broadcast domain isolation.

- If the switch cannot handle VLAN tag stacking, it should disable this
  port separation when added as slave to a vlan_filtering bridge, in
  that case having reduced functionality.

- Drivers for old switches that don't support the entire VLAN_N_VID
  range will need to rework the current range selection mechanism.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Export symbols for dsa_port_vid_{add, del}
Vladimir Oltean [Sun, 5 May 2019 10:19:21 +0000 (13:19 +0300)]
net: dsa: Export symbols for dsa_port_vid_{add, del}

This is needed so that the newly introduced tag_8021q may access these
core DSA functions when built as a module.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Call driver's setup callback after setting up its switchdev notifier
Vladimir Oltean [Sun, 5 May 2019 10:19:20 +0000 (13:19 +0300)]
net: dsa: Call driver's setup callback after setting up its switchdev notifier

This allows the driver to perform some manipulations of its own during
setup, using generic switchdev calls. Having the notifiers registered at
setup time is important because otherwise any switchdev transaction
emitted during this time would be ignored (dispatched to an empty call
chain).

One current usage scenario is for the driver to request DSA to set up
802.1Q based switch tagging for its ports.

There is no danger for the driver setup code to start racing now with
switchdev events emitted from the network stack (such as bridge core)
even if the notifier is registered earlier. This is because the network
stack needs a net_device as a vehicle to perform switchdev operations,
and the slave net_devices are registered later than the core driver
setup anyway (ds->ops->setup in dsa_switch_setup vs dsa_port_setup).

Luckily DSA doesn't need a net_device to carry out switchdev callbacks,
and therefore drivers shouldn't assume either that net_devices are
available at the time their switchdev callbacks get invoked.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>-
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: mv88e6xxx: refine SMI support
Vivien Didelot [Fri, 3 May 2019 23:28:22 +0000 (19:28 -0400)]
net: dsa: mv88e6xxx: refine SMI support

The Marvell SOHO switches have several ways to access the internal
registers. One of them being the System Management Interface (SMI),
using the MDC and MDIO pins, with direct and indirect variants.

In preparation for adding support for other register accesses, move
the SMI code into its own files. At the same time, refine the code
to make it clear that the indirect variant is implemented using the
direct variant accessing only two registers for command and data.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-act_police-offload-support'
David S. Miller [Mon, 6 May 2019 04:49:24 +0000 (21:49 -0700)]
Merge branch 'net-act_police-offload-support'

Jakub Kicinski says:

===================
net: act_police offload support

this set starts by converting cls_matchall to the new flow offload
infrastructure. It so happens that all drivers implementing cls_matchall
offload today also offload cls_flower, so its a little easier for
them to handle the actions in unified flow_rule format, even though
in cls_matchall there is no flow to speak of. If a driver ever appears
which would prefer the old, direct access to TC exts, we can add the
pointer in the offload structure back and support both.

Next the act_police is added to actions supported by flow offload API.

NFP support for act_police offload is added as the final step.  The flower
firmware is configured to perform TX rate limiting in a way which matches
act_police's behaviour.  It does not use DMA.IN back pressure, and
instead drops packets after they had been already DMAed into the NIC.
IOW it uses our standard traffic policing implementation, future patches
will extend it to other ports and traffic directions.
===================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfp: flower: add qos offload stats request and reply
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:28 +0000 (04:46 -0700)]
nfp: flower: add qos offload stats request and reply

Add stats request function that sends a stats request message to hw for
a specific police-filter. Process stats reply from hw and update the
stored qos structure.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfp: flower: add qos offload install and remove functionality.
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:27 +0000 (04:46 -0700)]
nfp: flower: add qos offload install and remove functionality.

Add install and remove offload functionality for qos offloads. We
first check that a police filter can be implemented by the VF rate
limiting feature in hw, then we install the filter via the qos
infrastructure. Finally we implement the mechanism for removing
these types of filters.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfp: flower: add qos offload framework
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:26 +0000 (04:46 -0700)]
nfp: flower: add qos offload framework

Introduce matchall filter offload infrastructure that is needed to
offload qos features like policing. Subsequent patches will make
use of police-filters for ingress rate limiting.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: add block pointer to tc_cls_common_offload structure
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:25 +0000 (04:46 -0700)]
net/sched: add block pointer to tc_cls_common_offload structure

Some actions like the police action are stateful and could share state
between devices. This is incompatible with offloading to multiple devices
and drivers might want to test for shared blocks when offloading.
Store a pointer to the tcf_block structure in the tc_cls_common_offload
structure to allow drivers to determine when offloads apply to a shared
block.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: allow stats updates from offloaded police actions
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:24 +0000 (04:46 -0700)]
net/sched: allow stats updates from offloaded police actions

Implement the stats_update callback for the police action that
will be used by drivers for hardware offload.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: extend matchall offload for hardware statistics
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:23 +0000 (04:46 -0700)]
net/sched: extend matchall offload for hardware statistics

Introduce a new command for matchall classifiers that allows hardware
to update statistics.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: add police action to the hardware intermediate representation
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:22 +0000 (04:46 -0700)]
net/sched: add police action to the hardware intermediate representation

Add police action to the hardware intermediate representation which
would subsequently allow it to be used by drivers for offload.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: move police action structures to header
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:21 +0000 (04:46 -0700)]
net/sched: move police action structures to header

Move tcf_police_params, tcf_police and tc_police_compat structures to a
header. Making them usable to other code for example drivers that would
offload police actions to hardware.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: remove unused functions for matchall offload
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:20 +0000 (04:46 -0700)]
net/sched: remove unused functions for matchall offload

Cleanup unused functions and variables after porting to the newer
intermediate representation.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/dsa: use intermediate representation for matchall offload
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:19 +0000 (04:46 -0700)]
net/dsa: use intermediate representation for matchall offload

Updates dsa hardware switch handling infrastructure to use the newer
intermediate representation for flow actions in matchall offloads.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: use intermediate representation for matchall offload
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:18 +0000 (04:46 -0700)]
mlxsw: use intermediate representation for matchall offload

Updates the Mellanox spectrum driver to use the newer intermediate
representation for flow actions in matchall offloads.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: use the hardware intermediate representation for matchall
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:17 +0000 (04:46 -0700)]
net/sched: use the hardware intermediate representation for matchall

Extends matchall offload to make use of the hardware intermediate
representation. More specifically, this patch moves the native TC
actions in cls_matchall offload to the newer flow_action
representation. This ultimately allows us to avoid a direct
dependency on native TC actions for matchall.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: add sample action to the hardware intermediate representation
Pieter Jansen van Vuuren [Sat, 4 May 2019 11:46:16 +0000 (04:46 -0700)]
net/sched: add sample action to the hardware intermediate representation

Add sample action to the hardware intermediate representation model which
would subsequently allow it to be used by drivers for offload.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'of_net-Add-NVMEM-support-to-of_get_mac_address'
David S. Miller [Mon, 6 May 2019 04:47:08 +0000 (21:47 -0700)]
Merge branch 'of_net-Add-NVMEM-support-to-of_get_mac_address'

Petr Štetiar says:

====================
of_net: Add NVMEM support to of_get_mac_address

this patch series is a continuation of my previous attempt[1], where I've
tried to wire MTD layer into of_get_mac_address, so it would be possible to
load MAC addresses from various NVMEMs as EEPROMs etc.

Predecessor of this patch which used directly MTD layer has originated in
OpenWrt some time ago and supports already about 497 use cases in 357
device tree files.

During the review process of my 1st attempt I was told, that I shouldn't be
using MTD directly, but that I should rather use new NVMEM subsystem and
during the review process of v2 I was told, that I should handle
EPROBE_DEFFER error as well, during the review process of v3 I was told,
that returning pointer/NULL/ERR_PTR is considered as wrong API design, so
this v4 patch series tries to accommodate all this previous remarks.

First patch is wiring NVMEM support directly into of_get_mac_address as
it's obvious, that adding support for NVMEM into every other driver would
mean adding a lot of repetitive code. This patch allows us to configure MAC
addresses in various devices like ethernet and wireless adapters directly
from of_get_mac_address, which is used by quite a lot of drivers in the
tree already.

Second patch is simply updating documentation with NVMEM bits, and cleaning
up all current binding documentation referencing any of the MAC address
related properties.

Third and fourth patches are simply removing duplicate NVMEM code which is
no longer needed as the first patch has wired NVMEM support directly into
of_get_mac_address.

Patches 5-10 are converting all current users of of_get_mac_address to the
new ERR_PTR encoded error value, as of_get_mac_address could now return
valid pointer, NULL and ERR_PTR.

Just for a better picture, this patch series and one simple patch[2] on top
of it, allows me to configure 8Devices Carambola2 board's MAC addresses
with following DTS (simplified):

 &spi {
  flash@0 {
  partitions {
art: partition@ff0000 {
label = "art";
reg = <0xff0000 0x010000>;
read-only;

nvmem-cells {
compatible = "nvmem-cells";
#address-cells = <1>;
#size-cells = <1>;

eth0_addr: eth-mac-addr@0 {
reg = <0x0 0x6>;
};

eth1_addr: eth-mac-addr@6 {
reg = <0x6 0x6>;
};

wmac_addr: wifi-mac-addr@1002 {
reg = <0x1002 0x6>;
};
};
};
};
};
 };

 &eth0 {
nvmem-cells = <&eth0_addr>;
nvmem-cell-names = "mac-address";
 };

 &eth1 {
nvmem-cells = <&eth1_addr>;
nvmem-cell-names = "mac-address";
 };

 &wmac {
nvmem-cells = <&wmac_addr>;
nvmem-cell-names = "mac-address";
 };

1. https://patchwork.ozlabs.org/patch/1086628/
2. https://patchwork.ozlabs.org/patch/890738/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agopowerpc: tsi108: support of_get_mac_address new ERR_PTR error
Petr Štetiar [Fri, 3 May 2019 14:27:15 +0000 (16:27 +0200)]
powerpc: tsi108: support of_get_mac_address new ERR_PTR error

There was NVMEM support added to of_get_mac_address, so it could now return
ERR_PTR encoded error values, so we need to adjust all current users of
of_get_mac_address to this new fact.

Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoARM: Kirkwood: support of_get_mac_address new ERR_PTR error
Petr Štetiar [Fri, 3 May 2019 14:27:14 +0000 (16:27 +0200)]
ARM: Kirkwood: support of_get_mac_address new ERR_PTR error

There was NVMEM support added to of_get_mac_address, so it could now return
ERR_PTR encoded error values, so we need to adjust all current users of
of_get_mac_address to this new fact.

Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agostaging: octeon-ethernet: support of_get_mac_address new ERR_PTR error
Petr Štetiar [Fri, 3 May 2019 14:27:13 +0000 (16:27 +0200)]
staging: octeon-ethernet: support of_get_mac_address new ERR_PTR error

There was NVMEM support added to of_get_mac_address, so it could now return
ERR_PTR encoded error values, so we need to adjust all current users of
of_get_mac_address to this new fact.

Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: wireless: support of_get_mac_address new ERR_PTR error
Petr Štetiar [Fri, 3 May 2019 14:27:12 +0000 (16:27 +0200)]
net: wireless: support of_get_mac_address new ERR_PTR error

There was NVMEM support added to of_get_mac_address, so it could now return
ERR_PTR encoded error values, so we need to adjust all current users of
of_get_mac_address to this new fact.

Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: usb: support of_get_mac_address new ERR_PTR error
Petr Štetiar [Fri, 3 May 2019 14:27:11 +0000 (16:27 +0200)]
net: usb: support of_get_mac_address new ERR_PTR error

There was NVMEM support added to of_get_mac_address, so it could now return
ERR_PTR encoded error values, so we need to adjust all current users of
of_get_mac_address to this new fact.

Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: davinci: support of_get_mac_address new ERR_PTR error
Petr Štetiar [Fri, 3 May 2019 14:27:09 +0000 (16:27 +0200)]
net: davinci: support of_get_mac_address new ERR_PTR error

There was NVMEM support added directly to of_get_mac_address, and it uses
nvmem_get_mac_address under the hood, so we can remove it. As
of_get_mac_address can now return ERR_PTR encoded error values, adjust to
that as well.

Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: macb: support of_get_mac_address new ERR_PTR error
Petr Štetiar [Fri, 3 May 2019 14:27:08 +0000 (16:27 +0200)]
net: macb: support of_get_mac_address new ERR_PTR error

There was NVMEM support added directly to of_get_mac_address, and it uses
nvmem_get_mac_address under the hood, so we can remove it. As
of_get_mac_address can now return ERR_PTR encoded error values, adjust to
that as well.

Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: doc: reflect new NVMEM of_get_mac_address behaviour
Petr Štetiar [Fri, 3 May 2019 14:27:07 +0000 (16:27 +0200)]
dt-bindings: doc: reflect new NVMEM of_get_mac_address behaviour

As of_get_mac_address now supports NVMEM under the hood, we need to update
the bindings documentation with the new nvmem-cell* properties, which would
mean copy&pasting a lot of redundant information to every binding
documentation currently referencing some of the MAC address properties.

So I've just removed all the references to the optional MAC address
properties and replaced them with the small note referencing
net/ethernet.txt file.

Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoof_net: add NVMEM support to of_get_mac_address
Petr Štetiar [Fri, 3 May 2019 14:27:06 +0000 (16:27 +0200)]
of_net: add NVMEM support to of_get_mac_address

Many embedded devices have information such as MAC addresses stored
inside NVMEMs like EEPROMs and so on. Currently there are only two
drivers in the tree which benefit from NVMEM bindings.

Adding support for NVMEM into every other driver would mean adding a lot
of repetitive code. This patch allows us to configure MAC addresses in
various devices like ethernet and wireless adapters directly from
of_get_mac_address, which is already used by almost every driver in the
tree.

Predecessor of this patch which used directly MTD layer has originated
in OpenWrt some time ago and supports already about 497 use cases in 357
device tree files.

Cc: Alban Bedel <albeu@free.fr>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'bnxt_en-Driver-updates'
David S. Miller [Mon, 6 May 2019 04:42:17 +0000 (21:42 -0700)]
Merge branch 'bnxt_en-Driver-updates'

Michael Chan says:

====================
bnxt_en: Driver updates.

This patch series adds some extended statistics available with the new
firmware interface, package version from firmware, aRFS support on
57500 chips, new PCI IDs, and some miscellaneous fixes and improvements.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Add device IDs 0x1806 and 0x1752 for 57500 devices.
Michael Chan [Sun, 5 May 2019 11:17:08 +0000 (07:17 -0400)]
bnxt_en: Add device IDs 0x1806 and 0x1752 for 57500 devices.

0x1806 and 0x1752 are VF variant and PF variant of the 57500 chip
family.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Add support for aRFS on 57500 chips.
Michael Chan [Sun, 5 May 2019 11:17:07 +0000 (07:17 -0400)]
bnxt_en: Add support for aRFS on 57500 chips.

Set RSS ring table index of the RFS destination ring for the NTUPLE
filters on 57500 chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Query firmware capability to support aRFS on 57500 chips.
Michael Chan [Sun, 5 May 2019 11:17:06 +0000 (07:17 -0400)]
bnxt_en: Query firmware capability to support aRFS on 57500 chips.

Query support for the aRFS ring table index in the firmware.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Improve NQ reservations.
Michael Chan [Sun, 5 May 2019 11:17:05 +0000 (07:17 -0400)]
bnxt_en: Improve NQ reservations.

bnxt_need_reserve_rings() determines if any resources have changed and
requires new reservation with firmware.  The NQ checking is currently
just an approximation.  Improve the NQ checking logic to make it
accurate.  NQ reservation is only needed on 57500 PFs.  This fix will
eliminate unnecessary reservations and will reduce NQ reservations
when some NQs have been released on 57500 PFs.

Fixes: c0b8cda05e1d ("bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Separate RDMA MR/AH context allocation.
Devesh Sharma [Sun, 5 May 2019 11:17:04 +0000 (07:17 -0400)]
bnxt_en: Separate RDMA MR/AH context allocation.

In newer firmware, the context memory for MR (Memory Region)
and AH (Address Handle) to support RDMA are specified separately.
Modify driver to specify and allocate the 2 context memory types
separately when supported by the firmware.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: read the clause type from the PHY ID
Vasundhara Volam [Sun, 5 May 2019 11:17:03 +0000 (07:17 -0400)]
bnxt_en: read the clause type from the PHY ID

Currently driver hard code Clause 45 based on speed supported by the
PHY. Instead read the clause type from the PHY ID provided as input
to the mdio ioctl.

Fixes: 0ca12be99667 ("bnxt_en: Add support for mdio read/write to external PHY")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Read package version from firmware.
Vasundhara Volam [Sun, 5 May 2019 11:17:02 +0000 (07:17 -0400)]
bnxt_en: Read package version from firmware.

HWRM_VER_GET firmware command returns package name that is running
actively on the adapter.  Use this version instead of parsing from
the package log in NVRAM.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Check new firmware capability to display extended stats.
Vasundhara Volam [Sun, 5 May 2019 11:17:01 +0000 (07:17 -0400)]
bnxt_en: Check new firmware capability to display extended stats.

Newer firmware now advertises the capability for extended stats
support.  Check the new capability in addition to the existing
version check.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Add support for PCIe statistics
Vasundhara Volam [Sun, 5 May 2019 11:17:00 +0000 (07:17 -0400)]
bnxt_en: Add support for PCIe statistics

Gather periodic PCIe statistics for ethtool -S.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>