platform/kernel/linux-rpi.git
3 years agoMerge branch 'microchip-ksz88x3'
David S. Miller [Tue, 27 Apr 2021 21:13:24 +0000 (14:13 -0700)]
Merge branch 'microchip-ksz88x3'

Oleksij Rempel says:

====================
microchip: add support for ksz88x3 driver family

changes v8:
- add Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
- fix build issue on "net: dsa: microchip: ksz8795: move register
  offsets and shifts to separate struct"

changes v7:
- Reverse christmas tree fixes
- remove IS_88X3 and use chip_id instead
- drop own tag and use DSA_TAG_PROTO_KSZ9893 instead

changes v6:
- take over this patch set
- rebase against latest netdev-next and fix regressions
- disable VLAN support for KSZ8863. KSZ8863's VLAN is not compatible to the
  KSZ8795's. So disable it for now and mainline it separately.

This series adds support for the ksz88x3 driver family to the dsa based
ksz drivers. The driver is making use of the already available ksz8795
driver and moves it to an generic driver for the ksz8 based chips which
have similar functions but an totaly different register layout.

The mainlining discussion history of this branch:
v1: https://lore.kernel.org/netdev/20191107110030.25199-1-m.grzeschik@pengutronix.de/
v2: https://lore.kernel.org/netdev/20191218200831.13796-1-m.grzeschik@pengutronix.de/
v3: https://lore.kernel.org/netdev/20200508154343.6074-1-m.grzeschik@pengutronix.de/
v4: https://lore.kernel.org/netdev/20200803054442.20089-1-m.grzeschik@pengutronix.de/
v5: https://lore.kernel.org/netdev/20201207125627.30843-1-m.grzeschik@pengutronix.de/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
Michael Grzeschik [Tue, 27 Apr 2021 07:09:09 +0000 (09:09 +0200)]
dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0

Microchip SMI0 Mode is a special mode, where the MDIO Read/Write
commands are part of the PHY Address and the OP Code is always 0. We add
the compatible for this special mode of the bitbanged mdio driver.

Cc: devicetree@vger.kernel.org
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: microchip: Add Microchip KSZ8863 SMI based driver support
Michael Grzeschik [Tue, 27 Apr 2021 07:09:08 +0000 (09:09 +0200)]
net: dsa: microchip: Add Microchip KSZ8863 SMI based driver support

Add KSZ88X3 driver support. We add support for the KXZ88X3 three port
switches using the Microchip SMI Interface. They are supported using the
MDIO-Bitbang Interface.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: Add support for microchip SMI0 MDIO bus
Andrew Lunn [Tue, 27 Apr 2021 07:09:07 +0000 (09:09 +0200)]
net: phy: Add support for microchip SMI0 MDIO bus

SMI0 is a mangled version of MDIO. The main low level difference is
the MDIO C22 OP code is always 0, not 0x2 or 0x1 for Read/Write. The
read/write information is instead encoded in the PHY address.

Extend the bit-bang code to allow the op code to be overridden, but
default to normal C22 values. Add an extra compatible to the mdio-gpio
driver, and when this compatible is present, set the op codes to 0.

A higher level driver, sitting on top of the basic MDIO bus driver can
then implement the rest of the microchip SMI0 odderties.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: dsa: document additional Microchip KSZ8863/8873 switch
Michael Grzeschik [Tue, 27 Apr 2021 07:09:06 +0000 (09:09 +0200)]
dt-bindings: net: dsa: document additional Microchip KSZ8863/8873 switch

It is a 3-Port 10/100 Ethernet Switch. One CPU-Port and two
Switch-Ports.

Cc: devicetree@vger.kernel.org
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: microchip: Add Microchip KSZ8863 SPI based driver support
Michael Grzeschik [Tue, 27 Apr 2021 07:09:05 +0000 (09:09 +0200)]
net: dsa: microchip: Add Microchip KSZ8863 SPI based driver support

Add KSZ88X3 driver support. We add support for the KXZ88X3 three port
switches using the SPI Interface.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: microchip: ksz8795: add support for ksz88xx chips
Oleksij Rempel [Tue, 27 Apr 2021 07:09:04 +0000 (09:09 +0200)]
net: dsa: microchip: ksz8795: add support for ksz88xx chips

We add support for the ksz8863 and ksz8873 chips which are
using the same register patterns but other offsets as the
ksz8795.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: microchip: ksz8795: move register offsets and shifts to separate struct
Michael Grzeschik [Tue, 27 Apr 2021 07:09:03 +0000 (09:09 +0200)]
net: dsa: microchip: ksz8795: move register offsets and shifts to separate struct

In order to get this driver used with other switches the functions need
to use different offsets and register shifts. This patch changes the
direct use of the register defines to register description structures,
which can be set depending on the chips register layout.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: microchip: ksz8795: move cpu_select_interface to extra function
Michael Grzeschik [Tue, 27 Apr 2021 07:09:02 +0000 (09:09 +0200)]
net: dsa: microchip: ksz8795: move cpu_select_interface to extra function

This patch moves the cpu interface selection code to a individual
function specific for ksz8795. It will make it simpler to customize the
code path for different switches supported by this driver.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: microchip: ksz8795: change drivers prefix to be generic
Michael Grzeschik [Tue, 27 Apr 2021 07:09:01 +0000 (09:09 +0200)]
net: dsa: microchip: ksz8795: change drivers prefix to be generic

The driver can be used on other chips of this type. To reflect
this we rename the drivers prefix from ksz8795 to ksz8.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ocelot-ptp'
David S. Miller [Tue, 27 Apr 2021 21:10:16 +0000 (14:10 -0700)]
Merge branch 'ocelot-ptp'

Yangbo Lu says:

====================
Support Ocelot PTP Sync one-step timestamping

This patch-set is to support Ocelot PTP Sync one-step timestamping.
Actually before that, this patch-set cleans up and optimizes the
DSA slave tx timestamp request handling process.

Changes for v2:
- Split tx timestamp optimization patch.
- Updated doc patch.
- Freed skb->cb usage in dsa core driver, and moved to device
  drivers.
- Other minor fixes.
Changes for v3:
- Switched sequence of patch #3 and #4 with rebasing to fix build.
- Replaced hard coded 48 of memset(skb->cb, 0, 48) with sizeof().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mscc: ocelot: support PTP Sync one-step timestamping
Yangbo Lu [Tue, 27 Apr 2021 04:22:03 +0000 (12:22 +0800)]
net: mscc: ocelot: support PTP Sync one-step timestamping

Although HWTSTAMP_TX_ONESTEP_SYNC existed in ioctl for hardware timestamp
configuration, the PTP Sync one-step timestamping had never been supported.

This patch is to truely support it.

- ocelot_port_txtstamp_request()
  This function handles tx timestamp request by storing
  ptp_cmd(tx timestamp type) in OCELOT_SKB_CB(skb)->ptp_cmd,
  and additionally for two-step timestamp storing ts_id in
  OCELOT_SKB_CB(clone)->ptp_cmd.

- ocelot_ptp_rew_op()
  During xmit, this function is called to get rew_op (rewriter option) by
  checking skb->cb for tx timestamp request, and configure to transmitting.

Non-onestep-Sync packet with one-step timestamp request falls back to use
two-step timestamp.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mscc: ocelot: convert to ocelot_port_txtstamp_request()
Yangbo Lu [Tue, 27 Apr 2021 04:22:02 +0000 (12:22 +0800)]
net: mscc: ocelot: convert to ocelot_port_txtstamp_request()

Convert to a common ocelot_port_txtstamp_request() for TX timestamp
request handling.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodocs: networking: timestamping: update for DSA switches
Yangbo Lu [Tue, 27 Apr 2021 04:22:01 +0000 (12:22 +0800)]
docs: networking: timestamping: update for DSA switches

Update timestamping doc for DSA switches to describe current
implementation accurately. On TX, the skb cloning is no longer
in DSA generic code.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: free skb->cb usage in core driver
Yangbo Lu [Tue, 27 Apr 2021 04:22:00 +0000 (12:22 +0800)]
net: dsa: free skb->cb usage in core driver

Free skb->cb usage in core driver and let device drivers decide to
use or not. The reason having a DSA_SKB_CB(skb)->clone was because
dsa_skb_tx_timestamp() which may set the clone pointer was called
before p->xmit() which would use the clone if any, and the device
driver has no way to initialize the clone pointer.

This patch just put memset(skb->cb, 0, sizeof(skb->cb)) at beginning
of dsa_slave_xmit(). Some new features in the future, like one-step
timestamp may need more bytes of skb->cb to use in
dsa_skb_tx_timestamp(), and p->xmit().

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: no longer clone skb in core driver
Yangbo Lu [Tue, 27 Apr 2021 04:21:59 +0000 (12:21 +0800)]
net: dsa: no longer clone skb in core driver

It was a waste to clone skb directly in dsa_skb_tx_timestamp().
For one-step timestamping, a clone was not needed. For any failure of
port_txtstamp (this may usually happen), the skb clone had to be freed.

So this patch moves skb cloning for tx timestamp out of dsa core, and
let drivers clone skb in port_txtstamp if they really need.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: no longer identify PTP packet in core driver
Yangbo Lu [Tue, 27 Apr 2021 04:21:58 +0000 (12:21 +0800)]
net: dsa: no longer identify PTP packet in core driver

Move ptp_classify_raw out of dsa core driver for handling tx
timestamp request. Let device drivers do this if they want.
Not all drivers want to limit tx timestamping for only PTP
packet.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: check tx timestamp request in core driver
Yangbo Lu [Tue, 27 Apr 2021 04:21:57 +0000 (12:21 +0800)]
net: dsa: check tx timestamp request in core driver

Check tx timestamp request in core driver at very beginning of
dsa_skb_tx_timestamp(), so that most skbs not requiring tx
timestamp just return. And drop such checking in device drivers.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agofddi/skfp: fix typo
qhjindev [Mon, 26 Apr 2021 23:57:52 +0000 (07:57 +0800)]
fddi/skfp: fix typo

change 'privae' to 'private'

Signed-off-by: qhjindev <qhjin_dev@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: mv88e6xxx: Fix 6095/6097/6185 ports in non-SERDES CMODE
Tobias Waldekranz [Mon, 26 Apr 2021 16:17:34 +0000 (18:17 +0200)]
net: dsa: mv88e6xxx: Fix 6095/6097/6185 ports in non-SERDES CMODE

The .serdes_get_lane op used the magic value 0xff to indicate a valid
SERDES lane and 0 signaled that a non-SERDES mode was set on the port.

Unfortunately, "0" is also a valid lane ID, so even when these ports
where configured to e.g. RGMII the driver would set them up as SERDES
ports.

- Replace 0xff with 0 to indicate a valid lane ID. The number is on
  the one hand just as arbitrary, but it is at least the first valid one
  and therefore less of a surprise.

- Follow the other .serdes_get_lane implementations and return -ENODEV
  in the case where no SERDES is assigned to the port.

Fixes: f5be107c3338 ("net: dsa: mv88e6xxx: Support serdes ports on MV88E6097/6095/6185")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: marvell-88x2222: enable autoneg by default
Ivan Bornyakov [Mon, 26 Apr 2021 16:08:23 +0000 (19:08 +0300)]
net: phy: marvell-88x2222: enable autoneg by default

There is no real need for disabling autonigotiation in config_init().
Leave it enabled by default.

Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agorxrpc: rxkad: Remove redundant variable offset
Jiapeng Chong [Mon, 26 Apr 2021 10:13:03 +0000 (18:13 +0800)]
rxrpc: rxkad: Remove redundant variable offset

Variable offset is being assigned a value from a calculation
however the variable is never read, so this redundant variable
can be removed.

Cleans up the following clang-analyzer warning:

net/rxrpc/rxkad.c:579:2: warning: Value stored to 'offset' is never read
[clang-analyzer-deadcode.DeadStores].

net/rxrpc/rxkad.c:485:2: warning: Value stored to 'offset' is never read
[clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomacvlan: Use 'hash' iterators to simplify code
Christophe JAILLET [Sun, 25 Apr 2021 16:14:10 +0000 (18:14 +0200)]
macvlan: Use 'hash' iterators to simplify code

Use 'hash_for_each_rcu' and 'hash_for_each_safe' instead of hand writing
them. This saves some lines of code, reduce indentation and improve
readability.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: mcast: fix broken length + header check for MRDv6 Adv.
Linus Lüssing [Sun, 25 Apr 2021 15:27:35 +0000 (17:27 +0200)]
net: bridge: mcast: fix broken length + header check for MRDv6 Adv.

The IPv6 Multicast Router Advertisements parsing has the following two
issues:

For one thing, ICMPv6 MRD Advertisements are smaller than ICMPv6 MLD
messages (ICMPv6 MRD Adv.: 8 bytes vs. ICMPv6 MLDv1/2: >= 24 bytes,
assuming MLDv2 Reports with at least one multicast address entry).
When ipv6_mc_check_mld_msg() tries to parse an Multicast Router
Advertisement its MLD length check will fail - and it will wrongly
return -EINVAL, even if we have a valid MRD Advertisement. With the
returned -EINVAL the bridge code will assume a broken packet and will
wrongly discard it, potentially leading to multicast packet loss towards
multicast routers.

The second issue is the MRD header parsing in
br_ip6_multicast_mrd_rcv(): It wrongly checks for an ICMPv6 header
immediately after the IPv6 header (IPv6 next header type). However
according to RFC4286, section 2 all MRD messages contain a Router Alert
option (just like MLD). So instead there is an IPv6 Hop-by-Hop option
for the Router Alert between the IPv6 and ICMPv6 header, again leading
to the bridge wrongly discarding Multicast Router Advertisements.

To fix these two issues, introduce a new return value -ENODATA to
ipv6_mc_check_mld() to indicate a valid ICMPv6 packet with a hop-by-hop
option which is not an MLD but potentially an MRD packet. This also
simplifies further parsing in the bridge code, as ipv6_mc_check_mld()
already fully checks the ICMPv6 header and hop-by-hop option.

These issues were found and fixed with the help of the mrdisc tool
(https://github.com/troglobit/mrdisc).

Fixes: 4b3087c7e37f ("bridge: Snoop Multicast Router Advertisements")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet:emac/emac-mac: Fix a use after free in emac_mac_tx_buf_send
Lv Yunlong [Mon, 26 Apr 2021 16:06:25 +0000 (09:06 -0700)]
net:emac/emac-mac: Fix a use after free in emac_mac_tx_buf_send

In emac_mac_tx_buf_send, it calls emac_tx_fill_tpd(..,skb,..).
If some error happens in emac_tx_fill_tpd(), the skb will be freed via
dev_kfree_skb(skb) in error branch of emac_tx_fill_tpd().
But the freed skb is still used via skb->len by netdev_sent_queue(,skb->len).

As i observed that emac_tx_fill_tpd() haven't modified the value of skb->len,
thus my patch assigns skb->len to 'len' before the possible free and
use 'len' instead of skb->len later.

Fixes: b9b17debc69d2 ("net: emac: emac gigabit ethernet controller driver")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/sched: act_ct: fix wild memory access when clearing fragments
Davide Caratti [Mon, 26 Apr 2021 15:45:51 +0000 (17:45 +0200)]
net/sched: act_ct: fix wild memory access when clearing fragments

while testing re-assembly/re-fragmentation using act_ct, it's possible to
observe a crash like the following one:

 KASAN: maybe wild-memory-access in range [0x0001000000000448-0x000100000000044f]
 CPU: 50 PID: 0 Comm: swapper/50 Tainted: G S                5.12.0-rc7+ #424
 Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
 RIP: 0010:inet_frag_rbtree_purge+0x50/0xc0
 Code: 00 fc ff df 48 89 c3 31 ed 48 89 df e8 a9 7a 38 ff 4c 89 fe 48 89 df 49 89 c6 e8 5b 3a 38 ff 48 8d 7b 40 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00 75 59 48 8d bb d0 00 00 00 4c 8b 6b 40 48 89 f8 48
 RSP: 0018:ffff888c31449db8 EFLAGS: 00010203
 RAX: 0000200000000089 RBX: 000100000000040e RCX: ffffffff989eb960
 RDX: 0000000000000140 RSI: ffffffff97cfb977 RDI: 000100000000044e
 RBP: 0000000000000900 R08: 0000000000000000 R09: ffffed1186289350
 R10: 0000000000000003 R11: ffffed1186289350 R12: dffffc0000000000
 R13: 000100000000040e R14: 0000000000000000 R15: ffff888155e02160
 FS:  0000000000000000(0000) GS:ffff888c31440000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00005600cb70a5b8 CR3: 0000000a2c014005 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <IRQ>
  inet_frag_destroy+0xa9/0x150
  call_timer_fn+0x2d/0x180
  run_timer_softirq+0x4fe/0xe70
  __do_softirq+0x197/0x5a0
  irq_exit_rcu+0x1de/0x200
  sysvec_apic_timer_interrupt+0x6b/0x80
  </IRQ>

when act_ct temporarily stores an IP fragment, restoring the skb qdisc cb
results in putting random data in FRAG_CB(), and this causes those "wild"
memory accesses later, when the rbtree is purged. Never overwrite the skb
cb in case tcf_ct_handle_fragments() returns -EINPROGRESS.

Fixes: ae372cb1750f ("net/sched: act_ct: fix restore the qdisc_skb_cb after defrag")
Fixes: 7baf2429a1a9 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: Fix typo in comment about ancillary data
Arnaldo Carvalho de Melo [Mon, 26 Apr 2021 12:44:47 +0000 (09:44 -0300)]
net: Fix typo in comment about ancillary data

Ingo sent typo fixes for tools/ and this resulted in a warning when
building the perf/core branch that will be sent upstream in the next
merge window:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Fix the typo on the kernel file to address this.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: convert rockchip-dwmac to json-schema
Ezequiel Garcia [Mon, 26 Apr 2021 02:41:18 +0000 (23:41 -0300)]
dt-bindings: net: convert rockchip-dwmac to json-schema

Convert Rockchip dwmac controller dt-bindings to YAML.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Reviewed-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: dwmac: Add Rockchip DWMAC support
Ezequiel Garcia [Mon, 26 Apr 2021 02:41:17 +0000 (23:41 -0300)]
dt-bindings: net: dwmac: Add Rockchip DWMAC support

Add Rockchip DWMAC controllers, which are based on snps,dwmac.
Some of the SoCs require up to eight clocks, so maxItems
for clocks and clock-names need to be increased.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Reviewed-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoarm64: dts: rockchip: Remove unnecessary reset in rk3328.dtsi
Ezequiel Garcia [Mon, 26 Apr 2021 02:41:16 +0000 (23:41 -0300)]
arm64: dts: rockchip: Remove unnecessary reset in rk3328.dtsi

Rockchip DWMAC glue driver uses the phy node (phy-handle)
reset specifier, and not a "mac-phy" reset specifier.

Remove it.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Reviewed-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hso: fix NULL-deref on disconnect regression
Johan Hovold [Mon, 26 Apr 2021 08:11:49 +0000 (10:11 +0200)]
net: hso: fix NULL-deref on disconnect regression

Commit 8a12f8836145 ("net: hso: fix null-ptr-deref during tty device
unregistration") fixed the racy minor allocation reported by syzbot, but
introduced an unconditional NULL-pointer dereference on every disconnect
instead.

Specifically, the serial device table must no longer be accessed after
the minor has been released by hso_serial_tty_unregister().

Fixes: 8a12f8836145 ("net: hso: fix null-ptr-deref during tty device unregistration")
Cc: stable@vger.kernel.org
Cc: Anirudh Rayabharam <mail@anirudhrb.com>
Reported-by: Leonardo Antoniazzi <leoanto@aruba.it>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Anirudh Rayabharam <mail@anirudhrb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'linux-can-next-for-5.13-20210426' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Mon, 26 Apr 2021 19:52:15 +0000 (12:52 -0700)]
Merge tag 'linux-can-next-for-5.13-20210426' of git://git./linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2021-04-26

this is a pull request of 4 patches for net-next/master.

the first two patches are from Colin Ian King and target the
etas_es58x driver, they add a missing NULL pointer check and fix some
typos.

The next two patches are by Erik Flodin. The first one updates the CAN
documentation regarding filtering, the other one fixes the header
alignment in CAN related proc output on 64 bit systems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: davicom: Remove redundant assignment to ret
Jiapeng Chong [Sun, 25 Apr 2021 10:42:56 +0000 (18:42 +0800)]
net: davicom: Remove redundant assignment to ret

Variable ret is set to zero but this value is never read as it is
overwritten with a new value later on, hence it is a redundant
assignment and can be removed.

Cleans up the following clang-analyzer warning:

drivers/net/ethernet/davicom/dm9000.c:1527:5: warning: Value stored to
'ret' is never read [clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agopcnet32: Remove redundant variable prev_link and curr_link
Jiapeng Chong [Sun, 25 Apr 2021 10:35:18 +0000 (18:35 +0800)]
pcnet32: Remove redundant variable prev_link and curr_link

Variable prev_link and curr_link is being assigned a value from a
calculation however the variable is never read, so this redundant
variable can be removed.

Cleans up the following clang-analyzer warning:

drivers/net/ethernet/amd/pcnet32.c:2857:4: warning: Value stored to
'prev_link' is never read [clang-analyzer-deadcode.DeadStores].

drivers/net/ethernet/amd/pcnet32.c:2856:4: warning: Value stored to
'curr_link' is never read [clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Mon, 26 Apr 2021 19:31:42 +0000 (12:31 -0700)]
Merge git://git./linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) The various ip(6)table_foo incarnations are updated to expect
   that the table is passed as 'void *priv' argument that netfilter core
   passes to the hook functions. This reduces the struct net size by 2
   cachelines on x86_64. From Florian Westphal.

2) Add cgroupsv2 support for nftables.

3) Fix bridge log family merge into nf_log_syslog: Missing
   unregistration from netns exit path, from Phil Sutter.

4) Add nft_pernet() helper to access nftables pernet area.

5) Add struct nfnl_info to reduce nfnetlink callback footprint and
   to facilite future updates. Consolidate nfnetlink callbacks.

6) Add CONFIG_NETFILTER_XTABLES_COMPAT Kconfig knob, also from Florian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
David S. Miller [Mon, 26 Apr 2021 19:00:00 +0000 (12:00 -0700)]
Merge git://git./linux/kernel/git/netdev/net

3 years agonetfilter: allow to turn off xtables compat layer
Florian Westphal [Mon, 26 Apr 2021 10:14:40 +0000 (12:14 +0200)]
netfilter: allow to turn off xtables compat layer

The compat layer needs to parse untrusted input (the ruleset)
to translate it to a 64bit compatible format.

We had a number of bugs in this department in the past, so allow users
to turn this feature off.

Add CONFIG_NETFILTER_XTABLES_COMPAT kconfig knob and make it default to y
to keep existing behaviour.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: nfnetlink: consolidate callback types
Pablo Neira Ayuso [Thu, 22 Apr 2021 22:17:12 +0000 (00:17 +0200)]
netfilter: nfnetlink: consolidate callback types

Add enum nfnl_callback_type to identify the callback type to provide one
single callback.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: nfnetlink: pass struct nfnl_info to batch callbacks
Pablo Neira Ayuso [Thu, 22 Apr 2021 22:17:11 +0000 (00:17 +0200)]
netfilter: nfnetlink: pass struct nfnl_info to batch callbacks

Update batch callbacks to use the nfnl_info structure. Rename one
clashing info variable to expr_info.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: nfnetlink: pass struct nfnl_info to rcu callbacks
Pablo Neira Ayuso [Thu, 22 Apr 2021 22:17:10 +0000 (00:17 +0200)]
netfilter: nfnetlink: pass struct nfnl_info to rcu callbacks

Update rcu callbacks to use the nfnl_info structure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: nfnetlink: add struct nfnl_info and pass it to callbacks
Pablo Neira Ayuso [Thu, 22 Apr 2021 22:17:09 +0000 (00:17 +0200)]
netfilter: nfnetlink: add struct nfnl_info and pass it to callbacks

Add a new structure to reduce callback footprint and to facilite
extensions of the nfnetlink callback interface in the future.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: nftables: add nft_pernet() helper function
Pablo Neira Ayuso [Thu, 22 Apr 2021 22:17:08 +0000 (00:17 +0200)]
netfilter: nftables: add nft_pernet() helper function

Consolidate call to net_generic(net, nf_tables_net_id) in this
wrapper function.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agoMerge branch 'bnxt_en-next'
David S. Miller [Mon, 26 Apr 2021 01:37:39 +0000 (18:37 -0700)]
Merge branch 'bnxt_en-next'

Michael Chan says:

====================
bnxt_en: Updates for net-next.

This series includes these main enhancements:

1. Link related changes
    - add NRZ/PAM4 link signal mode to the link up message if known
    - rely on firmware to bring down the link during ifdown

2. SRIOV related changes
    - allow VF promiscuous mode if the VF is trusted
    - allow ndo operations to configure VF when the PF is ifdown
    - fix the scenario of the VF taking back control of it's MAC address
    - add Hyper-V VF device IDs

3. Support the option to transmit without FCS/CRC.

4. Implement .ndo_features_check() to disable offload when the UDP
   encap. packets are not supported.

v2: Patch10: Reverse the check for supported UDP ports to be more straight
forward.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Implement .ndo_features_check().
Michael Chan [Sun, 25 Apr 2021 17:45:27 +0000 (13:45 -0400)]
bnxt_en: Implement .ndo_features_check().

For UDP encapsultions, we only support the offloaded Vxlan port and
Geneve port.  All other ports included FOU and GUE are not supported so
we need to turn off TSO and checksum features.

v2: Reverse the check for supported UDP ports to be more straight forward.

Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Support IFF_SUPP_NOFCS feature to transmit without ethernet FCS.
Michael Chan [Sun, 25 Apr 2021 17:45:26 +0000 (13:45 -0400)]
bnxt_en: Support IFF_SUPP_NOFCS feature to transmit without ethernet FCS.

If firmware is capable, set the IFF_SUPP_NOFCS flag to support the
sockets option to transmit packets without FCS.  This is mainly used
for testing.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Add PCI IDs for Hyper-V VF devices.
Michael Chan [Sun, 25 Apr 2021 17:45:25 +0000 (13:45 -0400)]
bnxt_en: Add PCI IDs for Hyper-V VF devices.

Support VF device IDs used by the Hyper-V hypervisor.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Call bnxt_approve_mac() after the PF gives up control of the VF MAC.
Michael Chan [Sun, 25 Apr 2021 17:45:24 +0000 (13:45 -0400)]
bnxt_en: Call bnxt_approve_mac() after the PF gives up control of the VF MAC.

When the PF is no longer enforcing an assigned MAC address on a VF, the
VF needs to call bnxt_approve_mac() to tell the PF what MAC address it is
now using.  Otherwise it gets out of sync and the PF won't know what
MAC address the VF wants to use.  Ultimately the VF will fail when it
tries to setup the L2 MAC filter for the vnic.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Move bnxt_approve_mac().
Michael Chan [Sun, 25 Apr 2021 17:45:23 +0000 (13:45 -0400)]
bnxt_en: Move bnxt_approve_mac().

Move it before bnxt_update_vf_mac().  In the next patch, we need to call
bnxt_approve_mac() from bnxt_update_mac() under some conditions.  This
will avoid forward declaration.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: allow VF config ops when PF is closed
Edwin Peer [Sun, 25 Apr 2021 17:45:22 +0000 (13:45 -0400)]
bnxt_en: allow VF config ops when PF is closed

It is perfectly legal for the stack to query and configure VFs via PF
NDOs while the NIC is administratively down.  Remove the unnecessary
check for the PF to be in open state.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: allow promiscuous mode for trusted VFs
Edwin Peer [Sun, 25 Apr 2021 17:45:21 +0000 (13:45 -0400)]
bnxt_en: allow promiscuous mode for trusted VFs

Firmware previously only allowed promiscuous mode for VFs associated with
a default VLAN. It is now possible to enable promiscuous mode for a VF
having no VLAN configured provided that it is trusted. In such cases the
VF will see all packets received by the PF, irrespective of destination
MAC or VLAN.

Note, it is necessary to query firmware at the time of bnxt_promisc_ok()
instead of in bnxt_hwrm_func_qcfg() because the trusted status might be
altered by the PF after the VF has been configured. This check must now
also be deferred because the firmware call sleeps.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Add support for fw managed link down feature.
Michael Chan [Sun, 25 Apr 2021 17:45:20 +0000 (13:45 -0400)]
bnxt_en: Add support for fw managed link down feature.

In the current code, the driver will not shutdown the link during
IFDOWN if there are still VFs sharing the port.  Newer firmware will
manage the link down decision when the port is shared by VFs, so
we can just call firmware to shutdown the port unconditionally and
let firmware make the final decision.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Add a new phy_flags field to the main driver structure.
Michael Chan [Sun, 25 Apr 2021 17:45:19 +0000 (13:45 -0400)]
bnxt_en: Add a new phy_flags field to the main driver structure.

Copy the phy related feature flags from the firmware call
HWRM_PORT_PHY_QCAPS to this new field.  We can also remove the flags
field in the bnxt_test_info structure.  It's cleaner to have all PHY
related flags in one location, directly copied from the firmware.

To keep the BNXT_PHY_CFG_ABLE() macro logic the same, we need to make
a slight adjustment to check that it is a PF.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: report signal mode in link up messages
Edwin Peer [Sun, 25 Apr 2021 17:45:18 +0000 (13:45 -0400)]
bnxt_en: report signal mode in link up messages

Firmware reports link signalling mode for certain speeds. In these
cases, print the signalling modes in kernel log link up messages.

Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomacvlan: Add nodst option to macvlan type source
Jethro Beekman [Sun, 25 Apr 2021 09:22:03 +0000 (11:22 +0200)]
macvlan: Add nodst option to macvlan type source

The default behavior for source MACVLAN is to duplicate packets to
appropriate type source devices, and then do the normal destination MACVLAN
flow. This patch adds an option to skip destination MACVLAN processing if
any matching source MACVLAN device has the option set.

This allows setting up a "catch all" device for source MACVLAN: create one
or more devices with type source nodst, and one device with e.g. type vepa,
and incoming traffic will be received on exactly one device.

v2: netdev wants non-standard line length

Signed-off-by: Jethro Beekman <kernel@jbeekman.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'mlx5-updates-2021-04-21' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Mon, 26 Apr 2021 01:31:35 +0000 (18:31 -0700)]
Merge tag 'mlx5-updates-2021-04-21' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-04-21

devlink external port attribute for SF (Sub-Function) port flavour

This adds the support to instantiate Sub-Functions on external hosts
E.g when Eswitch manager is enabled on the ARM SmarNic SoC CPU, users
are now able to spawn new Sub-Functions on the Host server CPU.

Parav Pandit Says:
==================

This series introduces and uses external attribute for the SF port to
indicate that a SF port belongs to an external controller.

This is needed to generate unique phys_port_name when PF and SF numbers
are overlapping between local and external controllers.
For example two controllers 0 and 1, both of these controller have a SF.
having PF number 0, SF number 77. Here, phys_port_name has duplicate
entry which doesn't have controller number in it.

Hence, add controller number optionally when a SF port is for an
external controller. This extension is similar to existing PF and VF
eswitch ports of the external controller.

When a SF is for external controller an example view of external SF
port and config sequence:

On eswitch system:
$ devlink dev eswitch set pci/0033:01:00.0 mode switchdev

$ devlink port show
pci/0033:01:00.0/196607: type eth netdev enP51p1s0f0np0 flavour physical port 0 splittable false
pci/0033:01:00.0/131072: type eth netdev eth0 flavour pcipf controller 1 pfnum 0 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port add pci/0033:01:00.0 flavour pcisf pfnum 0 sfnum 77 controller 1
pci/0033:01:00.0/163840: type eth netdev eth1 flavour pcisf controller 1 pfnum 0 sfnum 77 splittable false
  function:
    hw_addr 00:00:00:00:00:00 state inactive opstate detached

phys_port_name construction:
$ cat /sys/class/net/eth1/phys_port_name
c1pf0sf77

Patch summary:
First 3 patches prepares the eswitch to handle vports in more generic
way using xarray to lookup vport from its unique vport number.
Patch-1 returns maximum eswitch ports only when eswitch is enabled
Patch-2 prepares eswitch to return eswitch max ports from a struct
Patch-3 uses xarray for vport and representor lookup
Patch-4 considers SF for an additioanl range of SF vports
Patch-5 relies on SF hw table to check SF support
Patch-6 extends SF devlink port attribute for external flag
Patch-7 stores the per controller SF allocation attributes
Patch-8 uses SF function id for filtering events
Patch-9 uses helper for allocation and free
Patch-10 splits hw table into per controller table and generic one
Patch-11 extends sf table for additional range

==================

====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: ixp4xx: Support device tree probing
Linus Walleij [Sun, 25 Apr 2021 00:30:38 +0000 (02:30 +0200)]
net: ethernet: ixp4xx: Support device tree probing

This adds device tree probing to the IXP4xx ethernet
driver.

Add a platform data bool to tell us whether to
register an MDIO bus for the device or not, as well
as the corresponding NPE.

We need to drop the memory region request as part of
this since the OF core will request the memory for the
device.

Cc: Zoltan HERPAI <wigyori@uid0.hu>
Cc: Raylynn Knight <rayknight@me.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: ixp4xx: Retire ancient phy retrieveal
Linus Walleij [Sun, 25 Apr 2021 00:30:37 +0000 (02:30 +0200)]
net: ethernet: ixp4xx: Retire ancient phy retrieveal

This driver was using a really dated way of obtaining the
phy by printing a string and using it with phy_connect().
Switch to using more reasonable modern interfaces.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: ixp4xx: Add DT bindings
Linus Walleij [Sun, 25 Apr 2021 00:30:36 +0000 (02:30 +0200)]
net: ethernet: ixp4xx: Add DT bindings

This adds device tree bindings for the IXP4xx ethernet
controller with optional MDIO bridge.

Cc: Zoltan HERPAI <wigyori@uid0.hu>
Cc: Raylynn Knight <rayknight@me.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agor8152: remove some bit operations
Hayes Wang [Sat, 24 Apr 2021 06:09:03 +0000 (14:09 +0800)]
r8152: remove some bit operations

Remove DELL_TB_RX_AGG_BUG and LENOVO_MACPASSTHRU flags of rtl8152_flags.
They are only set when initializing and wouldn't be change. It is enough
to record them with variables.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonetfilter: nf_log_syslog: Unset bridge logger in pernet exit
Phil Sutter [Wed, 21 Apr 2021 10:34:21 +0000 (12:34 +0200)]
netfilter: nf_log_syslog: Unset bridge logger in pernet exit

Without this, a stale pointer remains in pernet loggers after module
unload causing a kernel oops during dereference. Easily reproduced by:

| # modprobe nf_log_syslog
| # rmmod nf_log_syslog
| # cat /proc/net/netfilter/nf_log

Fixes: 77ccee96a6742 ("netfilter: nf_log_bridge: merge with nf_log_syslog")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: remove all xt_table anchors from struct net
Florian Westphal [Wed, 21 Apr 2021 07:51:10 +0000 (09:51 +0200)]
netfilter: remove all xt_table anchors from struct net

No longer needed, table pointer arg is now passed via netfilter core.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: ip6_tables: pass table pointer via nf_hook_ops
Florian Westphal [Wed, 21 Apr 2021 07:51:09 +0000 (09:51 +0200)]
netfilter: ip6_tables: pass table pointer via nf_hook_ops

Same patch as the ip_tables one: removal of all accesses to ip6_tables
xt_table pointers.  After this patch the struct net xt_table anchors
can be removed.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: arp_tables: pass table pointer via nf_hook_ops
Florian Westphal [Wed, 21 Apr 2021 07:51:08 +0000 (09:51 +0200)]
netfilter: arp_tables: pass table pointer via nf_hook_ops

Same change as previous patch.  Only difference:
no need to handle NULL template_ops parameter, the only caller
(arptable_filter) always passes non-NULL argument.

This removes all remaining accesses to net->ipv4.arptable_filter.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: ip_tables: pass table pointer via nf_hook_ops
Florian Westphal [Wed, 21 Apr 2021 07:51:07 +0000 (09:51 +0200)]
netfilter: ip_tables: pass table pointer via nf_hook_ops

iptable_x modules rely on 'struct net' to contain a pointer to the
table that should be evaluated.

In order to remove these pointers from struct net, pass them via
the 'priv' pointer in a similar fashion as nf_tables passes the
rule data.

To do that, duplicate the nf_hook_info array passed in from the
iptable_x modules, update the ops->priv pointers of the copy to
refer to the table and then change the hookfn implementations to
just pass the 'priv' argument to the traverser.

After this patch, the xt_table pointers can already be removed
from struct net.

However, changes to struct net result in re-compile of the entire
network stack, so do the removal after arptables and ip6tables
have been converted as well.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: xt_nat: pass table to hookfn
Florian Westphal [Wed, 21 Apr 2021 07:51:06 +0000 (09:51 +0200)]
netfilter: xt_nat: pass table to hookfn

This changes how ip(6)table nat passes the ruleset/table to the
evaluation loop.

At the moment, it will fetch the table from struct net.

This change stores the table in the hook_ops 'priv' argument
instead.

This requires to duplicate the hook_ops for each netns, so
they can store the (per-net) xt_table structure.

The dupliated nat hook_ops get stored in net_generic data area.
They are free'd in the namespace exit path.

This is a pre-requisite to remove the xt_table/ruleset pointers
from struct net.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: x_tables: remove paranoia tests
Florian Westphal [Wed, 21 Apr 2021 07:51:05 +0000 (09:51 +0200)]
netfilter: x_tables: remove paranoia tests

No need for these.
There is only one caller, the xtables core, when the table is registered
for the first time with a particular network namespace.

After ->table_init() call, the table is linked into the tables[af] list,
so next call to that function will skip the ->table_init().

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: arptables: unregister the tables by name
Florian Westphal [Wed, 21 Apr 2021 07:51:04 +0000 (09:51 +0200)]
netfilter: arptables: unregister the tables by name

and again, this time for arptables.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: ip6tables: unregister the tables by name
Florian Westphal [Wed, 21 Apr 2021 07:51:03 +0000 (09:51 +0200)]
netfilter: ip6tables: unregister the tables by name

Same as the previous patch, but for ip6tables.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: iptables: unregister the tables by name
Florian Westphal [Wed, 21 Apr 2021 07:51:02 +0000 (09:51 +0200)]
netfilter: iptables: unregister the tables by name

xtables stores the xt_table structs in the struct net.  This isn't
needed anymore, the structures could be passed via the netfilter hook
'private' pointer to the hook functions, which would allow us to remove
those pointers from struct net.

As a first step, reduce the number of accesses to the
net->ipv4.ip6table_{raw,filter,...} pointers.
This allows the tables to get unregistered by name instead of having to
pass the raw address.

The xt_table structure cane looked up by name+address family instead.

This patch is useless as-is (the backends still have the raw pointer
address), but it lowers the bar to remove those.

It also allows to put the 'was table registered in the first place' check
into ip_tables.c rather than have it in each table sub module.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: x_tables: add xt_find_table
Florian Westphal [Wed, 21 Apr 2021 07:51:01 +0000 (09:51 +0200)]
netfilter: x_tables: add xt_find_table

This will be used to obtain the xt_table struct given address family and
table name.

Followup patches will reduce the number of direct accesses to the xt_table
structures via net->ipv{4,6}.ip(6)table_{nat,mangle,...} pointers, then
remove them.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: x_tables: remove ipt_unregister_table
Florian Westphal [Wed, 21 Apr 2021 07:51:00 +0000 (09:51 +0200)]
netfilter: x_tables: remove ipt_unregister_table

Its the same function as ipt_unregister_table_exit.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: ebtables: remove the 3 ebtables pointers from struct net
Florian Westphal [Wed, 21 Apr 2021 07:50:59 +0000 (09:50 +0200)]
netfilter: ebtables: remove the 3 ebtables pointers from struct net

ebtables stores the table internal data (what gets passed to the
ebt_do_table() interpreter) in struct net.

nftables keeps the internal interpreter format in pernet lists
and passes it via the netfilter core infrastructure (priv pointer).

Do the same for ebtables: the nf_hook_ops are duplicated via kmemdup,
then the ops->priv pointer is set to the table that is being registered.

After that, the netfilter core passes this table info to the hookfn.

This allows to remove the pointers from struct net.

Same pattern can be applied to ip/ip6/arptables.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: disable defrag once its no longer needed
Florian Westphal [Wed, 21 Apr 2021 07:45:40 +0000 (09:45 +0200)]
netfilter: disable defrag once its no longer needed

When I changed defrag hooks to no longer get registered by default I
intentionally made it so that registration can only be un-done by unloading
the nf_defrag_ipv4/6 module.

In hindsight this was too conservative; there is no reason to keep defrag
on while there is no feature dependency anymore.

Moreover, this won't work if user isn't allowed to remove nf_defrag module.

This adds the disable() functions for both ipv4 and ipv6 and calls them
from conntrack, TPROXY and the xtables socket module.

ipvs isn't converted here, it will behave as before this patch and
will need module removal.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: nft_socket: add support for cgroupsv2
Pablo Neira Ayuso [Tue, 20 Apr 2021 23:12:44 +0000 (01:12 +0200)]
netfilter: nft_socket: add support for cgroupsv2

Allow to match on the cgroupsv2 id from ancestor level.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agonetfilter: nat: move nf_xfrm_me_harder to where it is used
Florian Westphal [Mon, 19 Apr 2021 16:16:49 +0000 (18:16 +0200)]
netfilter: nat: move nf_xfrm_me_harder to where it is used

remove the export and make it static.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agohv_netvsc: Make netvsc/VF binding check both MAC and serial number
Dexuan Cui [Sat, 24 Apr 2021 01:12:35 +0000 (18:12 -0700)]
hv_netvsc: Make netvsc/VF binding check both MAC and serial number

Currently the netvsc/VF binding logic only checks the PCI serial number.

The Microsoft Azure Network Adapter (MANA) supports multiple net_device
interfaces (each such interface is called a "vPort", and has its unique
MAC address) which are backed by the same VF PCI device, so the binding
logic should check both the MAC address and the PCI serial number.

The change should not break any other existing VF drivers, because
Hyper-V NIC SR-IOV implementation requires the netvsc network
interface and the VF network interface have the same MAC address.

Co-developed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Co-developed-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Shachar Raindel <shacharr@microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: Fix RX consumer index logic in the error path.
Michael Chan [Fri, 23 Apr 2021 22:13:19 +0000 (18:13 -0400)]
bnxt_en: Fix RX consumer index logic in the error path.

In bnxt_rx_pkt(), the RX buffers are expected to complete in order.
If the RX consumer index indicates an out of order buffer completion,
it means we are hitting a hardware bug and the driver will abort all
remaining RX packets and reset the RX ring.  The RX consumer index
that we pass to bnxt_discard_rx() is not correct.  We should be
passing the current index (tmp_raw_cons) instead of the old index
(raw_cons).  This bug can cause us to be at the wrong index when
trying to abort the next RX packet.  It can crash like this:

 #0 [ffff9bbcdf5c39a8] machine_kexec at ffffffff9b05e007
 #1 [ffff9bbcdf5c3a00] __crash_kexec at ffffffff9b111232
 #2 [ffff9bbcdf5c3ad0] panic at ffffffff9b07d61e
 #3 [ffff9bbcdf5c3b50] oops_end at ffffffff9b030978
 #4 [ffff9bbcdf5c3b78] no_context at ffffffff9b06aaf0
 #5 [ffff9bbcdf5c3bd8] __bad_area_nosemaphore at ffffffff9b06ae2e
 #6 [ffff9bbcdf5c3c28] bad_area_nosemaphore at ffffffff9b06af24
 #7 [ffff9bbcdf5c3c38] __do_page_fault at ffffffff9b06b67e
 #8 [ffff9bbcdf5c3cb0] do_page_fault at ffffffff9b06bb12
 #9 [ffff9bbcdf5c3ce0] page_fault at ffffffff9bc015c5
    [exception RIP: bnxt_rx_pkt+237]
    RIP: ffffffffc0259cdd  RSP: ffff9bbcdf5c3d98  RFLAGS: 00010213
    RAX: 000000005dd8097f  RBX: ffff9ba4cb11b7e0  RCX: ffffa923cf6e9000
    RDX: 0000000000000fff  RSI: 0000000000000627  RDI: 0000000000001000
    RBP: ffff9bbcdf5c3e60   R8: 0000000000420003   R9: 000000000000020d
    R10: ffffa923cf6ec138  R11: ffff9bbcdf5c3e83  R12: ffff9ba4d6f928c0
    R13: ffff9ba4cac28080  R14: ffff9ba4cb11b7f0  R15: ffff9ba4d5a30000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Fixes: a1b0e4e684e9 ("bnxt_en: Improve RX consumer index validity check.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoch_ktls: Remove redundant variable result
Jiapeng Chong [Fri, 23 Apr 2021 09:52:23 +0000 (17:52 +0800)]
ch_ktls: Remove redundant variable result

Variable result is being assigned a value from a calculation
however the variable is never read, so this redundant variable
can be removed.

Cleans up the following clang-analyzer warning:

drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1488:2:
warning: Value stored to 'pos' is never read
[clang-analyzer-deadcode.DeadStores].

drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:876:3:
warning: Value stored to 'pos' is never read
[clang-analyzer-deadcode.DeadStores].

drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:36:3:
warning: Value stored to 'start' is never read
[clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller [Mon, 26 Apr 2021 01:02:32 +0000 (18:02 -0700)]
Merge git://git./linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-04-23

The following pull-request contains BPF updates for your *net-next* tree.

We've added 69 non-merge commits during the last 22 day(s) which contain
a total of 69 files changed, 3141 insertions(+), 866 deletions(-).

The main changes are:

1) Add BPF static linker support for extern resolution of global, from Andrii.

2) Refine retval for bpf_get_task_stack helper, from Dave.

3) Add a bpf_snprintf helper, from Florent.

4) A bunch of miscellaneous improvements from many developers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agocan: proc: fix rcvlist_* header alignment on 64-bit system
Erik Flodin [Sun, 25 Apr 2021 14:14:35 +0000 (16:14 +0200)]
can: proc: fix rcvlist_* header alignment on 64-bit system

Before this fix, the function and userdata columns weren't aligned:
  device   can_id   can_mask  function  userdata   matches  ident
   vcan0  92345678  9fffffff  0000000000000000  0000000000000000         0  raw
   vcan0     123    00000123  0000000000000000  0000000000000000         0  raw

After the fix they are:
  device   can_id   can_mask      function          userdata       matches  ident
   vcan0  92345678  9fffffff  0000000000000000  0000000000000000         0  raw
   vcan0     123    00000123  0000000000000000  0000000000000000         0  raw

Link: https://lore.kernel.org/r/20210425141440.229653-1-erik@flodin.me
Signed-off-by: Erik Flodin <erik@flodin.me>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: add a note that RECV_OWN_MSGS frames are subject to filtering
Erik Flodin [Tue, 20 Apr 2021 19:12:00 +0000 (21:12 +0200)]
can: add a note that RECV_OWN_MSGS frames are subject to filtering

Some parts of the documentation may lead the reader to think that the
socket's own frames are always received when CAN_RAW_RECV_OWN_MSGS is
enabled, but all frames are subject to filtering.

As explained by Marc Kleine-Budde:

On TX complete of a CAN frame it's pushed into the RX path of the
networking stack, along with the information of the originating socket.

Then the CAN frame is delivered into AF_CAN, where it is passed on to
all registered receivers depending on filters. One receiver is the
sending socket in CAN_RAW. Then in CAN_RAW the it is checked if the
sending socket has RECV_OWN_MSGS enabled.

Link: https://lore.kernel.org/r/20210420191212.42753-1-erik@flodin.me
Signed-off-by: Erik Flodin <erik@flodin.me>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: etas_es58x: Fix a couple of spelling mistakes
Colin Ian King [Thu, 15 Apr 2021 11:30:50 +0000 (12:30 +0100)]
can: etas_es58x: Fix a couple of spelling mistakes

There are spelling mistakes in netdev_dbg and netdev_dbg messages,
fix these.

Link: https://lore.kernel.org/r/20210415113050.1942333-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: etas_es58x: Fix missing null check on netdev pointer
Colin Ian King [Thu, 15 Apr 2021 08:47:23 +0000 (09:47 +0100)]
can: etas_es58x: Fix missing null check on netdev pointer

There is an assignment to *netdev that is that can potentially be null
but the null check is checking netdev and not *netdev as intended. Fix
this by adding in the missing * operator.

Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces")
Link: https://lore.kernel.org/r/20210415084723.1807935-1-colin.king@canonical.com
Addresses-Coverity: ("Dereference before null check")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agonet/mlx5: SF, Extend SF table for additional SF id range
Parav Pandit [Mon, 8 Mar 2021 11:19:53 +0000 (13:19 +0200)]
net/mlx5: SF, Extend SF table for additional SF id range

Extended the SF table to cover additioanl SF id range of external
controller.

A user optionallly provides the external controller number when user
wants to create SF on the external controller.

An example on eswitch system:
$ devlink dev eswitch set pci/0033:01:00.0 mode switchdev

$ devlink port show
pci/0033:01:00.0/196607: type eth netdev enP51p1s0f0np0 flavour physical port 0 splittable false
pci/0033:01:00.0/131072: type eth netdev eth0 flavour pcipf controller 1 pfnum 0 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port add pci/0033:01:00.0 flavour pcisf pfnum 0 sfnum 77 controller 1
pci/0033:01:00.0/163840: type eth netdev eth1 flavour pcisf controller 1 pfnum 0 sfnum 77 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00 state inactive opstate detached

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: SF, Split mlx5_sf_hw_table into two parts
Parav Pandit [Fri, 5 Mar 2021 08:06:06 +0000 (10:06 +0200)]
net/mlx5: SF, Split mlx5_sf_hw_table into two parts

Device has SF ids in two different contiguous ranges. One for the local
controller and second for the external controller's PF.

Each such range has its own maximum number of functions and base id.
To allocate SF from either of the range, prepare code to split into
range specific fields into its own structure.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: SF, Use helpers for allocation and free
Parav Pandit [Fri, 5 Mar 2021 07:35:21 +0000 (09:35 +0200)]
net/mlx5: SF, Use helpers for allocation and free

Use helper routines for SF id and SF table allocation and free
so that subsequent patch can reuse it for multiple SF function
id range.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: SF, Consider own vhca events of SF devices
Parav Pandit [Thu, 11 Mar 2021 11:51:38 +0000 (13:51 +0200)]
net/mlx5: SF, Consider own vhca events of SF devices

Vhca events on eswitch manager are received for all the functions on the
NIC, including for SFs of external host PF controllers.

While SF device handler is only interested in SF devices events related
to its own PF.
Hence, validate if the function belongs to self or not.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: SF, Store and use start function id
Parav Pandit [Fri, 5 Mar 2021 06:51:10 +0000 (08:51 +0200)]
net/mlx5: SF, Store and use start function id

SF ids in the device are in two different contiguous ranges. One for
the local controller and second for the external host controller.

Prepare code to handle multiple start function id by storing it in the
table.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agodevlink: Extend SF port attributes to have external attribute
Parav Pandit [Wed, 10 Mar 2021 13:35:03 +0000 (15:35 +0200)]
devlink: Extend SF port attributes to have external attribute

Extended SF port attributes to have optional external flag similar to
PCI PF and VF port attributes.

External atttibute is required to generate unique phys_port_name when PF number
and SF number are overlapping between two controllers similar to SR-IOV
VFs.

When a SF is for external controller an example view of external SF
port and config sequence.

On eswitch system:
$ devlink dev eswitch set pci/0033:01:00.0 mode switchdev

$ devlink port show
pci/0033:01:00.0/196607: type eth netdev enP51p1s0f0np0 flavour physical port 0 splittable false
pci/0033:01:00.0/131072: type eth netdev eth0 flavour pcipf controller 1 pfnum 0 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port add pci/0033:01:00.0 flavour pcisf pfnum 0 sfnum 77 controller 1
pci/0033:01:00.0/163840: type eth netdev eth1 flavour pcisf controller 1 pfnum 0 sfnum 77 splittable false
  function:
    hw_addr 00:00:00:00:00:00 state inactive opstate detached

phys_port_name construction:
$ cat /sys/class/net/eth1/phys_port_name
c1pf0sf77

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: SF, Rely on hw table for SF devlink port allocation
Parav Pandit [Mon, 8 Mar 2021 09:18:53 +0000 (11:18 +0200)]
net/mlx5: SF, Rely on hw table for SF devlink port allocation

Supporting SF allocation is currently checked at two places:
(a) SF devlink port allocator and
(b) SF HW table handler.

Both layers are using HCA CAP to identify it using helper routine
mlx5_sf_supported() and mlx5_sf_max_functions().

Instead, rely on the HW table handler to check if SF is supported
or not.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch, Consider SF ports of host PF
Parav Pandit [Tue, 2 Mar 2021 12:20:21 +0000 (14:20 +0200)]
net/mlx5: E-Switch, Consider SF ports of host PF

Query SF vports count and base id of host PF from the firmware.

Account these ports in the total port calculation whenever it is non
zero.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch, Use xarray for vport number to vport and rep mapping
Parav Pandit [Fri, 19 Mar 2021 03:21:31 +0000 (05:21 +0200)]
net/mlx5: E-Switch, Use xarray for vport number to vport and rep mapping

Currently vport number to vport and its representor are mapped using an
array and an index.

Vport numbers of different types of functions are not contiguous. Adding
new such discontiguous range using index and number mapping is increasingly
complex and hard to maintain.

Hence, maintain an xarray of vport and rep whose lookup is done based on
the vport number.
Each VF and SF entry is marked with a xarray mark to identify the function
type. Additionally PF and VF needs special handling for legacy inline
mode. They are additionally marked as host function using additional
HOST_FN mark.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch, Prepare to return total vports from eswitch struct
Parav Pandit [Tue, 2 Mar 2021 12:10:49 +0000 (14:10 +0200)]
net/mlx5: E-Switch, Prepare to return total vports from eswitch struct

Total vports are already stored during eswitch initialization. Instead
of calculating everytime, read directly from eswitch.

Additionally, host PF's SF vport information is available using
QUERY_HCA_CAP command. It is not available through HCA_CAP of the
eswitch manager PF.
Hence, this patch prepares the return total eswitch vport count from the
existing eswitch struct.

This further helps to keep eswitch port counting macros and logic within
eswitch.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch, Return eswitch max ports when eswitch is supported
Parav Pandit [Tue, 2 Mar 2021 11:54:42 +0000 (13:54 +0200)]
net/mlx5: E-Switch, Return eswitch max ports when eswitch is supported

mlx5_eswitch_get_total_vports() doesn't honor MLX5_ESWICH Kconfig flag.

When MLX5_ESWITCH is disabled, FS layer continues to initialize eswitch
specific ACL namespaces.
Instead, start honoring MLX5_ESWITCH flag and perform vport specific
initialization only when vport count is non zero.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agobpf: Document the pahole release info related to libbpf in bpf_devel_QA.rst
Tiezhu Yang [Fri, 23 Apr 2021 01:23:30 +0000 (09:23 +0800)]
bpf: Document the pahole release info related to libbpf in bpf_devel_QA.rst

pahole starts to use libbpf definitions and APIs since v1.13 after the
commit 21507cd3e97b ("pahole: add libbpf as submodule under lib/bpf").
It works well with the git repository because the libbpf submodule will
use "git submodule update --init --recursive" to update.

Unfortunately, the default github release source code does not contain
libbpf submodule source code and this will cause build issues, the tarball
from https://git.kernel.org/pub/scm/devel/pahole/pahole.git/ is same with
github, you can get the source tarball with corresponding libbpf submodule
codes from

https://fedorapeople.org/~acme/dwarves

This change documents the above issues to give more information so that
we can get the tarball from the right place, early discussion is here:

https://lore.kernel.org/bpf/2de4aad5-fa9e-1c39-3c92-9bb9229d0966@loongson.cn/

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/bpf/1619141010-12521-1-git-send-email-yangtiezhu@loongson.cn
3 years agophy: nxp-c45-tja11xx: add interrupt support
Radu Pirea (NXP OSS) [Fri, 23 Apr 2021 15:00:50 +0000 (18:00 +0300)]
phy: nxp-c45-tja11xx: add interrupt support

Added .config_intr and .handle_interrupt callbacks.

Link event interrupt will trigger an interrupt every time when the link
goes up or down.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/atm: Fix spelling mistake "requed" -> "requeued"
Colin Ian King [Fri, 23 Apr 2021 13:28:36 +0000 (14:28 +0100)]
net/atm: Fix spelling mistake "requed" -> "requeued"

There is a spelling mistake in a printk message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests/net: bump timeout to 5 minutes
Po-Hsu Lin [Fri, 23 Apr 2021 11:15:38 +0000 (19:15 +0800)]
selftests/net: bump timeout to 5 minutes

We found that with the latest mainline kernel (5.12.0-051200rc8) on
some KVM instances / bare-metal systems, the following tests will take
longer than the kselftest framework default timeout (45 seconds) to
run and thus got terminated with TIMEOUT error:
* xfrm_policy.sh - took about 2m20s
* pmtu.sh - took about 3m5s
* udpgso_bench.sh - took about 60s

Bump the timeout setting to 5 minutes to allow them have a chance to
finish.

https://bugs.launchpad.net/bugs/1856010
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'mptcp-msg-flags'
David S. Miller [Fri, 23 Apr 2021 21:06:32 +0000 (14:06 -0700)]
Merge branch 'mptcp-msg-flags'

Mat Martineau says:

====================
mptcp: Compatibility with common msg flags

These patches from the MPTCP tree handle some of the msg flags that are
typically used with TCP, to make it easier to adapt userspace programs
for use with MPTCP.

Patches 1, 2, and 4 add support for MSG_ERRQUEUE (no-op for now),
MSG_TRUNC, and MSG_PEEK on the receive side.

Patch 3 ignores unsupported msg flags for send and receive.

Patch 5 adds a selftest for MSG_PEEK.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests: mptcp: add a test case for MSG_PEEK
Yonglong Li [Fri, 23 Apr 2021 18:17:09 +0000 (11:17 -0700)]
selftests: mptcp: add a test case for MSG_PEEK

Extend mptcp_connect tool with MSG_PEEK support and add a test case in
mptcp_connect.sh that checks the data received from/after recv() with
MSG_PEEK.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>