platform/kernel/linux-starfive.git
22 months agonet: lan966x: make reset optional
Michael Walle [Wed, 31 Aug 2022 11:18:55 +0000 (13:18 +0200)]
net: lan966x: make reset optional

There is no dedicated reset for just the switch core. The reset which
is used up until now, is more of a global reset, resetting almost the
whole SoC and cause spurious errors by doing so. Make it possible to
handle the reset elsewhere and make the reset optional.

Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agodt-bindings: net: sparx5: don't require a reset line
Michael Walle [Wed, 31 Aug 2022 11:18:54 +0000 (13:18 +0200)]
dt-bindings: net: sparx5: don't require a reset line

Make the reset line optional. It turns out, there is no dedicated reset
for the switch. Instead, the reset which was used up until now, was kind
of a global reset. This is now handled elsewhere, thus don't require a
reset.

Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agoipv6: tcp: send consistent autoflowlabel in SYN_RECV state
Eric Dumazet [Wed, 31 Aug 2022 20:37:29 +0000 (13:37 -0700)]
ipv6: tcp: send consistent autoflowlabel in SYN_RECV state

This is a followup of commit c67b85558ff2 ("ipv6: tcp: send consistent
autoflowlabel in TIME_WAIT state"), but for SYN_RECV state.

In some cases, TCP sends a challenge ACK on behalf of a SYN_RECV request.
WHen this happens, we want to use the flow label that was used when
the prior SYNACK packet was sent, instead of another one.

After his patch, following packetdrill passes:

    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

  +.2 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
   +0 > (flowlabel 0x11) S. 0:0(0) ack 1 <...>
// Test if a challenge ack is properly sent (same flowlabel than prior SYNACK)
   +.01 < . 4000000000:4000000000(0) ack 1 win 320
   +0  > (flowlabel 0x11) . 1:1(0) ack 1

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220831203729.458000-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: net: dsa: symlink the tc_actions.sh test
Vladimir Oltean [Wed, 31 Aug 2022 17:08:39 +0000 (20:08 +0300)]
selftests: net: dsa: symlink the tc_actions.sh test

This has been validated on the Ocelot/Felix switch family (NXP LS1028A)
and should be relevant to any switch driver that offloads the tc-flower
and/or tc-matchall actions trap, drop, accept, mirred, for which DSA has
operations.

TEST: gact drop and ok (skip_hw)                                    [ OK ]
TEST: mirred egress flower redirect (skip_hw)                       [ OK ]
TEST: mirred egress flower mirror (skip_hw)                         [ OK ]
TEST: mirred egress matchall mirror (skip_hw)                       [ OK ]
TEST: mirred_egress_to_ingress (skip_hw)                            [ OK ]
TEST: gact drop and ok (skip_sw)                                    [ OK ]
TEST: mirred egress flower redirect (skip_sw)                       [ OK ]
TEST: mirred egress flower mirror (skip_sw)                         [ OK ]
TEST: mirred egress matchall mirror (skip_sw)                       [ OK ]
TEST: trap (skip_sw)                                                [ OK ]
TEST: mirred_egress_to_ingress (skip_sw)                            [ OK ]

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220831170839.931184-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonetdevsim: remove redundant variable ret
Jinpeng Cui [Wed, 31 Aug 2022 15:43:29 +0000 (15:43 +0000)]
netdevsim: remove redundant variable ret

Return value directly from nsim_dev_reload_create()
instead of getting value from redundant variable ret.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Jinpeng Cui <cui.jinpeng2@zte.com.cn>
Link: https://lore.kernel.org/r/20220831154329.305372-1-cui.jinpeng2@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: rtnetlink: use netif_oper_up instead of open code
Juhee Kang [Wed, 31 Aug 2022 12:58:45 +0000 (21:58 +0900)]
net: rtnetlink: use netif_oper_up instead of open code

The open code is defined as a new helper function(netif_oper_up) on netdev.h,
the code is dev->operstate == IF_OPER_UP || dev->operstate == IF_OPER_UNKNOWN.
Thus, replace the open code to netif_oper_up. This patch doesn't change logic.

Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Link: https://lore.kernel.org/r/20220831125845.1333-1-claudiajkang@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: sched: etf: remove true check in etf_enable_offload()
Zhengchao Shao [Wed, 31 Aug 2022 09:29:19 +0000 (17:29 +0800)]
net: sched: etf: remove true check in etf_enable_offload()

etf_enable_offload() is only called when q->offload is false in
etf_init(). So remove true check in etf_enable_offload().

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://lore.kernel.org/r/20220831092919.146149-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 1 Sep 2022 19:58:02 +0000 (12:58 -0700)]
Merge git://git./linux/kernel/git/netdev/net

tools/testing/selftests/net/.gitignore
  sort the net-next version and use it

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge tag 'net-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 1 Sep 2022 16:20:42 +0000 (09:20 -0700)]
Merge tag 'net-6.0-rc4' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth, bpf and wireless.

  Current release - regressions:

   - bpf:
      - fix wrong last sg check in sk_msg_recvmsg()
      - fix kernel BUG in purge_effective_progs()

   - mac80211:
      - fix possible leak in ieee80211_tx_control_port()
      - potential NULL dereference in ieee80211_tx_control_port()

  Current release - new code bugs:

   - nfp: fix the access to management firmware hanging

  Previous releases - regressions:

   - ip: fix triggering of 'icmp redirect'

   - sched: tbf: don't call qdisc_put() while holding tree lock

   - bpf: fix corrupted packets for XDP_SHARED_UMEM

   - bluetooth: hci_sync: fix suspend performance regression

   - micrel: fix probe failure

  Previous releases - always broken:

   - tcp: make global challenge ack rate limitation per net-ns and
     default disabled

   - tg3: fix potential hang-up on system reboot

   - mac802154: fix reception for no-daddr packets

  Misc:

   - r8152: add PID for the lenovo onelink+ dock"

* tag 'net-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (56 commits)
  net/smc: Remove redundant refcount increase
  Revert "sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb"
  tcp: make global challenge ack rate limitation per net-ns and default disabled
  tcp: annotate data-race around challenge_timestamp
  net: dsa: hellcreek: Print warning only once
  ip: fix triggering of 'icmp redirect'
  sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb
  selftests: net: sort .gitignore file
  Documentation: networking: correct possessive "its"
  kcm: fix strp_init() order and cleanup
  mlxbf_gige: compute MDIO period based on i1clk
  ethernet: rocker: fix sleep in atomic context bug in neigh_timer_handler
  net: lan966x: improve error handle in lan966x_fdma_rx_get_frame()
  nfp: fix the access to management firmware hanging
  net: phy: micrel: Make the GPIO to be non-exclusive
  net: virtio_net: fix notification coalescing comments
  net/sched: fix netdevice reference leaks in attach_default_qdiscs()
  net: sched: tbf: don't call qdisc_put() while holding tree lock
  net: Use u64_stats_fetch_begin_irq() for stats fetch.
  net: dsa: xrs700x: Use irqsave variant for u64 stats update
  ...

22 months agoMerge tag 'slab-for-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka...
Linus Torvalds [Thu, 1 Sep 2022 16:14:56 +0000 (09:14 -0700)]
Merge tag 'slab-for-6.0-rc4' of git://git./linux/kernel/git/vbabka/slab

Pull slab fix from Vlastimil Babka:

 - A fix from Waiman Long to avoid a theoretical deadlock reported by
   lockdep.

* tag 'slab-for-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex/cpu_hotplug_lock

22 months agoMerge tag 'sound-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 1 Sep 2022 16:05:25 +0000 (09:05 -0700)]
Merge tag 'sound-6.0-rc4' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Just handful changes at this time. The only major change is the
  regression fix about the x86 WC-page buffer allocation.

  The rest are trivial data-race fixes for ALSA sequencer core, the
  possible out-of-bounds access fixes in the new ALSA control hash code,
  and a few device-specific workarounds and fixes"

* tag 'sound-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: Add quirk for LH Labs Geek Out HD Audio 1V5
  ALSA: hda/realtek: Add speaker AMP init for Samsung laptops with ALC298
  ALSA: control: Re-order bounds checking in get_ctl_id_hash()
  ALSA: control: Fix an out-of-bounds bug in get_ctl_id_hash()
  ALSA: hda: intel-nhlt: Correct the handling of fmt_config flexible array
  ALSA: seq: Fix data-race at module auto-loading
  ALSA: seq: oss: Fix data-race for max_midi_devs access
  ALSA: memalloc: Revive x86-specific WC page allocations again

22 months agonet: sched: gred: remove NULL check before free table->tab in gred_destroy()
Zhengchao Shao [Wed, 31 Aug 2022 04:14:52 +0000 (12:14 +0800)]
net: sched: gred: remove NULL check before free table->tab in gred_destroy()

The kfree invoked by gred_destroy_vq checks whether the input parameter
is empty. Therefore, gred_destroy() doesn't need to check table->tab.

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20220831041452.33026-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agomm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex...
Waiman Long [Fri, 12 Aug 2022 18:30:33 +0000 (14:30 -0400)]
mm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex/cpu_hotplug_lock

A circular locking problem is reported by lockdep due to the following
circular locking dependency.

  +--> cpu_hotplug_lock --> slab_mutex --> kn->active --+
  |                                                     |
  +-----------------------------------------------------+

The forward cpu_hotplug_lock ==> slab_mutex ==> kn->active dependency
happens in

  kmem_cache_destroy(): cpus_read_lock(); mutex_lock(&slab_mutex);
  ==> sysfs_slab_unlink()
      ==> kobject_del()
          ==> kernfs_remove()
      ==> __kernfs_remove()
          ==> kernfs_drain(): rwsem_acquire(&kn->dep_map, ...);

The backward kn->active ==> cpu_hotplug_lock dependency happens in

  kernfs_fop_write_iter(): kernfs_get_active();
  ==> slab_attr_store()
      ==> cpu_partial_store()
          ==> flush_all(): cpus_read_lock()

One way to break this circular locking chain is to avoid holding
cpu_hotplug_lock and slab_mutex while deleting the kobject in
sysfs_slab_unlink() which should be equivalent to doing a write_lock
and write_unlock pair of the kn->active virtual lock.

Since the kobject structures are not protected by slab_mutex or the
cpu_hotplug_lock, we can certainly release those locks before doing
the delete operation.

Move sysfs_slab_unlink() and sysfs_slab_release() to the newly
created kmem_cache_release() and call it outside the slab_mutex &
cpu_hotplug_lock critical sections. There will be a slight delay
in the deletion of sysfs files if kmem_cache_release() is called
indirectly from a work function.

Fixes: 5a836bf6b09f ("mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context")
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: David Rientjes <rientjes@google.com>
Link: https://lore.kernel.org/all/YwOImVd+nRUsSAga@hyeyoo/
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
22 months agoMerge branch 'rk3588-ethernet-support'
Paolo Abeni [Thu, 1 Sep 2022 09:53:17 +0000 (11:53 +0200)]
Merge branch 'rk3588-ethernet-support'

Sebastian Reichel says:

====================
RK3588 Ethernet Support

This adds ethernet support for the RK3588(s) SoCs.

Changes since PATCHv2:
 * Rebased to v6.0-rc1
 * Wrap DT bindings additions at 80 characters
 * Add Acked-by from Krzysztof

Changes since PATCHv1:
 * Drop first patch (not required) and rebase second patch accordingly
 * Rename DT property rockchip,php_grf -> rockchip,php-grf
====================

Link: https://lore.kernel.org/r/20220830154559.61506-1-sebastian.reichel@collabora.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agodt-bindings: net: rockchip-dwmac: add rk3588 gmac compatible
Sebastian Reichel [Tue, 30 Aug 2022 15:45:59 +0000 (17:45 +0200)]
dt-bindings: net: rockchip-dwmac: add rk3588 gmac compatible

Add compatible string for RK3588 gmac, which is similar to the RK3568
one, but needs another syscon device for clock selection.

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agonet: ethernet: stmmac: dwmac-rk: Add gmac support for rk3588
David Wu [Tue, 30 Aug 2022 15:45:58 +0000 (17:45 +0200)]
net: ethernet: stmmac: dwmac-rk: Add gmac support for rk3588

Add constants and callback functions for the dwmac on RK3588 soc.
As can be seen, the base structure is the same, only registers
and the bits in them moved slightly.

Signed-off-by: David Wu <david.wu@rock-chips.com>
[rebase, squash fixes]
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agonet/smc: Remove redundant refcount increase
Yacan Liu [Tue, 30 Aug 2022 15:23:14 +0000 (23:23 +0800)]
net/smc: Remove redundant refcount increase

For passive connections, the refcount increment has been done in
smc_clcsock_accept()-->smc_sock_alloc().

Fixes: 3b2dec2603d5 ("net/smc: restructure client and server code in af_smc")
Signed-off-by: Yacan Liu <liuyacan@corp.netease.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220830152314.838736-1-liuyacan@corp.netease.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agoocteontx2-pf: Add egress PFC support
Suman Ghosh [Tue, 30 Aug 2022 12:03:04 +0000 (17:33 +0530)]
octeontx2-pf: Add egress PFC support

As of now all transmit queues transmit packets out of same scheduler
queue hierarchy. Due to this PFC frames sent by peer are not handled
properly, either all transmit queues are backpressured or none.
To fix this when user enables PFC for a given priority map relavant
transmit queue to a different scheduler queue hierarcy, so that
backpressure is applied only to the traffic egressing out of that TXQ.

Signed-off-by: Suman Ghosh <sumang@marvell.com>
Link: https://lore.kernel.org/r/20220830120304.158060-1-sumang@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agonet: sched: remove redundant NULL check in change hook function
Zhengchao Shao [Mon, 29 Aug 2022 07:12:19 +0000 (15:12 +0800)]
net: sched: remove redundant NULL check in change hook function

Currently, the change function can be called by two ways. The one way is
that qdisc_change() will call it. Before calling change function,
qdisc_change() ensures tca[TCA_OPTIONS] is not empty. The other way is
that .init() will call it. The opt parameter is also checked before
calling change function in .init(). Therefore, it's no need to check the
input parameter opt in change function.

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://lore.kernel.org/r/20220829071219.208646-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
22 months agoRevert "sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb"
Jakub Kicinski [Thu, 1 Sep 2022 03:01:32 +0000 (20:01 -0700)]
Revert "sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb"

This reverts commit 90fabae8a2c225c4e4936723c38857887edde5cc.

Patch was applied hastily, revert and let the v2 be reviewed.

Fixes: 90fabae8a2c2 ("sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb")
Link: https://lore.kernel.org/all/87wnao2ha3.fsf@toke.dk/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'tcp-tcp-challenge-ack-fixes'
Jakub Kicinski [Thu, 1 Sep 2022 02:56:53 +0000 (19:56 -0700)]
Merge branch 'tcp-tcp-challenge-ack-fixes'

Eric Dumazet says:

====================
tcp: tcp challenge ack fixes

syzbot found a typical data-race addressed in the first patch.

While we are at it, second patch makes the global rate limit
per net-ns and disabled by default.
====================

Link: https://lore.kernel.org/r/20220830185656.268523-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotcp: make global challenge ack rate limitation per net-ns and default disabled
Eric Dumazet [Tue, 30 Aug 2022 18:56:56 +0000 (11:56 -0700)]
tcp: make global challenge ack rate limitation per net-ns and default disabled

Because per host rate limiting has been proven problematic (side channel
attacks can be based on it), per host rate limiting of challenge acks ideally
should be per netns and turned off by default.

This is a long due followup of following commits:

083ae308280d ("tcp: enable per-socket rate limiting of all 'challenge acks'")
f2b2c582e824 ("tcp: mitigate ACK loops for connections as tcp_sock")
75ff39ccc1bd ("tcp: make challenge acks less predictable")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agotcp: annotate data-race around challenge_timestamp
Eric Dumazet [Tue, 30 Aug 2022 18:56:55 +0000 (11:56 -0700)]
tcp: annotate data-race around challenge_timestamp

challenge_timestamp can be read an written by concurrent threads.

This was expected, but we need to annotate the race to avoid potential issues.

Following patch moves challenge_timestamp and challenge_count
to per-netns storage to provide better isolation.

Fixes: 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: dsa: hellcreek: Print warning only once
Kurt Kanzenbach [Tue, 30 Aug 2022 16:34:48 +0000 (18:34 +0200)]
net: dsa: hellcreek: Print warning only once

In case the source port cannot be decoded, print the warning only once. This
still brings attention to the user and does not spam the logs at the same time.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220830163448.8921-1-kurt@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet-next: Fix IP_UNICAST_IF option behavior for connected sockets
Richard Gobert [Mon, 29 Aug 2022 11:18:51 +0000 (13:18 +0200)]
net-next: Fix IP_UNICAST_IF option behavior for connected sockets

The IP_UNICAST_IF socket option is used to set the outgoing interface
for outbound packets.

The IP_UNICAST_IF socket option was added as it was needed by the
Wine project, since no other existing option (SO_BINDTODEVICE socket
option, IP_PKTINFO socket option or the bind function) provided the
needed characteristics needed by the IP_UNICAST_IF socket option. [1]
The IP_UNICAST_IF socket option works well for unconnected sockets,
that is, the interface specified by the IP_UNICAST_IF socket option
is taken into consideration in the route lookup process when a packet
is being sent. However, for connected sockets, the outbound interface
is chosen when connecting the socket, and in the route lookup process
which is done when a packet is being sent, the interface specified by
the IP_UNICAST_IF socket option is being ignored.

This inconsistent behavior was reported and discussed in an issue
opened on systemd's GitHub project [2]. Also, a bug report was
submitted in the kernel's bugzilla [3].

To understand the problem in more detail, we can look at what happens
for UDP packets over IPv4 (The same analysis was done separately in
the referenced systemd issue).
When a UDP packet is sent the udp_sendmsg function gets called and
the following happens:

1. The oif member of the struct ipcm_cookie ipc (which stores the
output interface of the packet) is initialized by the ipcm_init_sk
function to inet->sk.sk_bound_dev_if (the device set by the
SO_BINDTODEVICE socket option).

2. If the IP_PKTINFO socket option was set, the oif member gets
overridden by the call to the ip_cmsg_send function.

3. If no output interface was selected yet, the interface specified
by the IP_UNICAST_IF socket option is used.

4. If the socket is connected and no destination address is
specified in the send function, the struct ipcm_cookie ipc is not
taken into consideration and the cached route, that was calculated in
the connect function is being used.

Thus, for a connected socket, the IP_UNICAST_IF sockopt isn't taken
into consideration.

This patch corrects the behavior of the IP_UNICAST_IF socket option
for connect()ed sockets by taking into consideration the
IP_UNICAST_IF sockopt when connecting the socket.

In order to avoid reconnecting the socket, this option is still
ignored when applied on an already connected socket until connect()
is called again by the Richard Gobert.

Change the __ip4_datagram_connect function, which is called during
socket connection, to take into consideration the interface set by
the IP_UNICAST_IF socket option, in a similar way to what is done in
the udp_sendmsg function.

[1] https://lore.kernel.org/netdev/1328685717.4736.4.camel@edumazet-laptop/T/
[2] https://github.com/systemd/systemd/issues/11935#issuecomment-618691018
[3] https://bugzilla.kernel.org/show_bug.cgi?id=210255

Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220829111554.GA1771@debian
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoip: fix triggering of 'icmp redirect'
Nicolas Dichtel [Mon, 29 Aug 2022 10:01:21 +0000 (12:01 +0200)]
ip: fix triggering of 'icmp redirect'

__mkroute_input() uses fib_validate_source() to trigger an icmp redirect.
My understanding is that fib_validate_source() is used to know if the src
address and the gateway address are on the same link. For that,
fib_validate_source() returns 1 (same link) or 0 (not the same network).
__mkroute_input() is the only user of these positive values, all other
callers only look if the returned value is negative.

Since the below patch, fib_validate_source() didn't return anymore 1 when
both addresses are on the same network, because the route lookup returns
RT_SCOPE_LINK instead of RT_SCOPE_HOST. But this is, in fact, right.
Let's adapat the test to return 1 again when both addresses are on the same
link.

CC: stable@vger.kernel.org
Fixes: 747c14307214 ("ip: fix dflt addr selection for connected nexthop")
Reported-by: kernel test robot <yujie.liu@intel.com>
Reported-by: Heng Qi <hengqi@linux.alibaba.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220829100121.3821-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'net-sched-remove-unused-variables'
Jakub Kicinski [Thu, 1 Sep 2022 02:39:56 +0000 (19:39 -0700)]
Merge branch 'net-sched-remove-unused-variables'

Zhengchao Shao says:

====================
net: sched: remove unused variables

The variable "other" is unused, remove it.
====================

Link: https://lore.kernel.org/r/20220830092255.281330-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: sched: gred/red: remove unused variables in struct red_stats
Zhengchao Shao [Tue, 30 Aug 2022 09:22:55 +0000 (17:22 +0800)]
net: sched: gred/red: remove unused variables in struct red_stats

The variable "other" in the struct red_stats is not used. Remove it.

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: sched: choke: remove unused variables in struct choke_sched_data
Zhengchao Shao [Tue, 30 Aug 2022 09:22:54 +0000 (17:22 +0800)]
net: sched: choke: remove unused variables in struct choke_sched_data

The variable "other" in the struct choke_sched_data is not used. Remove it.

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: axienet: Switch to 64-bit RX/TX statistics
Robert Hancock [Mon, 29 Aug 2022 23:39:01 +0000 (17:39 -0600)]
net: axienet: Switch to 64-bit RX/TX statistics

The RX and TX byte/packet statistics in this driver could be overflowed
relatively quickly on a 32-bit platform. Switch these stats to use the
u64_stats infrastructure to avoid this.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Link: https://lore.kernel.org/r/20220829233901.3429419-1-robert.hancock@calian.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet/rds: Pass a pointer to virt_to_page()
Linus Walleij [Mon, 29 Aug 2022 13:20:01 +0000 (15:20 +0200)]
net/rds: Pass a pointer to virt_to_page()

Functions that work on a pointer to virtual memory such as
virt_to_pfn() and users of that function such as
virt_to_page() are supposed to pass a pointer to virtual
memory, ideally a (void *) or other pointer. However since
many architectures implement virt_to_pfn() as a macro,
this function becomes polymorphic and accepts both a
(unsigned long) and a (void *).

If we instead implement a proper virt_to_pfn(void *addr)
function the following happens (occurred on arch/arm):

net/rds/message.c:357:56: warning: passing argument 1
  of 'virt_to_pfn' makes pointer from integer without a
  cast [-Wint-conversion]

Fix this with an explicit cast.

Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: rds-devel@oss.oracle.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220829132001.114858-1-linus.walleij@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomm/rmap: Fix anon_vma->degree ambiguity leading to double-reuse
Jann Horn [Wed, 31 Aug 2022 17:06:00 +0000 (19:06 +0200)]
mm/rmap: Fix anon_vma->degree ambiguity leading to double-reuse

anon_vma->degree tracks the combined number of child anon_vmas and VMAs
that use the anon_vma as their ->anon_vma.

anon_vma_clone() then assumes that for any anon_vma attached to
src->anon_vma_chain other than src->anon_vma, it is impossible for it to
be a leaf node of the VMA tree, meaning that for such VMAs ->degree is
elevated by 1 because of a child anon_vma, meaning that if ->degree
equals 1 there are no VMAs that use the anon_vma as their ->anon_vma.

This assumption is wrong because the ->degree optimization leads to leaf
nodes being abandoned on anon_vma_clone() - an existing anon_vma is
reused and no new parent-child relationship is created.  So it is
possible to reuse an anon_vma for one VMA while it is still tied to
another VMA.

This is an issue because is_mergeable_anon_vma() and its callers assume
that if two VMAs have the same ->anon_vma, the list of anon_vmas
attached to the VMAs is guaranteed to be the same.  When this assumption
is violated, vma_merge() can merge pages into a VMA that is not attached
to the corresponding anon_vma, leading to dangling page->mapping
pointers that will be dereferenced during rmap walks.

Fix it by separately tracking the number of child anon_vmas and the
number of VMAs using the anon_vma as their ->anon_vma.

Fixes: 7a3ef208e662 ("mm: prevent endless growth of anon_vma hierarchy")
Cc: stable@kernel.org
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
22 months agor8152: allow userland to disable multicast
Sven van Ashbrook [Tue, 30 Aug 2022 04:59:39 +0000 (04:59 +0000)]
r8152: allow userland to disable multicast

The rtl8152 driver does not disable multicasting when userspace asks
it to. For example:
 $ ifconfig eth0 -multicast -allmulti
 $ tcpdump -p -i eth0  # will still capture multicast frames

Fix by clearing the device multicast filter table when multicast and
allmulti are both unset.

Tested as follows:
- Set multicast on eth0 network interface
- verify that multicast packets are coming in:
  $ tcpdump -p -i eth0
- Clear multicast and allmulti on eth0 network interface
- verify that no more multicast packets are coming in:
  $ tcpdump -p -i eth0

Signed-off-by: Sven van Ashbrook <svenva@chromium.org>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20220830045923.net-next.v1.1.I4fee0ac057083d4f848caf0fa3a9fd466fc374a0@changeid
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agosch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb
Toke Høiland-Jørgensen [Wed, 31 Aug 2022 09:21:03 +0000 (11:21 +0200)]
sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb

When the GSO splitting feature of sch_cake is enabled, GSO superpackets
will be broken up and the resulting segments enqueued in place of the
original skb. In this case, CAKE calls consume_skb() on the original skb,
but still returns NET_XMIT_SUCCESS. This can confuse parent qdiscs into
assuming the original skb still exists, when it really has been freed. Fix
this by adding the __NET_XMIT_STOLEN flag to the return value in this case.

Fixes: 0c850344d388 ("sch_cake: Conditionally split GSO segments")
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18231
Link: https://lore.kernel.org/r/20220831092103.442868-1-toke@toke.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: ethernet: move from strlcpy with unused retval to strscpy
Wolfram Sang [Tue, 30 Aug 2022 20:14:54 +0000 (22:14 +0200)]
net: ethernet: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Petr Machata <petrm@nvidia.com> # For drivers/net/ethernet/mellanox/mlxsw
Acked-by: Geoff Levand <geoff@infradead.org> # For ps3_gelic_net and spider_net_ethtool
Acked-by: Tom Lendacky <thomas.lendacky@amd.com> # For drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c
Acked-by: Marcin Wojtas <mw@semihalf.com> # For drivers/net/ethernet/marvell/mvpp2
Reviewed-by: Leon Romanovsky <leonro@nvidia.com> # For drivers/net/ethernet/mellanox/mlx{4|5}
Reviewed-by: Shay Agroskin <shayagr@amazon.com> # For drivers/net/ethernet/amazon/ena
Acked-by: Krzysztof Hałasa <khalasa@piap.pl> # For IXP4xx Ethernet
Link: https://lore.kernel.org/r/20220830201457.7984-3-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: move from strlcpy with unused retval to strscpy
Wolfram Sang [Tue, 30 Aug 2022 20:14:52 +0000 (22:14 +0200)]
net: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN
Link: https://lore.kernel.org/r/20220830201457.7984-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoselftests: net: sort .gitignore file
Axel Rasmussen [Mon, 29 Aug 2022 18:47:48 +0000 (11:47 -0700)]
selftests: net: sort .gitignore file

This is the result of `sort tools/testing/selftests/net/.gitignore`, but
preserving the comment at the top.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Link: https://lore.kernel.org/r/20220829184748.1535580-1-axelrasmussen@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoDocumentation: networking: correct possessive "its"
Randy Dunlap [Mon, 29 Aug 2022 23:54:14 +0000 (16:54 -0700)]
Documentation: networking: correct possessive "its"

Change occurrences of "it's" that are possessive to "its"
so that they don't read as "it is".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20220829235414.17110-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: phy: smsc: use device-managed clock API
Heiner Kallweit [Sun, 28 Aug 2022 17:26:55 +0000 (19:26 +0200)]
net: phy: smsc: use device-managed clock API

Simplify the code by using the device-managed clock API.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/b222be68-ba7e-999d-0a07-eca0ecedf74e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agokcm: fix strp_init() order and cleanup
Cong Wang [Sat, 27 Aug 2022 18:13:14 +0000 (11:13 -0700)]
kcm: fix strp_init() order and cleanup

strp_init() is called just a few lines above this csk->sk_user_data
check, it also initializes strp->work etc., therefore, it is
unnecessary to call strp_done() to cancel the freshly initialized
work.

And if sk_user_data is already used by KCM, psock->strp should not be
touched, particularly strp->work state, so we need to move strp_init()
after the csk->sk_user_data check.

This also makes a lockdep warning reported by syzbot go away.

Reported-and-tested-by: syzbot+9fc084a4348493ef65d2@syzkaller.appspotmail.com
Reported-by: syzbot+e696806ef96cdd2d87cd@syzkaller.appspotmail.com
Fixes: e5571240236c ("kcm: Check if sk_user_data already set in kcm_attach")
Fixes: dff8baa26117 ("kcm: Call strp_stop before strp_done in kcm_attach")
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Link: https://lore.kernel.org/r/20220827181314.193710-1-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxbf_gige: compute MDIO period based on i1clk
David Thompson [Fri, 26 Aug 2022 15:59:16 +0000 (11:59 -0400)]
mlxbf_gige: compute MDIO period based on i1clk

This patch adds logic to compute the MDIO period based on
the i1clk, and thereafter write the MDIO period into the YU
MDIO config register. The i1clk resource from the ACPI table
is used to provide addressing to YU bootrecord PLL registers.
The values in these registers are used to compute MDIO period.
If the i1clk resource is not present in the ACPI table, then
the current default hardcorded value of 430Mhz is used.
The i1clk clock value of 430MHz is only accurate for boards
with BF2 mid bin and main bin SoCs. The BF2 high bin SoCs
have i1clk = 500MHz, but can support a slower MDIO period.

Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/20220826155916.12491-1-davthompson@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge tag 'fscache-fixes-20220831' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 31 Aug 2022 17:13:34 +0000 (10:13 -0700)]
Merge tag 'fscache-fixes-20220831' of git://git./linux/kernel/git/dhowells/linux-fs

Pull fscache/cachefiles fixes from David Howells:

 - Fix kdoc on fscache_use/unuse_cookie().

 - Fix the error returned by cachefiles_ondemand_copen() from an upcall
   result.

 - Fix the distribution of requests in on-demand mode in cachefiles to
   be fairer by cycling through them rather than picking the one with
   the lowest ID each time (IDs being reused).

* tag 'fscache-fixes-20220831' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  cachefiles: make on-demand request distribution fairer
  cachefiles: fix error return code in cachefiles_ondemand_copen()
  fscache: fix misdocumented parameter

22 months agoMerge tag 'for-linus-2022083101' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 31 Aug 2022 16:54:14 +0000 (09:54 -0700)]
Merge tag 'for-linus-2022083101' of git://git./linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

 - NULL pointer dereference fix for Steam driver (Lee Jones)

 - memory leak fix for hidraw (Karthik Alapati)

 - regression fix for functionality of some UCLogic tables (Benjamin
   Tissoires)

 - a few new device IDs and device-specific quirks

* tag 'for-linus-2022083101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: nintendo: fix rumble worker null pointer deref
  HID: intel-ish-hid: ipc: Add Meteor Lake PCI device ID
  HID: input: fix uclogic tablets
  HID: Add Apple Touchbar on T2 Macs in hid_have_special_driver list
  HID: add Lenovo Yoga C630 battery quirk
  HID: AMD_SFH: Add a DMI quirk entry for Chromebooks
  HID: thrustmaster: Add sparco wheel and fix array length
  hid: intel-ish-hid: ishtp: Fix ishtp client sending disordered message
  HID: ishtp-hid-clientHID: ishtp-hid-client: Fix comment typo
  HID: asus: ROG NKey: Ignore portion of 0x5a report
  HID: hidraw: fix memory leak in hidraw_release()
  HID: steam: Prevent NULL pointer dereference in steam_{recv,send}_report

22 months agoMerge tag 'v6.0-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Wed, 31 Aug 2022 16:47:06 +0000 (09:47 -0700)]
Merge tag 'v6.0-p2' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
 "Fix a boot performance regression due to an unnecessary dependency on
  XOR_BLOCKS"

* tag 'v6.0-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: lib - remove unneeded selection of XOR_BLOCKS

22 months agoMerge tag 'lsm-pr-20220829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Linus Torvalds [Wed, 31 Aug 2022 16:23:16 +0000 (09:23 -0700)]
Merge tag 'lsm-pr-20220829' of git://git./linux/kernel/git/pcmoore/lsm

Pull LSM support for IORING_OP_URING_CMD from Paul Moore:
 "Add SELinux and Smack controls to the io_uring IORING_OP_URING_CMD.

  These are necessary as without them the IORING_OP_URING_CMD remains
  outside the purview of the LSMs (Luis' LSM patch, Casey's Smack patch,
  and my SELinux patch). They have been discussed at length with the
  io_uring folks, and Jens has given his thumbs-up on the relevant
  patches (see the commit descriptions).

  There is one patch that is not strictly necessary, but it makes
  testing much easier and is very trivial: the /dev/null
  IORING_OP_URING_CMD patch."

* tag 'lsm-pr-20220829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  Smack: Provide read control for io_uring_cmd
  /dev/null: add IORING_OP_URING_CMD support
  selinux: implement the security_uring_cmd() LSM hook
  lsm,io_uring: add LSM hooks for the new uring_cmd file op

22 months agocachefiles: make on-demand request distribution fairer
Xin Yin [Thu, 25 Aug 2022 02:09:45 +0000 (10:09 +0800)]
cachefiles: make on-demand request distribution fairer

For now, enqueuing and dequeuing on-demand requests all start from
idx 0, this makes request distribution unfair. In the weighty
concurrent I/O scenario, the request stored in higher idx will starve.

Searching requests cyclically in cachefiles_ondemand_daemon_read,
makes distribution fairer.

Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
Reported-by: Yongqing Li <liyongqing@bytedance.com>
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220817065200.11543-1-yinxin.x@bytedance.com/
Link: https://lore.kernel.org/r/20220825020945.2293-1-yinxin.x@bytedance.com/
22 months agocachefiles: fix error return code in cachefiles_ondemand_copen()
Sun Ke [Fri, 26 Aug 2022 02:35:15 +0000 (10:35 +0800)]
cachefiles: fix error return code in cachefiles_ondemand_copen()

The cache_size field of copen is specified by the user daemon.
If cache_size < 0, then the OPEN request is expected to fail,
while copen itself shall succeed. However, returning 0 is indeed
unexpected when cache_size is an invalid error code.

Fix this by returning error when cache_size is an invalid error code.

Changes
=======
v4: update the code suggested by Dan
v3: update the commit log suggested by Jingbo.

Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
Signed-off-by: Sun Ke <sunke32@huawei.com>
Suggested-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220818111935.1683062-1-sunke32@huawei.com/
Link: https://lore.kernel.org/r/20220818125038.2247720-1-sunke32@huawei.com/
Link: https://lore.kernel.org/r/20220826023515.3437469-1-sunke32@huawei.com/
22 months agofscache: fix misdocumented parameter
Khalid Masum [Thu, 18 Aug 2022 04:07:38 +0000 (10:07 +0600)]
fscache: fix misdocumented parameter

This patch fixes two warnings generated by make docs. The functions
fscache_use_cookie and fscache_unuse_cookie, both have a parameter
named cookie. But they are documented with the name "object" with
unclear description. Which generates the warning when creating docs.

This commit will replace the currently misdocumented parameter names
with the correct ones while adding proper descriptions.

CC: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20220521142446.4746-1-khalid.masum.92@gmail.com/
Link: https://lore.kernel.org/r/20220818040738.12036-1-khalid.masum.92@gmail.com/
Link: https://lore.kernel.org/r/880d7d25753fb326ee17ac08005952112fcf9bdb.1657360984.git.mchehab@kernel.org/
22 months agoMerge branch 'hns3-next'
David S. Miller [Wed, 31 Aug 2022 13:10:50 +0000 (14:10 +0100)]
Merge branch 'hns3-next'

Guangbin Huang says:

====================
net: hns3: updates for -next

This series includes some updates for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: hns3: net: hns3: add querying and setting fec off mode from firmware
Guangbin Huang [Tue, 30 Aug 2022 11:11:17 +0000 (19:11 +0800)]
net: hns3: net: hns3: add querying and setting fec off mode from firmware

For some new devices, the FEC mode can not be set to OFF in speed 200G.
In order to flexibly adapt to all types of devices, driver queries
fec ability from firmware to decide whether OFF mode can be supported.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: hns3: add querying and setting fec llrs mode from firmware
Hao Lan [Tue, 30 Aug 2022 11:11:16 +0000 (19:11 +0800)]
net: hns3: add querying and setting fec llrs mode from firmware

This patch supports llrs fec mode in speed 200G for some new devices, and
suppoprts querying llrs fec ability from firmware.

Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: hns3: add querying fec ability from firmware
Guangbin Huang [Tue, 30 Aug 2022 11:11:15 +0000 (19:11 +0800)]
net: hns3: add querying fec ability from firmware

For some new devices, driver can queries fec ability from firmware to
decide which FEC mode can be supported.

If devices of old version which not support querying fec ability, driver
sets fixed ability according to current speed.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: hns3: add getting capabilities of gro offload and fd from firmware
Guangbin Huang [Tue, 30 Aug 2022 11:11:14 +0000 (19:11 +0800)]
net: hns3: add getting capabilities of gro offload and fd from firmware

As some new devices may not support GRO offload and flow table director,
to support these devices, driver needs to querying capabilities of GRO
offload and flow table director from firmware. Whether the driver
supports these two features depends on capabilities.

For old device of version HNAE3_DEVICE_VERSION_V2, driver sets their
capabilities of these two features to fixed value.

Setting default features of netdev and debugfs also need to identify
whether support these two features.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agoMerge branch 'thunderbolt-end-to-end-flow-control'
David S. Miller [Wed, 31 Aug 2022 13:05:12 +0000 (14:05 +0100)]
Merge branch 'thunderbolt-end-to-end-flow-control'

Mika Westerberg says:

====================
thunderbolt: net: Enable full end-to-end flow control

Thunderbolt/USB4 host controllers support full end-to-end flow control
that prevents dropping packets if there are not enough hardware receive
buffers. So far it has not been enabled for the networking driver yet
but this series changes that. There is one snag though: the second
generation (Intel Falcon Ridge) had a bug that needs special quirk to
get it working. We had that in the early stages of the Thunderbolt/USB4
driver but it got dropped because it was not needed at the time. Now we
add it back as a quirk for the host controller (NHI).

The first patch of this series is a bugfix that I'm planning to push for
v6.0-rc. Rest are v6.1 material. This also includes a patch that shows
the XDomain link type in sysfs the same way we do for USB4 routers and
updates the networking driver module description.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: thunderbolt: Update module description with mention of USB4
Mika Westerberg [Tue, 30 Aug 2022 15:32:50 +0000 (18:32 +0300)]
net: thunderbolt: Update module description with mention of USB4

It is Thunderbolt/USB4 now so reflect that in the module description too
to avoid any confusion. No functional changes.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: thunderbolt: Enable full end-to-end flow control
Mika Westerberg [Tue, 30 Aug 2022 15:32:49 +0000 (18:32 +0300)]
net: thunderbolt: Enable full end-to-end flow control

USB4NET protocol allows the networking drivers to take advantage of
end-to-end flow control supported by the USB4 host interface. This
should prevent the receiving side from dropping network packets.

In adddition add a module parameter that can be used to turn this off
just in case it causes problems.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agothunderbolt: Add back Intel Falcon Ridge end-to-end flow control workaround
Mika Westerberg [Tue, 30 Aug 2022 15:32:48 +0000 (18:32 +0300)]
thunderbolt: Add back Intel Falcon Ridge end-to-end flow control workaround

As we are now enabling full end-to-end flow control to the Thunderbolt
networking driver, in order for it to work properly on second generation
Thunderbolt hardware (Falcon Ridge), we need to add back the workaround
that was removed with commit 53f13319d131 ("thunderbolt: Get rid of E2E
workaround"). However, this time we only apply it for Falcon Ridge
controllers as a form of an additional quirk. For non-Falcon Ridge this
does nothing.

While there fix a typo 'reqister' -> 'register' in the comment.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agothunderbolt: Show link type for XDomain connections too
Mika Westerberg [Tue, 30 Aug 2022 15:32:47 +0000 (18:32 +0300)]
thunderbolt: Show link type for XDomain connections too

Following what we do for routers already, extend this to XDomain
connections as well. This will show in sysfs whether the link is in USB4
or Thunderbolt mode.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: thunderbolt: Enable DMA paths only after rings are enabled
Mika Westerberg [Tue, 30 Aug 2022 15:32:46 +0000 (18:32 +0300)]
net: thunderbolt: Enable DMA paths only after rings are enabled

If the other host starts sending packets early on it is possible that we
are still in the middle of populating the initial Rx ring packets to the
ring. This causes the tbnet_poll() to mess over the queue and causes
list corruption. This happens specifically when connected with macOS as
it seems start sending various IP discovery packets as soon as its side
of the paths are configured.

To prevent this we move the DMA path enabling to happen after we have
primed the Rx ring. This makes sure no incoming packets can arrive
before we are ready to handle them.

Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable")
Cc: stable@vger.kernel.org
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agoethernet: rocker: fix sleep in atomic context bug in neigh_timer_handler
Duoming Zhou [Sat, 27 Aug 2022 15:38:15 +0000 (23:38 +0800)]
ethernet: rocker: fix sleep in atomic context bug in neigh_timer_handler

The function neigh_timer_handler() is a timer handler that runs in an
atomic context. When used by rocker, neigh_timer_handler() calls
"kzalloc(.., GFP_KERNEL)" that may sleep. As a result, the sleep in
atomic context bug will happen. One of the processes is shown below:

ofdpa_fib4_add()
 ...
 neigh_add_timer()

(wait a timer)

neigh_timer_handler()
 neigh_release()
  neigh_destroy()
   rocker_port_neigh_destroy()
    rocker_world_port_neigh_destroy()
     ofdpa_port_neigh_destroy()
      ofdpa_port_ipv4_neigh()
       kzalloc(sizeof(.., GFP_KERNEL) //may sleep

This patch changes the gfp_t parameter of kzalloc() from GFP_KERNEL to
GFP_ATOMIC in order to mitigate the bug.

Fixes: 00fc0c51e35b ("rocker: Change world_ops API and implementation to be switchdev independant")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agophy: lan966x: add support for QUSGMII
Maxime Chevallier [Fri, 26 Aug 2022 14:17:22 +0000 (16:17 +0200)]
phy: lan966x: add support for QUSGMII

Makes so that the serdes driver also takes QUSGMII in consideration.
It's configured exactly as QSGMII as far as the serdes driver is
concerned.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agoMerge branch 'net-dsa-microchip-error-hndling-reg-access-validation'
David S. Miller [Wed, 31 Aug 2022 08:41:49 +0000 (09:41 +0100)]
Merge branch 'net-dsa-microchip-error-hndling-reg-access-validation'

Oleksij Rempel says:

====================
net: dsa: microchip: add error handling and register access validation

changes v4:
- add Reviewed-by: Vladimir Oltean <olteanv@gmail.com> to all patches
- fix checkpatch warnings.

changes v3:
- fix build error in the middle of the patch stack.

changes v2:
- add regmap_ranges for KSZ9477
- drop output clock devicetree in driver validation patches. DTs need
  some more refactoring and can be done in a separate patch set.
- remove some unused variables.

This patch series adds error handling for the PHY read/write path and optional
register access validation.
After adding regmap_ranges for KSZ8563 some bugs was detected, so
critical bug fixes are sorted before ragmap_range patch.

Potentially this bug fixes can be ported to stable kernels, but need to be
reworked.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: remove IS_9893 flag
Oleksij Rempel [Fri, 26 Aug 2022 10:56:34 +0000 (12:56 +0200)]
net: dsa: microchip: remove IS_9893 flag

Use chip_id as other places of this code do it

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: remove unused sgmii variable
Oleksij Rempel [Fri, 26 Aug 2022 10:56:33 +0000 (12:56 +0200)]
net: dsa: microchip: remove unused sgmii variable

This variable is not used. So, remove it.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: ksz9477: remove unused "on" variable
Oleksij Rempel [Fri, 26 Aug 2022 10:56:32 +0000 (12:56 +0200)]
net: dsa: microchip: ksz9477: remove unused "on" variable

This variable is not used on ksz9477 side. Remove it.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: remove unused port phy variable
Oleksij Rempel [Fri, 26 Aug 2022 10:56:31 +0000 (12:56 +0200)]
net: dsa: microchip: remove unused port phy variable

This variable is unused. So, drop it.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: ksz9477: use internal_phy instead of phy_port_cnt
Oleksij Rempel [Fri, 26 Aug 2022 10:56:30 +0000 (12:56 +0200)]
net: dsa: microchip: ksz9477: use internal_phy instead of phy_port_cnt

With code refactoring was introduced new variable internal_phy. Let's
use it.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: add regmap_range for KSZ9477 chip
Oleksij Rempel [Fri, 26 Aug 2022 10:56:29 +0000 (12:56 +0200)]
net: dsa: microchip: add regmap_range for KSZ9477 chip

Add register validation for KSZ9477

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: ksz9477: remove MII_CTRL1000 check from ksz9477_w_phy()
Oleksij Rempel [Fri, 26 Aug 2022 10:56:28 +0000 (12:56 +0200)]
net: dsa: microchip: ksz9477: remove MII_CTRL1000 check from ksz9477_w_phy()

The reason why PHYlib may access MII_CTRL1000 on the chip without GBit
support is only if chip provides wrong information about extended caps
register. This issue is now handled by ksz9477_r_phy_quirks()

With proper regmap_ranges provided for all chips we will be able to
catch this kind of bugs any way. So, remove this sanity check.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: add regmap_range for KSZ8563 chip
Oleksij Rempel [Fri, 26 Aug 2022 10:56:27 +0000 (12:56 +0200)]
net: dsa: microchip: add regmap_range for KSZ8563 chip

Add register validation for KSZ8563.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: add support for regmap_access_tables
Oleksij Rempel [Fri, 26 Aug 2022 10:56:26 +0000 (12:56 +0200)]
net: dsa: microchip: add support for regmap_access_tables

This is complex driver with support for different chips with different
layouts. To detect at least some bugs earlier, we should validate register
accesses by using regmap_access_table support.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: KSZ9893: do not write to not supported Output Clock Control...
Oleksij Rempel [Fri, 26 Aug 2022 10:56:25 +0000 (12:56 +0200)]
net: dsa: microchip: KSZ9893: do not write to not supported Output Clock Control Register

This issue was detected after adding regmap register access validation.
KSZ9893 compatible chips do not have "Output Clock Control Register
0x0103". So, avoid writing to it.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: ksz8795: add error handling to ksz8_r/w_phy
Oleksij Rempel [Fri, 26 Aug 2022 10:56:24 +0000 (12:56 +0200)]
net: dsa: microchip: ksz8795: add error handling to ksz8_r/w_phy

Now ksz_pread/ksz_pwrite can return error value. So, make use of it.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: ksz9477: add error handling to ksz9477_r/w_phy
Oleksij Rempel [Fri, 26 Aug 2022 10:56:23 +0000 (12:56 +0200)]
net: dsa: microchip: ksz9477: add error handling to ksz9477_r/w_phy

Now ksz_pread/ksz_pwrite can return error value. So, make use of it.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: forward error value on all ksz_pread/ksz_pwrite functions
Oleksij Rempel [Fri, 26 Aug 2022 10:56:22 +0000 (12:56 +0200)]
net: dsa: microchip: forward error value on all ksz_pread/ksz_pwrite functions

ksz_read*/ksz_write* are able to return errors, so forward it.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: allow to pass return values for PHY read/write accesses
Oleksij Rempel [Fri, 26 Aug 2022 10:56:21 +0000 (12:56 +0200)]
net: dsa: microchip: allow to pass return values for PHY read/write accesses

PHY access may end with errors on different levels. So, allow to forward
return values where possible.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: don't announce extended register support on non Gbit chips
Oleksij Rempel [Fri, 26 Aug 2022 10:56:20 +0000 (12:56 +0200)]
net: dsa: microchip: don't announce extended register support on non Gbit chips

This issue was detected after adding support of regmap_ranges for KSZ8563R
chip. This chip is reporting extended registers support without having
actual extended registers. This made PHYlib request not existing
registers.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: do per-port Gbit detection instead of per-chip
Oleksij Rempel [Fri, 26 Aug 2022 10:56:19 +0000 (12:56 +0200)]
net: dsa: microchip: do per-port Gbit detection instead of per-chip

KSZ8563 has two 100Mbit PHYs and CPU port with RGMII support. Since
1000Mbit configuration for the RGMII capable MAC is present, we should
use per port validation.

As main part of migration to per-port validation we need to rework
ksz9477_switch_init() function. Which is using undocumented
REG_GLOBAL_OPTIONS register to detect per-chip Gbit support. So, it is
related to some sort of risk for regressions.

To reduce this risk I compared the code with publicly available
documentations. This function will executed on following currently
supported chips:
struct ksz_chip_data            OF compatible
KSZ9477 KSZ9477
KSZ9897 KSZ9897
KSZ9893 KSZ9893, KSZ9563
KSZ8563 KSZ8563
KSZ9567 KSZ9567

Only KSZ9893, KSZ9563, KSZ8563 document existence of 0xf ==
REG_GLOBAL_OPTIONS register with bit field description "SKU ID":
KSZ9893 0x0C
KSZ9563 0x1C
KSZ8563 0x3C

The existence of hidden flags is not documented.

KSZ9477, KSZ9897, KSZ9567 do not document this register at all.

Only KSZ8563 is documented as non Gbit chip: 100Mbit PHYs and RGMII CPU
port. So, this change should not introduce a regression for
configurations with properly used OF compatibles.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agonet: dsa: microchip: add separate struct ksz_chip_data for KSZ8563 chip
Oleksij Rempel [Fri, 26 Aug 2022 10:56:18 +0000 (12:56 +0200)]
net: dsa: microchip: add separate struct ksz_chip_data for KSZ8563 chip

Add separate entry for the KSZ8563 chip. According to the documentation
it can support Gbit only on RGMII port. So, we will need to be able to
describe in the followup patch.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agocore: Variable type completion
Xin Gao [Wed, 17 Aug 2022 01:35:29 +0000 (09:35 +0800)]
core: Variable type completion

'unsigned int' is better than 'unsigned'.

Signed-off-by: Xin Gao <gaoxin@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22 months agoRevert "net: devlink: add RNLT lock assertion to devlink_compat_switch_id_get()"
Vlad Buslov [Mon, 29 Aug 2022 12:13:24 +0000 (14:13 +0200)]
Revert "net: devlink: add RNLT lock assertion to devlink_compat_switch_id_get()"

This reverts commit 6005a8aecee8afeba826295321a612ab485c230e.

The assertion was intentionally removed in commit 043b8413e8c0 ("net:
devlink: remove redundant rtnl lock assert") and, contrary what is
described in the commit message, the comment reflects that: "Caller must
hold RTNL mutex or reference to dev...".

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Tested-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20220829121324.3980376-1-vladbu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'mlxsw-configure-max-lag-id-for-spectrum-4'
Jakub Kicinski [Wed, 31 Aug 2022 06:20:55 +0000 (23:20 -0700)]
Merge branch 'mlxsw-configure-max-lag-id-for-spectrum-4'

Petr Machata says:

====================
mlxsw: Configure max LAG ID for Spectrum-4

Amit Cohen writes:

In the device, LAG identifiers are stored in the port group table (PGT).
During initialization, firmware reserves a certain amount of entries at
the beginning of this table for LAG identifiers.

In Spectrum-4, the size of the PGT table did not increase, but the
maximum number of LAG identifiers was doubled, leaving less room for
others entries (e.g., flood entries) that also reside in the PGT.

Therefore, in order to avoid a regression and as long as there is no
explicit requirement to support 256 LAGs, configure the firmware to
allocate the same amount of LAG entries (128) as in Spectrum-{2,3}.

This can be done via the 'max_lag' field in CONFIG_PROFILE command.

Patch set overview:
Patch #1 edits the comment of the existing 'max_lag' field.
Patch #2 adds support for configuring 'max_lag' field via CONFIG_PROFILE
command.
Patch #3 adds an helper function to get the actual 'max_lag' in the
device.
Patch #4 adjusts Spectrum-4 to configure 'max_lag' field.
====================

Link: https://lore.kernel.org/r/cover.1661527928.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: spectrum: Add a copy of 'struct mlxsw_config_profile' for Spectrum-4
Amit Cohen [Fri, 26 Aug 2022 16:06:52 +0000 (18:06 +0200)]
mlxsw: spectrum: Add a copy of 'struct mlxsw_config_profile' for Spectrum-4

Starting from Spectrum-4, the maximum number of LAG IDs can be configured
by software via CONFIG_PROFILE command during driver initialization.

Add a dedicated instance of 'struct mlxsw_config_profile' for Spectrum-4
and set the 'max_lag' field to 128, which is the same amount of LAG entries
as in Spectrum-{2,3}. Without this configuration, firmware reserves 256
(the value of 'cap_max_lag' resource) entries at beginning of PGT table for
LAG identifiers, which means that less entries in PGT will be available.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: Add a helper function for getting maximum LAG ID
Amit Cohen [Fri, 26 Aug 2022 16:06:51 +0000 (18:06 +0200)]
mlxsw: Add a helper function for getting maximum LAG ID

Currently the driver queries the maximum supported LAG ID from firmware.
This will not be accurate anymore once the driver will configure 'max_lag'
via CONFIG_PROFILE command.

For resource query, firmware returns the maximum LAG ID which is supported
by hardware. Software can configure firmware to do not allocate entries for
all the supported LAGs, and to limit LAG IDs. In this case, the resource
query will not return the actual maximum LAG ID.

Add a helper function for getting this value. In case that 'max_lag' field
was set during initialization, return the value which was used, otherwise,
query firmware for the maximum supported ID.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: Support configuring 'max_lag' via CONFIG_PROFILE
Amit Cohen [Fri, 26 Aug 2022 16:06:50 +0000 (18:06 +0200)]
mlxsw: Support configuring 'max_lag' via CONFIG_PROFILE

In the device, LAG identifiers are stored in the port group table (PGT).
During initialization, firmware reserves a certain amount of entries at
the beginning of this table for LAG identifiers.

In Spectrum-4, the size of the PGT table did not increase, but the maximum
number of LAG identifiers was doubled, leaving less room for others entries
(e.g., flood entries) that also reside in the PGT.

Therefore, in order to avoid a regression and as long as there is no
explicit requirement to support 256 LAGs, mlxsw driver will configure the
firmware to allocate the same amount of LAG entries (128) as in
Spectrum-{2,3}. This configuration is done using 'max_lag' field in
CONFIG_PROFILE command. Extend 'struct mlxsw_config_profile' to support
'max_lag' field and configure firmware accordingly.

A next patch will adjust Spectrum-4 to configure 'max_lag' field.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: cmd: Edit the comment of 'max_lag' field in CONFIG_PROFILE
Amit Cohen [Fri, 26 Aug 2022 16:06:49 +0000 (18:06 +0200)]
mlxsw: cmd: Edit the comment of 'max_lag' field in CONFIG_PROFILE

Starting from Spectrum-4, the maximum number of LAG IDs can be configured
by software via CONFIG_PROFILE command during driver initialization.

Edit the comment of 'max_lag' field to mention that this field is reserved
in Spectrum-1/2/3 and describe firmware behavior.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: lan966x: improve error handle in lan966x_fdma_rx_get_frame()
Dan Carpenter [Fri, 26 Aug 2022 15:00:30 +0000 (18:00 +0300)]
net: lan966x: improve error handle in lan966x_fdma_rx_get_frame()

Don't just print a warning.  Clean up and return an error as well.

Fixes: c8349639324a ("net: lan966x: Add FDMA functionality")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/YwjgDm/SVd5c1tQU@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoDocumentation: bonding: clarify supported modes for tlb_dynamic_lb
Fernando Fernandez Mancera [Fri, 26 Aug 2022 15:47:38 +0000 (17:47 +0200)]
Documentation: bonding: clarify supported modes for tlb_dynamic_lb

tlb_dynamic_lb bonding option is compatible with balance-tlb and balance-alb
modes. In order to be consistent with other option documentation, it should
mention both modes not only balance-tlb.

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20220826154738.4039-1-ffmancera@riseup.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agomlxsw: minimal: Return -ENOMEM on allocation failure
Dan Carpenter [Fri, 26 Aug 2022 15:03:30 +0000 (18:03 +0300)]
mlxsw: minimal: Return -ENOMEM on allocation failure

These error paths return success but they should return -ENOMEM.

Fixes: 01328e23a476 ("mlxsw: minimal: Extend module to port mapping with slot index")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/YwjgwoJ3M7Kdq9VK@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet/mlx5e: Do not use err uninitialized in mlx5e_rep_add_meta_tunnel_rule()
Nathan Chancellor [Thu, 25 Aug 2022 18:06:07 +0000 (11:06 -0700)]
net/mlx5e: Do not use err uninitialized in mlx5e_rep_add_meta_tunnel_rule()

Clang warns:

  drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:481:6: error: variable 'err' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
          if (IS_ERR(flow_rule)) {
              ^~~~~~~~~~~~~~~~~
  drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:489:9: note: uninitialized use occurs here
          return err;
                ^~~
  drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:481:2: note: remove the 'if' if its condition is always true
          if (IS_ERR(flow_rule)) {
          ^~~~~~~~~~~~~~~~~~~~~~~
  drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:474:9: note: initialize the variable 'err' to silence this warning
          int err;
                ^
                  = 0
  1 error generated.

There is little reason to have the 'goto + error variable' construct in
this function. Get rid of it and just return the PTR_ERR value in the if
statement and 0 at the end.

Fixes: 430e2d5e2a98 ("net/mlx5: E-Switch, Move send to vport meta rule creation")
Link: https://github.com/ClangBuiltLinux/linux/issues/1695
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20220825180607.2707947-1-nathan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonfp: fix the access to management firmware hanging
Gao Xiao [Mon, 29 Aug 2022 10:16:51 +0000 (12:16 +0200)]
nfp: fix the access to management firmware hanging

When running `ethtool -p` with the old management firmware,
the management firmware resource is not correctly released,
which causes firmware related malfunction: all the access
to management firmware hangs.

It releases the management firmware resource when set id
mode operation is not supported.

Fixes: ccb9bc1dfa44 ("nfp: add 'ethtool --identify' support")
Signed-off-by: Gao Xiao <gao.xiao@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20220829101651.633840-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge tag 'ieee802154-for-net-2022-08-29' of git://git.kernel.org/pub/scm/linux/kerne...
Jakub Kicinski [Wed, 31 Aug 2022 06:01:47 +0000 (23:01 -0700)]
Merge tag 'ieee802154-for-net-2022-08-29' of git://git./linux/kernel/git/sschmidt/wpan

Stefan Schmidt says:

====================
ieee802154 for net 2022-08-29

 - repeated word fix from Jilin Yuan.
 - missed return code setting in the cc2520 driver by Li Qiong.
 - fixing a potential race in by defering the workqueue destroy
   in the adf7242 driver by Lin Ma.
 - fixing a long standing problem in the mac802154 rx path to match
   corretcly by Miquel Raynal.

* tag 'ieee802154-for-net-2022-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan:
  ieee802154: cc2520: add rc code in cc2520_tx()
  net: mac802154: Fix a condition in the receive path
  net/ieee802154: fix repeated words in comments
  ieee802154/adf7242: defer destroy_workqueue call
====================

Link: https://lore.kernel.org/r/20220829100308.2802578-1-stefan@datenfreihafen.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agofuneth: remove pointless check of devlink pointer in create/destroy_netdev() flows
Jiri Pirko [Fri, 26 Aug 2022 11:04:11 +0000 (13:04 +0200)]
funeth: remove pointless check of devlink pointer in create/destroy_netdev() flows

Once devlink port is successfully registered, the devlink pointer is not
NULL. Therefore, the check is going to be always true and therefore
pointless. Remove it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Dimitris Michailidis <dmichail@fungible.com>
Link: https://lore.kernel.org/r/20220826110411.1409446-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agonet: phy: micrel: Make the GPIO to be non-exclusive
Horatiu Vultur [Tue, 30 Aug 2022 06:40:55 +0000 (08:40 +0200)]
net: phy: micrel: Make the GPIO to be non-exclusive

The same GPIO line can be shared by multiple phys for the coma mode pin.
If that is the case then, all the other phys that share the same line
will failed to be probed because the access to the gpio line is not
non-exclusive.
Fix this by making access to the gpio line to be nonexclusive using flag
GPIOD_FLAGS_BIT_NONEXCLUSIVE. This allows all the other PHYs to be
probed.

Fixes: 738871b09250ee ("net: phy: micrel: add coma mode GPIO")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20220830064055.2340403-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agoMerge branch 'completely-rework-mediatek-mt7530-binding'
Jakub Kicinski [Wed, 31 Aug 2022 05:47:00 +0000 (22:47 -0700)]
Merge branch 'completely-rework-mediatek-mt7530-binding'

Arınç ÜNAL says:

====================
completely rework mediatek,mt7530 binding

This patch series brings complete rework of the mediatek,mt7530 binding.

The binding is checked with "make dt_binding_check
DT_SCHEMA_FILES=mediatek,mt7530.yaml".

If anyone knows the GIC bit for interrupt for multi-chip module MT7530 in
MT7623AI SoC, let me know. I'll add it to the examples.

If anyone got a Unielec U7623 or another MT7623AI board, please reach out.
====================

Link: https://lore.kernel.org/r/20220825082301.409450-1-arinc.unal@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agodt-bindings: net: dsa: mediatek,mt7530: update binding description
Arınç ÜNAL [Thu, 25 Aug 2022 08:23:01 +0000 (11:23 +0300)]
dt-bindings: net: dsa: mediatek,mt7530: update binding description

Update the description of the binding.

- Describe the switches, which SoCs they are in, or if they are standalone.
- Explain the various ways of configuring MT7530's port 5.
- Remove phy-mode = "rgmii-txid" from description. Same code path is
followed for delayed rgmii and rgmii phy-mode on mtk_eth_soc.c.

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agodt-bindings: net: dsa: mediatek,mt7530: define phy-mode per switch
Arınç ÜNAL [Thu, 25 Aug 2022 08:23:00 +0000 (11:23 +0300)]
dt-bindings: net: dsa: mediatek,mt7530: define phy-mode per switch

Define acceptable phy-mode values for the CPU ports of mt7530 and mt7531
switches. Remove relevant information from the description of the binding.

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agodt-bindings: net: dsa: mediatek,mt7530: update examples
Arınç ÜNAL [Thu, 25 Aug 2022 08:22:59 +0000 (11:22 +0300)]
dt-bindings: net: dsa: mediatek,mt7530: update examples

Update the examples on the binding.

- Add examples which include a wide variation of configurations.
- Make example comments YAML comment instead of DT binding comment.
- Add interrupt controller to the examples. Include header file for
interrupt.
- Change reset line for MT7621 examples.
- Pretty formatting for the examples.
- Change switch reg to 0.
- Change port labels to fit the example, change port 4 label to wan.

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agodt-bindings: net: dsa: mediatek,mt7530: fix reset lines
Arınç ÜNAL [Thu, 25 Aug 2022 08:22:58 +0000 (11:22 +0300)]
dt-bindings: net: dsa: mediatek,mt7530: fix reset lines

- Add description for reset-gpios.
- Invalidate reset-gpios if mediatek,mcm is used. We cannot use multiple
reset lines at the same time.
- Invalidate mediatek,mcm if the compatible device is mediatek,mt7531.
There is no multi-chip module version of mediatek,mt7531.
- Require mediatek,mcm for mediatek,mt7621 as the compatible string is only
used for the multi-chip module version of MT7530.

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
22 months agodt-bindings: net: dsa: mediatek,mt7530: fix description of mediatek,mcm
Arınç ÜNAL [Thu, 25 Aug 2022 08:22:57 +0000 (11:22 +0300)]
dt-bindings: net: dsa: mediatek,mt7530: fix description of mediatek,mcm

Fix the description of mediatek,mcm. mediatek,mcm is not used on MT7623NI.

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>