review.tizen.org Git - platform/kernel/linux-rpi.git/log

netdevsim: Fix memory leak of nsim_dev->fa_cookie

kmemleak reports this issue:

unreferenced object 0xffff8881bac872d0 (size 8):
  comm "sh", pid 58603, jiffies 4481524462 (age 68.065s)
  hex dump (first 8 bytes):
    04 00 00 00 de ad be ef                          ........
  backtrace:
    [<00000000c80b8577>] __kmalloc+0x49/0x150
    [<000000005292b8c6>] nsim_dev_trap_fa_cookie_write+0xc1/0x210 [netdevsim]
    [<0000000093d78e77>] full_proxy_write+0xf3/0x180
    [<000000005a662c16>] vfs_write+0x1c5/0xaf0
    [<000000007aabf84a>] ksys_write+0xed/0x1c0
    [<000000005f1d2e47>] do_syscall_64+0x3b/0x90
    [<000000006001c6ec>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

The issue occurs in the following scenarios:

nsim_dev_trap_fa_cookie_write()
  kmalloc() fa_cookie
  nsim_dev->fa_cookie = fa_cookie
..
nsim_drv_remove()

The fa_cookie allocked in nsim_dev_trap_fa_cookie_write() is not freed. To
fix, add kfree(nsim_dev->fa_cookie) to nsim_drv_remove().

Fixes: d3cbb907ae57 ("netdevsim: add ACL trap reporting cookie as a metadata")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Link: https://lore.kernel.org/r/1668504625-14698-1-git-send-email-wangyufen@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tcp: configurable source port perturb table size

On embedded systems with little memory and no relevant
security concerns, it is beneficial to reduce the size
of the table.

Reducing the size from 2^16 to 2^8 saves 255 KiB
of kernel RAM.

Makes the table size configurable as an expert option.

The size was previously increased from 2^8 to 2^16
in commit 4c2c8f03a5ab ("tcp: increase source port perturb table to
2^16").

Signed-off-by: Gleb Mazovetskiy <glex.spb@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: Serialize access to sk_user_data with sk_callback_lock

sk->sk_user_data has multiple users, which are not compatible with each
other. Writers must synchronize by grabbing the sk->sk_callback_lock.

l2tp currently fails to grab the lock when modifying the underlying tunnel
socket fields. Fix it by adding appropriate locking.

We err on the side of safety and grab the sk_callback_lock also inside the
sk_destruct callback overridden by l2tp, even though there should be no
refs allowing access to the sock at the time when sk_destruct gets called.

v4:
- serialize write to sk_user_data in l2tp sk_destruct

v3:
- switch from sock lock to sk_callback_lock
- document write-protection for sk_user_data

v2:
- update Fixes to point to origin of the bug
- use real names in Reported/Tested-by tags

Cc: Tom Parkin <tparkin@katalix.com>
Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core")
Reported-by: Haowei Yan <g1042620637@gmail.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunderbolt: Fix error handling in tbnet_init()

A problem about insmod thunderbolt-net failed is triggered with following
log given while lsmod does not show thunderbolt_net:

insmod: ERROR: could not insert module thunderbolt-net.ko: File exists

The reason is that tbnet_init() returns tb_register_service_driver()
directly without checking its return value, if tb_register_service_driver()
failed, it returns without removing property directory, resulting the
property directory can never be created later.

tbnet_init()
   tb_register_property_dir() # register property directory
   tb_register_service_driver()
     driver_register()
       bus_add_driver()
         priv = kzalloc(...) # OOM happened
   # return without remove property directory

Fix by remove property directory when tb_register_service_driver() returns
error.

Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'microchip-fixes'

Shang XiaoJing says:

====================
net: microchip: Fix potential null-ptr-deref due to create_singlethread_workqueue()

There are some functions call create_singlethread_workqueue() without
checking ret value, and the NULL workqueue_struct pointer may causes
null-ptr-deref. Will be fixed by this patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: microchip: sparx5: Fix potential null-ptr-deref in sparx_stats_init() and sparx5_start()

sparx_stats_init() calls create_singlethread_workqueue() and not
checked the ret value, which may return NULL. And a null-ptr-deref may
happen:

sparx_stats_init()
    create_singlethread_workqueue() # failed, sparx5->stats_queue is NULL
    queue_delayed_work()
        queue_delayed_work_on()
            __queue_delayed_work()  # warning here, but continue
                __queue_work()      # access wq->flags, null-ptr-deref

Check the ret value and return -ENOMEM if it is NULL. So as
sparx5_start().

Fixes: af4b11022e2d ("net: sparx5: add ethtool configuration and statistics support")
Fixes: b37a1bae742f ("net: sparx5: add mactable support")
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: lan966x: Fix potential null-ptr-deref in lan966x_stats_init()

lan966x_stats_init() calls create_singlethread_workqueue() and not
checked the ret value, which may return NULL. And a null-ptr-deref may
happen:

lan966x_stats_init()
    create_singlethread_workqueue() # failed, lan966x->stats_queue is NULL
    queue_delayed_work()
        queue_delayed_work_on()
            __queue_delayed_work()  # warning here, but continue
                __queue_work()      # access wq->flags, null-ptr-deref

Check the ret value and return -ENOMEM if it is NULL.

Fixes: 12c2d0a5b8e2 ("net: lan966x: add ethtool configuration and statistics")
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: don't leak tagger-owned storage on switch driver unbind

In the initial commit dc452a471dba ("net: dsa: introduce tagger-owned
storage for private and shared data"), we had a call to
tag_ops->disconnect(dst) issued from dsa_tree_free(), which is called at
tree teardown time.

There were problems with connecting to a switch tree as a whole, so this
got reworked to connecting to individual switches within the tree. In
this process, tag_ops->disconnect(ds) was made to be called only from
switch.c (cross-chip notifiers emitted as a result of dynamic tag proto
changes), but the normal driver teardown code path wasn't replaced with
anything.

Solve this problem by adding a function that does the opposite of
dsa_switch_setup_tag_protocol(), which is called from the equivalent
spot in dsa_switch_teardown(). The positioning here also ensures that we
won't have any use-after-free in tagging protocol (*rcv) ops, since the
teardown sequence is as follows:

dsa_tree_teardown
-> dsa_tree_teardown_master
   -> dsa_master_teardown
      -> unsets master->dsa_ptr, making no further packets match the
         ETH_P_XDSA packet type handler
-> dsa_tree_teardown_ports
   -> dsa_port_teardown
      -> dsa_slave_destroy
         -> unregisters DSA net devices, there is even a synchronize_net()
            in unregister_netdevice_many()
-> dsa_tree_teardown_switches
   -> dsa_switch_teardown
      -> dsa_switch_teardown_tag_protocol
         -> finally frees the tagger-owned storage

Fixes: 7f2973149c22 ("net: dsa: make tagging protocols connect to individual switches from a tree")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221114143551.1906361-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/x25: Fix skb leak in x25_lapb_receive_frame()

x25_lapb_receive_frame() using skb_copy() to get a private copy of
skb, the new skb should be freed in the undersized/fragmented skb
error handling path. Otherwise there is a memory leak.

Fixes: cb101ed2c3c7 ("x25: Handle undersized/fragmented skbs")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Martin Schiller <ms@dev.tdt.de>
Link: https://lore.kernel.org/r/20221114110519.514538-1-weiyongjun@huaweicloud.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ag71xx: call phylink_disconnect_phy if ag71xx_hw_enable() fail in ag71xx_open()

If ag71xx_hw_enable() fails, call phylink_disconnect_phy() to clean up.
And if phylink_of_phy_connect() fails, nothing needs to be done.
Compile tested only.

Fixes: 892e09153fa3 ("net: ag71xx: port to phylink")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20221114095549.40342-1-liujian56@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bridge: switchdev: Fix memory leaks when changing VLAN protocol

The bridge driver can offload VLANs to the underlying hardware either
via switchdev or the 8021q driver. When the former is used, the VLAN is
marked in the bridge driver with the 'BR_VLFLAG_ADDED_BY_SWITCHDEV'
private flag.

To avoid the memory leaks mentioned in the cited commit, the bridge
driver will try to delete a VLAN via the 8021q driver if the VLAN is not
marked with the previously mentioned flag.

When the VLAN protocol of the bridge changes, switchdev drivers are
notified via the 'SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL' attribute, but
the 8021q driver is also called to add the existing VLANs with the new
protocol and delete them with the old protocol.

In case the VLANs were offloaded via switchdev, the above behavior is
both redundant and buggy. Redundant because the VLANs are already
programmed in hardware and drivers that support VLAN protocol change
(currently only mlx5) change the protocol upon the switchdev attribute
notification. Buggy because the 8021q driver is called despite these
VLANs being marked with 'BR_VLFLAG_ADDED_BY_SWITCHDEV'. This leads to
memory leaks [1] when the VLANs are deleted.

Fix by not calling the 8021q driver for VLANs that were already
programmed via switchdev.

[1]
unreferenced object 0xffff8881f6771200 (size 256):
  comm "ip", pid 446855, jiffies 4298238841 (age 55.240s)
  hex dump (first 32 bytes):
    00 00 7f 0e 83 88 ff ff 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000012819ac>] vlan_vid_add+0x437/0x750
    [<00000000f2281fad>] __br_vlan_set_proto+0x289/0x920
    [<000000000632b56f>] br_changelink+0x3d6/0x13f0
    [<0000000089d25f04>] __rtnl_newlink+0x8ae/0x14c0
    [<00000000f6276baf>] rtnl_newlink+0x5f/0x90
    [<00000000746dc902>] rtnetlink_rcv_msg+0x336/0xa00
    [<000000001c2241c0>] netlink_rcv_skb+0x11d/0x340
    [<0000000010588814>] netlink_unicast+0x438/0x710
    [<00000000e1a4cd5c>] netlink_sendmsg+0x788/0xc40
    [<00000000e8992d4e>] sock_sendmsg+0xb0/0xe0
    [<00000000621b8f91>] ____sys_sendmsg+0x4ff/0x6d0
    [<000000000ea26996>] ___sys_sendmsg+0x12e/0x1b0
    [<00000000684f7e25>] __sys_sendmsg+0xab/0x130
    [<000000004538b104>] do_syscall_64+0x3d/0x90
    [<0000000091ed9678>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: 279737939a81 ("net: bridge: Fix VLANs memory leak")
Reported-by: Vlad Buslov <vladbu@nvidia.com>
Tested-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20221114084509.860831-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'net-hns3-this-series-bugfix-for-the-hns3-ethernet-driver'

Hao Lan says:

====================
net: hns3: This series bugfix for the HNS3 ethernet driver.

This series includes some bugfix for the HNS3 ethernet driver.
Patch 1# fix incorrect hw rss hash type of rx packet.
Fixes: 796640778c26 ("net: hns3: support RXD advanced layout")
Fixes: 232fc64b6e62 ("net: hns3: Add HW RSS hash information to RX skb")
Fixes: ea4858670717 ("net: hns3: handle the BD info on the last BD of the packet")

Patch 2# fix return value check bug of rx copybreak.
Fixes: e74a726da2c4 ("net: hns3: refactor hns3_nic_reuse_page()")
Fixes: 99f6b5fb5f63 ("net: hns3: use bounce buffer when rx page can not be reused")

Patch 3# net: hns3: fix setting incorrect phy link ksettings
for firmware in resetting process
Fixes: f5f2b3e4dcc0 ("net: hns3: add support for imp-controlled PHYs")
Fixes: c5ef83cbb1e9 ("net: hns3: fix for phy_addr error in hclge_mac_mdio_config")
Fixes: 2312e050f42b ("net: hns3: Fix for deadlock problem occurring when unregistering ae_algo")
====================

Link: https://lore.kernel.org/r/20221114082048.49450-1-lanhao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: hns3: fix setting incorrect phy link ksettings for firmware in resetting process

Currently, if driver is in phy-imp(phy controlled by imp firmware) mode, as
driver did not update phy link ksettings after initialization process or
not update advertising when getting phy link ksettings from firmware, it
may set incorrect phy link ksettings for firmware in resetting process.
So fix it.

Fixes: f5f2b3e4dcc0 ("net: hns3: add support for imp-controlled PHYs")
Fixes: c5ef83cbb1e9 ("net: hns3: fix for phy_addr error in hclge_mac_mdio_config")
Fixes: 2312e050f42b ("net: hns3: Fix for deadlock problem occurring when unregistering ae_algo")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: hns3: fix return value check bug of rx copybreak

The refactoring of rx copybreak modifies the original return logic, which
will make this feature unavailable. So this patch fixes the return logic of
rx copybreak.

Fixes: e74a726da2c4 ("net: hns3: refactor hns3_nic_reuse_page()")
Fixes: 99f6b5fb5f63 ("net: hns3: use bounce buffer when rx page can not be reused")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: hns3: fix incorrect hw rss hash type of rx packet

Currently, the HNS3 driver reports the rss hash type
of each packet based on the rss hash tuples set. It
always reports PKT_HASH_TYPE_L4, without checking the
type of current packet. It's incorrect.
Fixes it by reporting it base on the packet type.

Fixes: 796640778c26 ("net: hns3: support RXD advanced layout")
Fixes: 232fc64b6e62 ("net: hns3: Add HW RSS hash information to RX skb")
Fixes: ea4858670717 ("net: hns3: handle the BD info on the last BD of the packet")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: phy: marvell: add sleep time after enabling the loopback bit

Sleep time is added to ensure the phy to be ready after loopback
bit was set. This to prevent the phy loopback test from failing.

Fixes: 020a45aff119 ("net: phy: marvell: add Marvell specific PHY loopback")
Cc: <stable@vger.kernel.org> # 5.15.x
Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
Signed-off-by: Aminuddin Jamaluddin <aminuddin.jamaluddin@intel.com>
Link: https://lore.kernel.org/r/20221114065302.10625-1-aminuddin.jamaluddin@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: ena: Fix error handling in ena_init()

The ena_init() won't destroy workqueue created by
create_singlethread_workqueue() when pci_register_driver() failed.
Call destroy_workqueue() when pci_register_driver() failed to prevent the
resource leak.

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Link: https://lore.kernel.org/r/20221114025659.124726-1-yuancan@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

kcm: close race conditions on sk_receive_queue

sk->sk_receive_queue is protected by skb queue lock, but for KCM
sockets its RX path takes mux->rx_lock to protect more than just
skb queue. However, kcm_recvmsg() still only grabs the skb queue
lock, so race conditions still exist.

We can teach kcm_recvmsg() to grab mux->rx_lock too but this would
introduce a potential performance regression as struct kcm_mux can
be shared by multiple KCM sockets.

So we have to enforce skb queue lock in requeue_rx_msgs() and handle
skb peek case carefully in kcm_wait_data(). Fortunately,
skb_recv_datagram() already handles it nicely and is widely used by
other sockets, we can just switch to skb_recv_datagram() after
getting rid of the unnecessary sock lock in kcm_recvmsg() and
kcm_splice_read(). Side note: SOCK_DONE is not used by KCM sockets,
so it is safe to get rid of this check too.

I ran the original syzbot reproducer for 30 min without seeing any
issue.

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+278279efdd2730dd14bf@syzkaller.appspotmail.com
Reported-by: shaozhengchao <shaozhengchao@huawei.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Link: https://lore.kernel.org/r/20221114005119.597905-1-xiyou.wangcong@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: ionic: Fix error handling in ionic_init_module()

A problem about ionic create debugfs failed is triggered with the
following log given:

[  415.799514] debugfs: Directory 'ionic' with parent '/' already present!

The reason is that ionic_init_module() returns ionic_bus_register_driver()
directly without checking its return value, if ionic_bus_register_driver()
failed, it returns without destroy the newly created debugfs, resulting
the debugfs of ionic can never be created later.

ionic_init_module()
   ionic_debugfs_create() # create debugfs directory
   ionic_bus_register_driver()
     pci_register_driver()
       driver_register()
         bus_add_driver()
           priv = kzalloc(...) # OOM happened
   # return without destroy debugfs directory

Fix by removing debugfs when ionic_bus_register_driver() returns error.

Fixes: fbfb8031533c ("ionic: Add hardware init and device commands")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Link: https://lore.kernel.org/r/20221113092929.19161-1-yuancan@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: Avoid warnings when not offloaded FDB entry with IPv6 is removed

FDB entries that perform VXLAN encapsulation with an IPv6 underlay hold
a reference on a resource - the KVDL entry where the IPv6 underlay
destination IP is stored. For that, the driver maintains two hash tables:
1. Maps IPv6 to KVDL index
2. Maps {MAC, FID index} to IPv6 address

When a FDB entry is removed, the second table is used to find the relevant
IPv6 address and the first table is used to remove the reference count and
free the index if is not used anymore.

In order for a packet to be forwarded to a single remote VTEP, FDB
entries need to be configured at both the bridge and VXLAN devices' FDB
tables. Both entries are squashed into one {MAC, VLAN/VNI} -> IP entry
in the hardware. Therefore, in case one entry is removed, the entry will
be removed from the hardware and the remaining entry will be unmarked
with 'offload' flag since it is not offloaded anymore.

For example, the two FDB entries should be added to allow packets to be
forwarded via vx10:
$ bridge fdb add dev vx10 aa:bb:cc:dd:ee:ff self static dst 2001:db8:5::1
$ bridge fdb add dev vx10 aa:bb:cc:dd:ee:ff master static vlan 10

When one entry will be removed, the second one will not be offloaded
anymore. When the first entry (in VXLAN FDB) will be removed / will not be
offloaded anymore, the two mappings in IPv6 hash tables will be removed.

In case that the second entry is removed before the first one, unexpected
warnings[1][2] will be shown in user space as a result of removing the
first entry. The issue is that not offloaded entry is removed, the driver
tries to search the relevant entries in the hash tables, does not find them
and therefore warns.

Do not handle removing of not offloaded VXLAN FDB entries, as they were
already removed when the offload flag was removed.

[1]:
WARNING: CPU: 1 PID: 239 at drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c:914 mlxsw_sp_nve_ipv6_addr_map_del+0x6b/0x80 [mlxsw_spectrum]
...
Hardware name: Mellanox Technologies Ltd. Mellanox switch/Mellanox switch, BIOS 4.6.5 05/21/2015
Workqueue: mlxsw_core_ordered mlxsw_sp_switchdev_vxlan_fdb_event_work [mlxsw_spectrum]
RIP: 0010:mlxsw_sp_nve_ipv6_addr_map_del+0x6b/0x80 [mlxsw_spectrum]
...
Call Trace:
  <TASK>
  mlxsw_sp_port_fdb_tunnel_uc_op+0x6cf/0x7b0 [mlxsw_spectrum]
  mlxsw_sp_switchdev_vxlan_fdb_event_work+0x17c/0x420 [mlxsw_spectrum]
  ? finish_task_switch.isra.0+0x8c/0x290
  process_one_work+0x1cd/0x390
  worker_thread+0x48/0x3c0
  ? process_one_work+0x390/0x390
  kthread+0xe0/0x110
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>

[2]:
WARNING: CPU: 0 PID: 239 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3035 mlxsw_sp_ipv6_addr_put+0x142/0x220 [mlxsw_spectrum]
...
Hardware name: Mellanox Technologies Ltd. Mellanox switch/Mellanox switch, BIOS 4.6.5 05/21/2015
Workqueue: mlxsw_core_ordered mlxsw_sp_switchdev_vxlan_fdb_event_work [mlxsw_spectrum]
RIP: 0010:mlxsw_sp_ipv6_addr_put+0x142/0x220 [mlxsw_spectrum]
...
Call Trace:
  <TASK>
  ? mlxsw_sp_port_fdb_tun_uc_op6_sfd_write+0x5c1/0x610 [mlxsw_spectrum]
  mlxsw_sp_port_fdb_tunnel_uc_op+0x6ec/0x7b0 [mlxsw_spectrum]
  mlxsw_sp_switchdev_vxlan_fdb_event_work+0x17c/0x420 [mlxsw_spectrum]
  ? finish_task_switch.isra.0+0x8c/0x290
  process_one_work+0x1cd/0x390
  worker_thread+0x48/0x3c0
  ? process_one_work+0x390/0x390
  kthread+0xe0/0x110
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>

Fixes: 0860c7641634 ("mlxsw: spectrum_nve: Keep track of IPv6 addresses used by FDB entries")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/c186de8cbd28e3eb661e06f31f7f2f2dff30020f.1668184350.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: make dsa_master_ioctl() see through port_hwtstamp_get() shims

There are multi-generational drivers like mv88e6xxx which have code like
this:

int mv88e6xxx_port_hwtstamp_get(struct dsa_switch *ds, int port,
struct ifreq *ifr)
{
if (!chip->info->ptp_support)
return -EOPNOTSUPP;

...
}

DSA wants to deny PTP timestamping on the master if the switch supports
timestamping too. However it currently relies on the presence of the
port_hwtstamp_get() callback to determine PTP capability, and this
clearly does not work in that case (method is present but returns
-EOPNOTSUPP).

We should not deny PTP on the DSA master for those switches which truly
do not support hardware timestamping.

Create a dsa_port_supports_hwtstamp() method which actually probes for
support by calling port_hwtstamp_get() and seeing whether that returned
-EOPNOTSUPP or not.

Fixes: f685e609a301 ("net: dsa: Deny PTP on master if switch supports it")
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20221110124345.3901389-1-festevam@gmail.com/
Reported-by: Fabio Estevam <festevam@gmail.com>
Reported-by: Steffen Bätz <steffen@innosonix.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mhi: Fix memory leak in mhi_net_dellink()

MHI driver registers network device without setting the
needs_free_netdev flag, and does NOT call free_netdev() when
unregisters network device, which causes a memory leak.

This patch calls free_netdev() to fix it since netdev_priv
is used after unregister.

Fixes: 13adac032982 ("net: mhi_net: Register wwan_ops for link creation")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'octeon_ep-fixes'

Ziyang Xuan says:

====================
octeon_ep: fix several bugs in exception paths

Find several obvious bugs during code review in exception paths. Provide
this patchset to fix them. Not tested, just compiled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

octeon_ep: ensure get mac address successfully before eth_hw_addr_set()

octep_get_mac_addr() can fail because send mbox message failed. If this
happens, octep_dev->mac_addr will be zero. It should not continue to
initialize. Add exception handling for octep_get_mac_addr() to fix it.

Fixes: 862cd659a6fb ("octeon_ep: Add driver framework and device initialization")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeon_ep: fix potential memory leak in octep_device_setup()

When occur unsupported_dev and mbox init errors, it did not free oct->conf
and iounmap() oct->mmio[i].hw_addr. That would trigger memory leak problem.
Add kfree() for oct->conf and iounmap() for oct->mmio[i].hw_addr under
unsupported_dev and mbox init errors to fix the problem.

Fixes: 862cd659a6fb ("octeon_ep: Add driver framework and device initialization")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeon_ep: ensure octep_get_link_status() successfully before octep_link_up()

octep_get_link_status() can fail because send mbox message failed, then
octep_get_link_status() will return ret less than 0. Excute octep_link_up()
as long as ret is not equal to 0 in octep_open() now. That is not correct.

The value type of link.state is enum octep_ctrl_net_state. Positive value
represents up. Excute octep_link_up() when ret is bigger than 0.

Fixes: 862cd659a6fb ("octeon_ep: Add driver framework and device initialization")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeon_ep: delete unnecessary napi rollback under set_queues_err in octep_open()

octep_napi_add() and octep_napi_enable() are all after
netif_set_real_num_{tx,rx}_queues() in octep_open(), so it is unnecessary
napi rollback under set_queues_err. Delete them to fix it.

Fixes: 37d79d059606 ("octeon_ep: add Tx/Rx processing and interrupt support")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Remove debugfs when pci_register_driver failed

When pci_register_driver failed, we need to remove debugfs,
which will caused a resource leak, fix it.

Resource leak logs as follows:
[ 52.184456] debugfs: Directory 'bnxt_en' with parent '/' already present!

Fixes: cabfb09d87bd ("bnxt_en: add debugfs support for DIM")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: caif: fix double disconnect client in chnl_net_open()

When connecting to client timeout, disconnect client for twice in
chnl_net_open(). Remove one. Compile tested only.

Fixes: 2aa40aef9deb ("caif: Use link layer MTU instead of fixed MTU")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macvlan: Use built-in RCU list checking

hlist_for_each_entry_rcu() has built-in RCU and lock checking.

Pass cond argument to hlist_for_each_entry_rcu() to silence false
lockdep warning when CONFIG_PROVE_RCU_LIST is enabled.

Execute as follow:

ip link add link eth0 type macvlan mode source macaddr add <MAC-ADDR>

The rtnl_lock is held when macvlan_hash_lookup_source() or
macvlan_fill_info_macaddr() are called in the non-RCU read side section.
So, pass lockdep_rtnl_is_held() to silence false lockdep warning.

Fixes: 79cf79abce71 ("macvlan: add source mode")
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mISDN: fix misuse of put_device() in mISDN_register_device()

We should not release reference by put_device() before calling device_initialize().

Fixes: e7d1d4d9ac0d ("mISDN: fix possible memory leak in mISDN_register_device()")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: liquidio: release resources when liquidio driver open failed

When liquidio driver open failed, it doesn't release resources. Compile
tested only.

Fixes: 5b07aee11227 ("liquidio: MSIX support for CN23XX")
Fixes: dbc97bfd3918 ("net: liquidio: Add missing null pointer checks")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp i2c: don't count unused / invalid keys for flow release

We're currently hitting the WARN_ON in mctp_i2c_flow_release:

if (midev->release_count > midev->i2c_lock_count) {
WARN_ONCE(1, "release count overflow");

This may be hit if we expire a flow before sending the first packet it
contains - as we will not be pairing the increment of release_count
(performed on flow release) with the i2c lock operation (only
performed on actual TX).

To fix this, only release a flow if we've encountered it previously (ie,
dev_flow_state does not indicate NEW), as we will mark the flow as
ACTIVE at the same time as accounting for the i2c lock operation. We
also need to add an INVALID flow state, to indicate when we've done the
release.

Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
Reported-by: Jian Zhang <zhangjian.3032@bytedance.com>
Tested-by: Jian Zhang <zhangjian.3032@bytedance.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20221110053135.329071-1-jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/tls: Fix memory leak in tls_enc_skb() and tls_sw_fallback_init()

'aead_req' and 'aead_send' is allocated but not freed in default switch
case. This commit fixes the potential memory leak by freeing them under
the situation.

Note that the default cases here should never be reached as they'd
mean we allowed offloading an unsupported algorithm.

Fixes: ea7a9d88ba21 ("net/tls: Use cipher sizes structs")
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://lore.kernel.org/r/20221110090329.2036382-1-liaoyu15@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: ensure tx function is not running in stmmac_xdp_release()

When stmmac_xdp_release() is called, there is a possibility that tx
function is still running on other queues which will lead to tx queue
timed out and reset adapter.

This commit ensure that tx function is not running xdp before release
flow continue to run.

Fixes: ac746c8520d9 ("net: stmmac: enhance XDP ZC driver level switching performance")
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Mohd Faizal Abdul Rahim <faizal.abdul.rahim@intel.com>
Signed-off-by: Noor Azura Ahmad Tarmizi <noor.azura.ahmad.tarmizi@intel.com>
Link: https://lore.kernel.org/r/20221110064552.22504-1-noor.azura.ahmad.tarmizi@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: dp83867: Fix SGMII FIFO depth for non OF devices

Current driver code will read device tree node information,
and set default values if there is no info provided.

This is not done in non-OF devices leading to SGMII fifo depths being
set to the smallest size.

This patch sets the value to the default value of the PHY as stated in the
PHY datasheet.

Fixes: 4dc08dcc9f6f ("net: phy: dp83867: introduce critical chip default init for non-of platform")
Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20221110054938.925347-1-michael.wei.hong.sit@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: hinic: Fix error handling in hinic_module_init()

A problem about hinic create debugfs failed is triggered with the
following log given:

[  931.419023] debugfs: Directory 'hinic' with parent '/' already present!

The reason is that hinic_module_init() returns pci_register_driver()
directly without checking its return value, if pci_register_driver()
failed, it returns without destroy the newly created debugfs, resulting
the debugfs of hinic can never be created later.

hinic_module_init()
   hinic_dbg_register_debugfs() # create debugfs directory
   pci_register_driver()
     driver_register()
       bus_add_driver()
         priv = kzalloc(...) # OOM happened
   # return without destroy debugfs directory

Fix by removing debugfs when pci_register_driver() returns error.

Fixes: 253ac3a97921 ("hinic: add support to query sq info")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221110021642.80378-1-yuancan@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mISDN: fix possible memory leak in mISDN_dsp_element_register()

Afer commit 1fa5ae857bb1 ("driver core: get rid of struct device's
bus_id string array"), the name of device is allocated dynamically,
use put_device() to give up the reference, so that the name can be
freed in kobject_cleanup() when the refcount is 0.

The 'entry' is going to be freed in mISDN_dsp_dev_release(), so the
kfree() is removed. list_del() is called in mISDN_dsp_dev_release(),
so it need be initialized.

Fixes: 1fa5ae857bb1 ("driver core: get rid of struct device's bus_id string array")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221109132832.3270119-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bgmac: Drop free_netdev() from bgmac_enet_remove()

netdev is allocated in bgmac_alloc() with devm_alloc_etherdev() and will
be auto released in ->remove and ->probe failure path. Using free_netdev()
in bgmac_enet_remove() leads to double free.

Fixes: 34a5102c3235 ("net: bgmac: allocate struct bgmac just once & don't copy it")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20221109150136.2991171-1-weiyongjun@huaweicloud.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge https://git./linux/kernel/git/bpf/bpf

Andrii Nakryiko says:

====================
bpf 2022-11-11

We've added 11 non-merge commits during the last 8 day(s) which contain
a total of 11 files changed, 83 insertions(+), 74 deletions(-).

The main changes are:

1) Fix strncpy_from_kernel_nofault() to prevent out-of-bounds writes,
   from Alban Crequy.

2) Fix for bpf_prog_test_run_skb() to prevent wrong alignment,
   from Baisong Zhong.

3) Switch BPF_DISPATCHER to static_call() instead of ftrace infra, with
   a small build fix on top, from Peter Zijlstra and Nathan Chancellor.

4) Fix memory leak in BPF verifier in some error cases, from Wang Yufen.

5) 32-bit compilation error fixes for BPF selftests, from Pu Lehui and
   Yang Jihong.

6) Ensure even distribution of per-CPU free list elements, from Xu Kuohai.

7) Fix copy_map_value() to track special zeroed out areas properly,
   from Xu Kuohai.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix offset calculation error in __copy_map_value and zero_map_value
  bpf: Initialize same number of free nodes for each pcpu_freelist
  selftests: bpf: Add a test when bpf_probe_read_kernel_str() returns EFAULT
  maccess: Fix writing offset in case of fault in strncpy_from_kernel_nofault()
  selftests/bpf: Fix test_progs compilation failure in 32-bit arch
  selftests/bpf: Fix casting error when cross-compiling test_verifier for 32-bit platforms
  bpf: Fix memory leaks in __check_func_call
  bpf: Add explicit cast to 'void *' for __BPF_DISPATCHER_UPDATE()
  bpf: Convert BPF_DISPATCHER to use static_call() (not ftrace)
  bpf: Revert ("Fix dispatcher patchable function entry to 5 bytes nop")
  bpf, test_run: Fix alignment problem in bpf_prog_test_run_skb()
====================

Link: https://lore.kernel.org/r/20221111231624.938829-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bpf: Fix offset calculation error in __copy_map_value and zero_map_value

Function __copy_map_value and zero_map_value miscalculated copy offset,
resulting in possible copy of unwanted data to user or kernel.

Fix it.

Fixes: cc48755808c6 ("bpf: Add zero_map_value to zero map value with special fields")
Fixes: 4d7d7f69f4b1 ("bpf: Adapt copy_map_value for multiple offset case")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20221111125620.754855-1-xukuohai@huaweicloud.com

bpf: Initialize same number of free nodes for each pcpu_freelist

pcpu_freelist_populate() initializes nr_elems / num_possible_cpus() + 1
free nodes for some CPUs, and then possibly one CPU with fewer nodes,
followed by remaining cpus with 0 nodes. For example, when nr_elems == 256
and num_possible_cpus() == 32, CPU 0~27 each gets 9 free nodes, CPU 28 gets
4 free nodes, CPU 29~31 get 0 free nodes, while in fact each CPU should get
8 nodes equally.

This patch initializes nr_elems / num_possible_cpus() free nodes for each
CPU firstly, then allocates the remaining free nodes by one for each CPU
until no free nodes left.

Fixes: e19494edab82 ("bpf: introduce percpu_freelist")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221110122128.105214-1-xukuohai@huawei.com

Merge branch 'Fix offset when fault occurs in strncpy_from_kernel_nofault()'

Alban Crequy says:

====================

Hi,

This is v2 of the fix & selftest previously sent at:
https://lore.kernel.org/linux-mm/20221108195211.214025-1-flaniel@linux.microsoft.com/

Changes v1 to v2:
- add 'cc:stable', 'Fixes:' and review/ack tags
- update commitmsg and fix my email
- rebase on bpf tree and tag for bpf tree

Thanks!
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

selftests: bpf: Add a test when bpf_probe_read_kernel_str() returns EFAULT

This commit tests previous fix of bpf_probe_read_kernel_str().

The BPF helper bpf_probe_read_kernel_str should return -EFAULT when
given a bad source pointer and the target buffer should only be modified
to make the string NULL terminated.

bpf_probe_read_kernel_str() was previously inserting a NULL before the
beginning of the dst buffer. This test should ensure that the
implementation stays correct for now on.

Without the fix, this test will fail as follows:
  $ cd tools/testing/selftests/bpf
  $ make
  $ sudo ./test_progs --name=varlen
  ...
  test_varlen:FAIL:check got 0 != exp 66

Signed-off-by: Alban Crequy <albancrequy@linux.microsoft.com>
Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221110085614.111213-3-albancrequy@linux.microsoft.com
Changes v1 to v2:
- add ack tag
- fix my email
- rebase on bpf tree and tag for bpf tree

maccess: Fix writing offset in case of fault in strncpy_from_kernel_nofault()

If a page fault occurs while copying the first byte, this function resets one
byte before dst.
As a consequence, an address could be modified and leaded to kernel crashes if
case the modified address was accessed later.

Fixes: b58294ead14c ("maccess: allow architectures to provide kernel probing directly")
Signed-off-by: Alban Crequy <albancrequy@linux.microsoft.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Francis Laniel <flaniel@linux.microsoft.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org> [5.8]
Link: https://lore.kernel.org/bpf/20221110085614.111213-2-albancrequy@linux.microsoft.com

nfp: change eeprom length to max length enumerators

Extend the size of QSFP EEPROM for types SSF8436 and SFF8636
from 256 to 640 bytes in order to expose all the EEPROM pages by
ethtool.

For SFF-8636 and SFF-8436 specifications, the driver exposes
256 bytes of EEPROM data for ethtool's get_module_eeprom()
callback, resulting in "netlink error: Invalid argument" when
an EEPROM read with an offset larger than 256 bytes is attempted.

Changing the length enumerators to the _MAX_LEN
variants exposes all 640 bytes of the EEPROM allowing upper
pages 1, 2 and 3 to be read.

Fixes: 96d971e307cc ("ethtool: Add fallback to get_module_eeprom from netlink command")
Signed-off-by: Jaco Coetzee <jaco.coetzee@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'net-6.1-rc5' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, wifi, can and bpf.

  Current release - new code bugs:

   - can: af_can: can_exit(): add missing dev_remove_pack() of
     canxl_packet

  Previous releases - regressions:

   - bpf, sockmap: fix the sk->sk_forward_alloc warning

   - wifi: mac80211: fix general-protection-fault in
     ieee80211_subif_start_xmit()

   - can: af_can: fix NULL pointer dereference in can_rx_register()

   - can: dev: fix skb drop check, avoid o-o-b access

   - nfnetlink: fix potential dead lock in nfnetlink_rcv_msg()

  Previous releases - always broken:

   - bpf: fix wrong reg type conversion in release_reference()

   - gso: fix panic on frag_list with mixed head alloc types

   - wifi: brcmfmac: fix buffer overflow in brcmf_fweh_event_worker()

   - wifi: mac80211: set TWT Information Frame Disabled bit as 1

   - eth: macsec offload related fixes, make sure to clear the keys from
     memory

   - tun: fix memory leaks in the use of napi_get_frags

   - tun: call napi_schedule_prep() to ensure we own a napi

   - tcp: prohibit TCP_REPAIR_OPTIONS if data was already sent

   - ipv6: addrlabel: fix infoleak when sending struct ifaddrlblmsg to
     network

   - tipc: fix a msg->req tlv length check

   - sctp: clear out_curr if all frag chunks of current msg are pruned,
     avoid list corruption

   - mctp: fix an error handling path in mctp_init(), avoid leaks"

* tag 'net-6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
  eth: sp7021: drop free_netdev() from spl2sw_init_netdev()
  MAINTAINERS: Move Vivien to CREDITS
  net: macvlan: fix memory leaks of macvlan_common_newlink
  ethernet: tundra: free irq when alloc ring failed in tsi108_open()
  net: mv643xx_eth: disable napi when init rxq or txq failed in mv643xx_eth_open()
  ethernet: s2io: disable napi when start nic failed in s2io_card_up()
  net: atlantic: macsec: clear encryption keys from the stack
  net: phy: mscc: macsec: clear encryption keys when freeing a flow
  stmmac: dwmac-loongson: fix missing of_node_put() while module exiting
  stmmac: dwmac-loongson: fix missing pci_disable_device() in loongson_dwmac_probe()
  stmmac: dwmac-loongson: fix missing pci_disable_msi() while module exiting
  cxgb4vf: shut down the adapter when t4vf_update_port_info() failed in cxgb4vf_open()
  mctp: Fix an error handling path in mctp_init()
  stmmac: intel: Update PCH PTP clock rate from 200MHz to 204.8MHz
  net: cxgb3_main: disable napi when bind qsets failed in cxgb_up()
  net: cpsw: disable napi in cpsw_ndo_open()
  iavf: Fix VF driver counting VLAN 0 filters
  ice: Fix spurious interrupt during removal of trusted VF
  net/mlx5e: TC, Fix slab-out-of-bounds in parse_tc_actions
  net/mlx5e: E-Switch, Fix comparing termination table instance
  ...

Merge tag 'mlx5-fixes-2022-11-09' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2022-11-02

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2022-11-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: TC, Fix slab-out-of-bounds in parse_tc_actions
  net/mlx5e: E-Switch, Fix comparing termination table instance
  net/mlx5e: TC, Fix wrong rejection of packet-per-second policing
  net/mlx5e: Fix tc acts array not to be dependent on enum order
  net/mlx5e: Fix usage of DMA sync API
  net/mlx5e: Add missing sanity checks for max TX WQE size
  net/mlx5: fw_reset: Don't try to load device in case PCI isn't working
  net/mlx5: E-switch, Set to legacy mode if failed to change switchdev mode
  net/mlx5: Allow async trigger completion execution on single CPU systems
  net/mlx5: Bridge, verify LAG state when adding bond to bridge
====================

Link: https://lore.kernel.org/r/20221109184050.108379-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-11-09 (ice, iavf)

This series contains updates to ice and iavf drivers.

Norbert stops disabling VF queues that are not enabled for ice driver.

Michal stops accounting of VLAN 0 filter to match expectations of PF
driver for iavf.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
iavf: Fix VF driver counting VLAN 0 filters
ice: Fix spurious interrupt during removal of trusted VF
====================

Link: https://lore.kernel.org/r/20221110003744.201414-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

eth: sp7021: drop free_netdev() from spl2sw_init_netdev()

It's not necessary to free netdev allocated with devm_alloc_etherdev()
and using free_netdev() leads to double free.

Fixes: fd3040b9394c ("net: ethernet: Add driver for Sunplus SP7021")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20221109150116.2988194-1-weiyongjun@huaweicloud.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

MAINTAINERS: Move Vivien to CREDITS

Last patch from Vivien was nearly 3 years ago and he has not reviewed or
responded to DSA patches since then, move to CREDITS.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20221109231907.621678-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'for-6.1-rc4-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- revert memory optimization for scrub blocks, this misses errors in
   2nd and following blocks

- add exception for ENOMEM as reason for transaction abort to not print
   stack trace, syzbot has reported many

- zoned fixes:
      - fix locking imbalance during scrub
      - initialize zones for seeding device
      - initialize zones for cloned device structures

- when looking up device, change assertion to a real check as some of
   the search parameters can be passed by ioctl, reported by syzbot

- fix error pointer check in self tests

* tag 'for-6.1-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: zoned: fix locking imbalance on scrub
  btrfs: zoned: initialize device's zone info for seeding
  btrfs: zoned: clone zoned device info when cloning a device
  Revert "btrfs: scrub: use larger block size for data extent scrub"
  btrfs: don't print stack trace when transaction is aborted due to ENOMEM
  btrfs: selftests: fix wrong error check in btrfs_free_dummy_root()
  btrfs: fix match incorrectly in dev_args_match_device

Merge tag 'soundwire-6.1-fixes' of git://git./linux/kernel/git/vkoul/soundwire

Pull soundwire fixes from Vinod Koul:
"Two qcom driver fixes for broadcast completion reinit and check for
  outanding writes. And a lone Intel driver fix for clock stop timeout"

* tag 'soundwire-6.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: qcom: check for outanding writes before doing a read
  soundwire: qcom: reinit broadcast completion
  soundwire: intel: Initialize clock stop timeout

Merge tag 'phy-fixes-6.1' of git://git./linux/kernel/git/phy/linux-phy

Pull phy fixes from Vinod Koul:
"A bunch of odd driver fixes and a MAINTAINER email update:

   - Update Kishon's email

   - stms32 error code fix in driver probe

   - tegra: fix for checking valid pointer

   - qcom_qmp: null deref fix

   - sunplus: error check fix

   - ralink: add missing sentinel to table"

* tag 'phy-fixes-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: ralink: mt7621-pci: add sentinel to quirks table
  phy: sunplus: Fix an IS_ERR() vs NULL bug in sp_usb_phy_probe
  phy: qcom-qmp-combo: fix NULL-deref on runtime resume
  phy: tegra: xusb: Fix crash during pad power on/down
  phy: stm32: fix an error code in probe
  MAINTAINERS: Update Kishon's email address in GENERIC PHY FRAMEWORK

Merge tag 'hwlock-v6.1' of git://git./linux/kernel/git/remoteproc/linux

Pull hwspinlock updates from Bjorn Andersson:
"I apparently had missed tagging and sending this set of changes out
  during the 6.1 merge window. But did get the associated dts changes
  depending on this merged. The result is a regression in 6.1-rc on the
  affected, older, Qualcomm platforms - in for form of them not booting.

  So while these weren't regression fixes originally, they are now. It's
  not introducing new beahavior, but simply extending the existing new
  Devicetree model, to cover remaining platforms:

   - extend the DeviceTree binding and implementation for the Qualcomm
     hardware spinlock on some older platforms to follow the style of
     the newer ones where the DeviceTree representation does not rely on
     an intermediate syscon node"

* tag 'hwlock-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  dt-bindings: hwlock: qcom-hwspinlock: add syscon to MSM8974
  hwspinlock: qcom: add support for MMIO on older SoCs
  hwspinlock: qcom: correct MMIO max register for newer SoCs
  dt-bindings: hwlock: qcom-hwspinlock: correct example indentation
  dt-bindings: hwlock: qcom-hwspinlock: add support for MMIO on older SoCs

net: macvlan: fix memory leaks of macvlan_common_newlink

kmemleak reports memory leaks in macvlan_common_newlink, as follows:

ip link add link eth0 name .. type macvlan mode source macaddr add
<MAC-ADDR>

kmemleak reports:

unreferenced object 0xffff8880109bb140 (size 64):
  comm "ip", pid 284, jiffies 4294986150 (age 430.108s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 b8 aa 5a 12 80 88 ff ff  ..........Z.....
    80 1b fa 0d 80 88 ff ff 1e ff ac af c7 c1 6b 6b  ..............kk
  backtrace:
    [<ffffffff813e06a7>] kmem_cache_alloc_trace+0x1c7/0x300
    [<ffffffff81b66025>] macvlan_hash_add_source+0x45/0xc0
    [<ffffffff81b66a67>] macvlan_changelink_sources+0xd7/0x170
    [<ffffffff81b6775c>] macvlan_common_newlink+0x38c/0x5a0
    [<ffffffff81b6797e>] macvlan_newlink+0xe/0x20
    [<ffffffff81d97f8f>] __rtnl_newlink+0x7af/0xa50
    [<ffffffff81d98278>] rtnl_newlink+0x48/0x70
    ...

In the scenario where the macvlan mode is configured as 'source',
macvlan_changelink_sources() will be execured to reconfigure list of
remote source mac addresses, at the same time, if register_netdevice()
return an error, the resource generated by macvlan_changelink_sources()
is not cleaned up.

Using this patch, in the case of an error, it will execute
macvlan_flush_sources() to ensure that the resource is cleaned up.

Fixes: aa5fd0fb7748 ("driver: macvlan: Destroy new macvlan port if macvlan_common_newlink failed.")
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Link: https://lore.kernel.org/r/20221109090735.690500-1-nashuiliang@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

ethernet: tundra: free irq when alloc ring failed in tsi108_open()

When alloc tx/rx ring failed in tsi108_open(), it doesn't free irq. Fix
it.

Fixes: 5e123b844a1c ("[PATCH] Add tsi108/9 On Chip Ethernet device driver support")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109044016.126866-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: mv643xx_eth: disable napi when init rxq or txq failed in mv643xx_eth_open()

When failed to init rxq or txq in mv643xx_eth_open() for opening device,
napi isn't disabled. When open mv643xx_eth device next time, it will
trigger a BUG_ON() in napi_enable(). Compile tested only.

Fixes: 2257e05c1705 ("mv643xx_eth: get rid of receive-side locking")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109025432.80900-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

ethernet: s2io: disable napi when start nic failed in s2io_card_up()

When failed to start nic or add interrupt service routine in
s2io_card_up() for opening device, napi isn't disabled. When open
s2io device next time, it will trigger a BUG_ON()in napi_enable().
Compile tested only.

Fixes: 5f490c968056 ("S2io: Fixed synchronization between scheduling of napi with card reset and close")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109023741.131552-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'macsec-clear-encryption-keys-in-h-w-drivers'

Antoine Tenart says:

====================
macsec: clear encryption keys in h/w drivers

Commit aaab73f8fba4 ("macsec: clear encryption keys from the stack after
setting up offload") made sure to clean encryption keys from the stack
after setting up offloading but some h/w drivers did a copy of the key
which need to be zeroed as well.

The MSCC PHY driver can actually be converted not to copy the encryption
key at all, but such patch would be quite difficult to backport. I'll
send a following up patch doing this in net-next once this series lands.

Tested on the MSCC PHY but not on the atlantic NIC.
====================

Link: https://lore.kernel.org/r/20221108153459.811293-1-atenart@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: atlantic: macsec: clear encryption keys from the stack

Commit aaab73f8fba4 ("macsec: clear encryption keys from the stack after
setting up offload") made sure to clean encryption keys from the stack
after setting up offloading, but the atlantic driver made a copy and did
not clear it. Fix this.

[4 Fixes tags below, all part of the same series, no need to split this]

Fixes: 9ff40a751a6f ("net: atlantic: MACSec ingress offload implementation")
Fixes: b8f8a0b7b5cb ("net: atlantic: MACSec ingress offload HW bindings")
Fixes: 27736563ce32 ("net: atlantic: MACSec egress offload implementation")
Fixes: 9d106c6dd81b ("net: atlantic: MACSec egress offload HW bindings")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: phy: mscc: macsec: clear encryption keys when freeing a flow

Commit aaab73f8fba4 ("macsec: clear encryption keys from the stack after
setting up offload") made sure to clean encryption keys from the stack
after setting up offloading, but the MSCC PHY driver made a copy, kept
it in the flow data and did not clear it when freeing a flow. Fix this.

Fixes: 28c5107aa904 ("net: phy: mscc: macsec support")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'stmmac-dwmac-loongson-fixes-three-leaks'

Yang Yingliang says:

====================
stmmac: dwmac-loongson: fixes three leaks

patch #2 fixes missing pci_disable_device() in the error path in probe()
patch #1 and pach #3 fix missing pci_disable_msi() and of_node_put() in
error and remove() path.
====================

Link: https://lore.kernel.org/r/20221108114647.4144952-1-yangyingliang@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

stmmac: dwmac-loongson: fix missing of_node_put() while module exiting

The node returned by of_get_child_by_name() with refcount decremented,
of_node_put() needs be called when finish using it. So add it in the
error path in loongson_dwmac_probe() and in loongson_dwmac_remove().

Fixes: 2ae34111fe4e ("stmmac: dwmac-loongson: fix invalid mdio_node")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

stmmac: dwmac-loongson: fix missing pci_disable_device() in loongson_dwmac_probe()

Add missing pci_disable_device() in the error path in loongson_dwmac_probe().

Fixes: 30bba69d7db4 ("stmmac: pci: Add dwmac support for Loongson")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

stmmac: dwmac-loongson: fix missing pci_disable_msi() while module exiting

pci_enable_msi() has been called in loongson_dwmac_probe(),
so pci_disable_msi() needs be called in remove path and error
path of probe().

Fixes: 30bba69d7db4 ("stmmac: pci: Add dwmac support for Loongson")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

cxgb4vf: shut down the adapter when t4vf_update_port_info() failed in cxgb4vf_open()

When t4vf_update_port_info() failed in cxgb4vf_open(), resources applied
during adapter goes up are not cleared. Fix it. Only be compiled, not be
tested.

Fixes: 18d79f721e0a ("cxgb4vf: Update port information in cxgb4vf_open()")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109012100.99132-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mctp: Fix an error handling path in mctp_init()

If mctp_neigh_init() return error, the routes resources should
be released in the error handling path. Otherwise some resources
leak.

Fixes: 4d8b9319282a ("mctp: Add neighbour implementation")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20221108095517.620115-1-weiyongjun@huaweicloud.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

stmmac: intel: Update PCH PTP clock rate from 200MHz to 204.8MHz

Current Intel platform has an output of ~976ms interval
when probed on 1 Pulse-per-Second(PPS) hardware pin.

The correct PTP clock frequency for PCH GbE should be 204.8MHz
instead of 200MHz. PSE GbE PTP clock rate remains at 200MHz.

Fixes: 58da0cfa6cf1 ("net: stmmac: create dwmac-intel.c to contain all Intel platform")
Signed-off-by: Ling Pei Lee <pei.lee.ling@intel.com>
Signed-off-by: Tan, Tee Min <tee.min.tan@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Gan Yi Fang <yi.fang.gan@intel.com>
Link: https://lore.kernel.org/r/20221108020811.12919-1-yi.fang.gan@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: cxgb3_main: disable napi when bind qsets failed in cxgb_up()

When failed to bind qsets in cxgb_up() for opening device, napi isn't
disabled. When open cxgb3 device next time, it will trigger a BUG_ON()
in napi_enable(). Compile tested only.

Fixes: 48c4b6dbb7e2 ("cxgb3 - fix port up/down error path")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109021451.121490-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: cpsw: disable napi in cpsw_ndo_open()

When failed to create xdp rxqs or fill rx channels in cpsw_ndo_open() for
opening device, napi isn't disabled. When open cpsw device next time, it
will report a invalid opcode issue. Compiled tested only.

Fixes: d354eb85d618 ("drivers: net: cpsw: dual_emac: simplify napi usage")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109011537.96975-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

iavf: Fix VF driver counting VLAN 0 filters

VF driver mistakenly counts VLAN 0 filters, when no PF driver
counts them.
Do not count VLAN 0 filters, when VLAN_V2 is engaged.
Counting those filters in, will affect filters size by -1, when
sending batched VLAN addition message.

Fixes: 968996c070ef ("iavf: Fix VLAN_V2 addition/rejection")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ice: Fix spurious interrupt during removal of trusted VF

Previously, during removal of trusted VF when VF is down there was
number of spurious interrupt equal to number of queues on VF.

Add check if VF already has inactive queues. If VF is disabled and
has inactive rx queues then do not disable rx queues.
Add check in ice_vsi_stop_tx_ring if it's VF's vsi and if VF is
disabled.

Fixes: efe41860008e ("ice: Fix memory corruption in VF driver")
Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

Merge tag 'slab-for-6.1-rc4-fixes' of git://git./linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:
"Most are small fixups as described below.

  The !CONFIG_TRACING fix is a bit bigger and would normally be done in
  the next merge window as part of upcoming hardening changes. But we
  realized it can make the kmalloc waste tracking introduced in this
  window inaccurate, so decided to go with it now.

  Summary:

   - Remove !CONFIG_TRACING kmalloc() wrappers intended to save a
     function call, due to incompatilibity with recently introduced
     wasted space tracking and planned hardening changes.

   - A tracing parameter regression fix, by Kees Cook.

   - Two kernel-doc warning fixups, by Lukas Bulwahn and myself

* tag 'slab-for-6.1-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm, slab: remove duplicate kernel-doc comment for ksize()
  mm/slab_common: Restore passing "caller" for tracing
  mm/slab: remove !CONFIG_TRACING variants of kmalloc_[node_]trace()
  mm/slab_common: repair kernel-doc for __ksize()

selftests/bpf: Fix test_progs compilation failure in 32-bit arch

test_progs fails to be compiled in the 32-bit arch, log is as follows:

  test_progs.c:1013:52: error: format '%ld' expects argument of type 'long int', but argument 3 has type 'size_t' {aka 'unsigned int'} [-Werror=format=]
   1013 |                 sprintf(buf, "MSG_TEST_LOG (cnt: %ld, last: %d)",
        |                                                  ~~^
        |                                                    |
        |                                                    long int
        |                                                  %d
   1014 |                         strlen(msg->test_log.log_buf),
        |                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        |                         |
        |                         size_t {aka unsigned int}

Fix it.

Fixes: 91b2c0afd00c ("selftests/bpf: Add parallelism to test_progs")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221108015857.132457-1-yangjihong1@huawei.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

selftests/bpf: Fix casting error when cross-compiling test_verifier for 32-bit platforms

When cross-compiling test_verifier for 32-bit platforms, the casting error is shown below:

test_verifier.c:1263:27: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
1263 | info.xlated_prog_insns = (__u64)*buf;
| ^
cc1: all warnings being treated as errors

Fix it by adding zero-extension for it.

Fixes: 933ff53191eb ("selftests/bpf: specify expected instructions in test_verifier tests")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221108121945.4104644-1-pulehui@huaweicloud.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

net/mlx5e: TC, Fix slab-out-of-bounds in parse_tc_actions

esw_attr is only allocated if namespace is fdb.

BUG: KASAN: slab-out-of-bounds in parse_tc_actions+0xdc6/0x10e0 [mlx5_core]
Write of size 4 at addr ffff88815f185b04 by task tc/2135

CPU: 5 PID: 2135 Comm: tc Not tainted 6.1.0-rc2+ #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x57/0x7d
print_report+0x170/0x471
? parse_tc_actions+0xdc6/0x10e0 [mlx5_core]
kasan_report+0xbc/0xf0
? parse_tc_actions+0xdc6/0x10e0 [mlx5_core]
parse_tc_actions+0xdc6/0x10e0 [mlx5_core]

Fixes: 94d651739e17 ("net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: E-Switch, Fix comparing termination table instance

The pkt_reformat pointer being saved under flow_act and not
dest attribute in the termination table instance.
Fix the comparison pointers.

Also fix returning success if one pkt_reformat pointer is null
and the other is not.

Fixes: 249ccc3c95bd ("net/mlx5e: Add support for offloading traffic from uplink to uplink")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Chris Mi <cmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: TC, Fix wrong rejection of packet-per-second policing

In the bellow commit, we added support for PPS policing without
removing the check which block offload of such cases.
Fix it by removing this check.

Fixes: a8d52b024d6d ("net/mlx5e: TC, Support offloading police action")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Fix tc acts array not to be dependent on enum order

The tc acts array should not be dependent on kernel internal
flow action id enum. Fix the array initialization.

Fixes: fad547906980 ("net/mlx5e: Add tc action infrastructure")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Fix usage of DMA sync API

DMA sync functions should use the same direction that was used by DMA
mapping. Use DMA_BIDIRECTIONAL for XDP_TX from regular RQ, which reuses
the same mapping that was used for RX, and DMA_TO_DEVICE for XDP_TX from
XSK RQ and XDP_REDIRECT, which establish a new mapping in this
direction. On the RX side, use the same direction that was used when
setting up the mapping (DMA_BIDIRECTIONAL for XDP, DMA_FROM_DEVICE
otherwise).

Also don't skip sync for device when establishing a DMA_FROM_DEVICE
mapping for RX, as some architectures (ARM) may require invalidating
caches before the device can use the mapping. It doesn't break the
bugfix made in
commit 0b7cfa4082fb ("net/mlx5e: Fix page DMA map/unmap attributes"),
since the bug happened on unmap.

Fixes: 0b7cfa4082fb ("net/mlx5e: Fix page DMA map/unmap attributes")
Fixes: b5503b994ed5 ("net/mlx5e: XDP TX forwarding support")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Add missing sanity checks for max TX WQE size

The commit cited below started using the firmware capability for the
maximum TX WQE size. This commit adds an important check to verify that
the driver doesn't attempt to exceed this capability, and also restores
another check mistakenly removed in the cited commit (a WQE must not
exceed the page size).

Fixes: c27bd1718c06 ("net/mlx5e: Read max WQEBBs on the SQ from firmware")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: fw_reset: Don't try to load device in case PCI isn't working

In case PCI reads fail after unload, there is no use in trying to
load the device.

Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: E-switch, Set to legacy mode if failed to change switchdev mode

No need to rollback to the other mode because probably will fail
again. Just set to legacy mode and clear fdb table created flag.
So that fdb table will not be cleared again.

Fixes: f019679ea5f2 ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Allow async trigger completion execution on single CPU systems

For a single CPU system, the kernel thread executing mlx5_cmd_flush()
never releases the CPU but calls down_trylock(&cmd→sem) in a busy loop.
On a single processor system, this leads to a deadlock as the kernel
thread which executes mlx5_cmd_invoke() never gets scheduled. Fix this,
by adding the cond_resched() call to the loop, allow the command
completion kernel thread to execute.

Fixes: 8e715cd613a1 ("net/mlx5: Set command entry semaphore up once got index free")
Signed-off-by: Alexander Schmidt <alexschm@de.ibm.com>
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Bridge, verify LAG state when adding bond to bridge

Mlx5 LAG is initialized asynchronously on a workqueue which means that for
a brief moment after setting mlx5 UL representors as lower devices of a
bond netdevice the LAG itself is not fully initialized in the driver. When
adding such bond device to a bridge mlx5 bridge code will not consider it
as offload-capable, skip creating necessary bookkeeping and fail any
further bridge offload-related commands with it (setting VLANs, offloading
FDBs, etc.). In order to make the error explicit during bridge
initialization stage implement the code that detects such condition during
NETDEV_PRECHANGEUPPER event and returns an error.

Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

Merge git://git./linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter fixes for net:

1) Fix deadlock in nfnetlink due to missing mutex release in error path,
from Ziyang Xuan.

2) Clean up pending autoload module list from nf_tables_exit_net() path,
from Shigeru Yoshida.

3) Fixes for the netfilter's reverse path selftest, from Phil Sutter.

All of these bugs have been around for several releases.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'wwan-iosm-fixes'

M Chetan Kumar says:

====================
net: wwan: iosm: fixes

This patch series contains iosm fixes.

PATCH1: Fix memory leak in ipc_pcie_read_bios_cfg.

PATCH2: Fix driver not working with INTEL_IOMMU disabled config.

PATCH3: Fix invalid mux header type.

PATCH4: Fix kernel build robot reported errors.

Please refer to individual commit message for details.

--
v2:
* PATCH1: No Change
* PATCH2: Kconfig change
- Add dependency on PCI to resolve kernel build robot errors.
* PATCH3: No Change
* PATCH4: New (Fix kernel build robot errors)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: wwan: iosm: fix kernel test robot reported errors

Include linux/vmalloc.h in iosm_ipc_coredump.c &
iosm_ipc_devlink.c to resolve kernel test robot errors.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: wwan: iosm: fix invalid mux header type

Data stall seen during peak DL throughput test & packets are
dropped by mux layer due to invalid header type in datagram.

During initlization Mux aggregration protocol is set to default
UL/DL size and TD count of Mux lite protocol. This configuration
mismatch between device and driver is resulting in data stall/packet
drops.

Override the UL/DL size and TD count for Mux aggregation protocol.

Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support")
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: wwan: iosm: fix driver not working with INTEL_IOMMU disabled

With INTEL_IOMMU disable config or by forcing intel_iommu=off from
grub some of the features of IOSM driver like browsing, flashing &
coredump collection is not working.

When driver calls DMA API - dma_map_single() for tx transfers. It is
resulting in dma mapping error.

Set the device DMA addressing capabilities using dma_set_mask() and
remove the INTEL_IOMMU dependency in kconfig so that driver follows
the platform config either INTEL_IOMMU enable or disable.

Fixes: f7af616c632e ("net: iosm: infrastructure")
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: wwan: iosm: fix memory leak in ipc_pcie_read_bios_cfg

ipc_pcie_read_bios_cfg() is using the acpi_evaluate_dsm() to
obtain the wwan power state configuration from BIOS but is
not freeing the acpi_object. The acpi_evaluate_dsm() returned
acpi_object to be freed.

Free the acpi_object after use.

Fixes: 7e98d785ae61 ("net: iosm: entry point")
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: netfilter: Fix and review rpath.sh

Address a few problems with the initial test script version:

* On systems with ip6tables but no ip6tables-legacy, testing for
ip6tables was disabled by accident.
* Firewall setup phase did not respect possibly unavailable tools.
* Consistently call nft via '$nft'.

Fixes: 6e31ce831c63b ("selftests: netfilter: Test reverse path filtering")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ibmveth: Reduce default tx queues to 8

Previously, the default number of transmit queues was 16. Due to
resource concerns, set to 8 queues instead. Still allow the user
to set more queues (max 16) if they like.

Since the driver is virtualized away from the physical NIC, the purpose
of multiple queues is purely to allow for parallel calls to the
hypervisor. Therefore, there is no noticeable effect on performance by
reducing queue count to 8.

Fixes: d926793c1de9 ("ibmveth: Implement multi queue on xmit")
Reported-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Link: https://lore.kernel.org/r/20221107203215.58206-1-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: nixge: disable napi when enable interrupts failed in nixge_open()

When failed to enable interrupts in nixge_open() for opening device,
napi isn't disabled. When open nixge device next time, it will reports
a invalid opcode issue. Fix it. Only be compiled, not be tested.

Fixes: 492caffa8a1a ("net: ethernet: nixge: Add support for National Instruments XGE netdev")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221107101443.120205-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: tun: call napi_schedule_prep() to ensure we own a napi

A recent patch exposed another issue in napi_get_frags()
caught by syzbot [1]

Before feeding packets to GRO, and calling napi_complete()
we must first grab NAPI_STATE_SCHED.

[1]
WARNING: CPU: 0 PID: 3612 at net/core/dev.c:6076 napi_complete_done+0x45b/0x880 net/core/dev.c:6076
Modules linked in:
CPU: 0 PID: 3612 Comm: syz-executor408 Not tainted 6.1.0-rc3-syzkaller-00175-g1118b2049d77 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
RIP: 0010:napi_complete_done+0x45b/0x880 net/core/dev.c:6076
Code: c1 ea 03 0f b6 14 02 4c 89 f0 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 24 04 00 00 41 89 5d 1c e9 73 fc ff ff e8 b5 53 22 fa <0f> 0b e9 82 fe ff ff e8 a9 53 22 fa 48 8b 5c 24 08 31 ff 48 89 de
RSP: 0018:ffffc90003c4f920 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000030 RCX: 0000000000000000
RDX: ffff8880251c0000 RSI: ffffffff875a58db RDI: 0000000000000007
RBP: 0000000000000001 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff888072d02628
R13: ffff888072d02618 R14: ffff888072d02634 R15: 0000000000000000
FS: 0000555555f13300(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055c44d3892b8 CR3: 00000000172d2000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
napi_complete include/linux/netdevice.h:510 [inline]
tun_get_user+0x206d/0x3a60 drivers/net/tun.c:1980
tun_chr_write_iter+0xdb/0x200 drivers/net/tun.c:2027
call_write_iter include/linux/fs.h:2191 [inline]
do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735
do_iter_write+0x182/0x700 fs/read_write.c:861
vfs_writev+0x1aa/0x630 fs/read_write.c:934
do_writev+0x133/0x2f0 fs/read_write.c:977
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f37021a3c19

Fixes: 1118b2049d77 ("net: tun: Fix memory leaks of napi_get_frags")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wang Yufen <wangyufen@huawei.com>
Link: https://lore.kernel.org/r/20221107180011.188437-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: marvell: prestera: fix memory leak in prestera_rxtx_switch_init()

When prestera_sdma_switch_init() failed, the memory pointed to by
sw->rxtx isn't released. Fix it. Only be compiled, not be tested.

Fixes: 501ef3066c89 ("net: marvell: prestera: Add driver for Prestera family ASIC devices")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Vadym Kochan <vadym.kochan@plvision.eu>
Link: https://lore.kernel.org/r/20221108025607.338450-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'linux-can-fixes-for-6.1-20221107' of git://git./linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
can 2022-11-07

The first patch is by Chen Zhongjin and adds a missing
dev_remove_pack() to the AF_CAN protocol.

Zhengchao Shao's patch fixes a potential NULL pointer deref in
AF_CAN's can_rx_register().

The next patch is by Oliver Hartkopp and targets the CAN ISO-TP
protocol, and fixes the state handling for echo TX processing.

Oliver Hartkopp's patch for the j1939 protocol adds a missing
initialization of the CAN headers inside outgoing skbs.

Another patch by Oliver Hartkopp fixes an out of bounds read in the
check for invalid CAN frames in the xmit callback of virtual CAN
devices. This touches all non virtual device drivers as we decided to
rename the function requiring that netdev_priv points to a struct
can_priv.
(Note: This patch will create a merge conflict with net-next where the
pch_can driver has removed.)

The last patch is by Geert Uytterhoeven and adds the missing ECC error
checks for the channels 2-7 in the rcar_canfd driver.

* tag 'linux-can-fixes-for-6.1-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: rcar_canfd: Add missing ECC error checks for channels 2-7
  can: dev: fix skb drop check
  can: j1939: j1939_send_one(): fix missing CAN header initialization
  can: isotp: fix tx state handling for echo tx processing
  can: af_can: fix NULL pointer dereference in can_rx_register()
  can: af_can: can_exit(): add missing dev_remove_pack() of canxl_packet
====================

Link: https://lore.kernel.org/r/20221107133217.59861-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()

syzbot reported a warning like below [1]:

WARNING: CPU: 3 PID: 9 at net/netfilter/nf_tables_api.c:10096 nf_tables_exit_net+0x71c/0x840
Modules linked in:
CPU: 2 PID: 9 Comm: kworker/u8:0 Tainted: G        W          6.1.0-rc3-00072-g8e5423e991e8 #47
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
Workqueue: netns cleanup_net
RIP: 0010:nf_tables_exit_net+0x71c/0x840
...
Call Trace:
<TASK>
? __nft_release_table+0xfc0/0xfc0
ops_exit_list+0xb5/0x180
cleanup_net+0x506/0xb10
? unregister_pernet_device+0x80/0x80
process_one_work+0xa38/0x1730
? pwq_dec_nr_in_flight+0x2b0/0x2b0
? rwlock_bug.part.0+0x90/0x90
? _raw_spin_lock_irq+0x46/0x50
worker_thread+0x67e/0x10e0
? process_one_work+0x1730/0x1730
kthread+0x2e5/0x3a0
? kthread_complete_and_exit+0x40/0x40
ret_from_fork+0x1f/0x30
</TASK>

In nf_tables_exit_net(), there is a case where nft_net->commit_list is
empty but nft_net->module_list is not empty.  Such a case occurs with
the following scenario:

1. nfnetlink_rcv_batch() is called
2. nf_tables_newset() returns -EAGAIN and NFNL_BATCH_FAILURE bit is
   set to status
3. nf_tables_abort() is called with NFNL_ABORT_AUTOLOAD
   (nft_net->commit_list is released, but nft_net->module_list is not
   because of NFNL_ABORT_AUTOLOAD flag)
4. Jump to replay label
5. netlink_skb_clone() fails and returns from the function (this is
   caused by fault injection in the reproducer of syzbot)

This patch fixes this issue by calling __nf_tables_abort() when
nft_net->module_list is not empty in nf_tables_exit_net().

Fixes: eb014de4fd41 ("netfilter: nf_tables: autoload modules from the abort path")
Link: https://syzkaller.appspot.com/bug?id=802aba2422de4218ad0c01b46c9525cc9d4e4aa3
Reported-by: syzbot+178efee9e2d7f87f5103@syzkaller.appspotmail.com
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

netfilter: nfnetlink: fix potential dead lock in nfnetlink_rcv_msg()

When type is NFNL_CB_MUTEX and -EAGAIN error occur in nfnetlink_rcv_msg(),
it does not execute nfnl_unlock(). That would trigger potential dead lock.

Fixes: 50f2db9e368f ("netfilter: nfnetlink: consolidate callback types")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Florian Westphal <fw@strlen.de>