platform/kernel/linux-rpi.git
10 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Fri, 18 Aug 2023 19:44:22 +0000 (12:44 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/sfc/tc.c
  fa165e194997 ("sfc: don't unregister flow_indr if it was never registered")
  3bf969e88ada ("sfc: add MAE table machinery for conntrack table")
https://lore.kernel.org/all/20230818112159.7430e9b4@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge tag 'net-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Fri, 18 Aug 2023 04:52:23 +0000 (06:52 +0200)]
Merge tag 'net-6.5-rc7' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from ipsec and netfilter.

  No known outstanding regressions.

  Fixes to fixes:

   - virtio-net: set queues after driver_ok, avoid a potential race
     added by recent fix

   - Revert "vlan: Fix VLAN 0 memory leak", it may lead to a warning
     when VLAN 0 is registered explicitly

   - nf_tables:
      - fix false-positive lockdep splat in recent fixes
      - don't fail inserts if duplicate has expired (fix test failures)
      - fix races between garbage collection and netns dismantle

  Current release - new code bugs:

   - mlx5: Fix mlx5_cmd_update_root_ft() error flow

  Previous releases - regressions:

   - phy: fix IRQ-based wake-on-lan over hibernate / power off

  Previous releases - always broken:

   - sock: fix misuse of sk_under_memory_pressure() preventing system
     from exiting global TCP memory pressure if a single cgroup is under
     pressure

   - fix the RTO timer retransmitting skb every 1ms if linear option is
     enabled

   - af_key: fix sadb_x_filter validation, amment netlink policy

   - ipsec: fix slab-use-after-free in decode_session6()

   - macb: in ZynqMP resume always configure PS GTR for non-wakeup
     source

  Misc:

   - netfilter: set default timeout to 3 secs for sctp shutdown send and
     recv state (from 300ms), align with protocol timers"

* tag 'net-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
  ice: Block switchdev mode when ADQ is active and vice versa
  qede: fix firmware halt over suspend and resume
  net: do not allow gso_size to be set to GSO_BY_FRAGS
  sock: Fix misuse of sk_under_memory_pressure()
  sfc: don't fail probe if MAE/TC setup fails
  sfc: don't unregister flow_indr if it was never registered
  net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset
  net/mlx5: Fix mlx5_cmd_update_root_ft() error flow
  net/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT
  i40e: fix misleading debug logs
  iavf: fix FDIR rule fields masks validation
  ipv6: fix indentation of a config attribute
  mailmap: add entries for Simon Horman
  broadcom: b44: Use b44_writephy() return value
  net: openvswitch: reject negative ifindex
  team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves
  net: phy: broadcom: stub c45 read/write for 54810
  netfilter: nft_dynset: disallow object maps
  netfilter: nf_tables: GC transaction race with netns dismantle
  netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path
  ...

10 months agoMerge tag 'drm-fixes-2023-08-18-1' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 18 Aug 2023 04:37:34 +0000 (06:37 +0200)]
Merge tag 'drm-fixes-2023-08-18-1' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Regular enough week, mostly the usual amdgpu and i915 fixes.  Also
  qaic, nouveau, qxl and a revert for an EDID patch that had some side
  effects, along with a couple of panel fixes.

  edid:
   - revert mode parsing fix that had side effects.

  i915:
   - Fix the flow for ignoring GuC SLPC efficient frequency selection
   - Fix SDVO panel_type initialization
   - Fix display probe for IVB Q and IVB D GT2 server

  nouveau:
   - fix use-after-free in connector code

  qaic:
   - integer overflow check fix
   - fix slicing memory leak

  panel:
   - fix JDI LT070ME05000 probing
   - fix AUO G121EAN01 timings

  amdgpu:
   - SMU 13.x fixes
   - Fix mcbp parameter for gfx9
   - SMU 11.x fixes
   - Temporary fix for large numbers of XCP partitions
   - S0ix fixes
   - DCN 2.0 fix

  qxl:
   - fix use after free race in dumb object allocation"

* tag 'drm-fixes-2023-08-18-1' of git://anongit.freedesktop.org/drm/drm:
  drm/qxl: fix UAF on handle creation
  Revert "drm/edid: Fix csync detailed mode parsing"
  drm/nouveau/disp: fix use-after-free in error handling of nouveau_connector_create
  Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0""
  drm/amd: flush any delayed gfxoff on suspend entry
  drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
  drm/amdgpu: skip xcp drm device allocation when out of drm resource
  drm/amd/pm: Update pci link width for smu v13.0.6
  drm/amd/pm: Fix temperature unit of SMU v13.0.6
  drm/amdgpu/pm: fix throttle_status for other than MP1 11.0.7
  drm/amdgpu: disable mcbp if parameter zero is set
  drm/amd/pm: disallow the fan setting if there is no fan on smu 13.0.0
  accel/qaic: Clean up integer overflow checking in map_user_pages()
  accel/qaic: Fix slicing memory leak
  drm/i915: fix display probe for IVB Q and IVB D GT2 server
  drm/i915/sdvo: fix panel_type initialization
  drm/i915/guc/slpc: Restore efficient freq earlier
  drm/panel: simple: Fix AUO G121EAN01 panel timings according to the docs
  drm/panel: JDI LT070ME05000 simplify with dev_err_probe()

10 months agoMerge branch 'netconsole-enable-compile-time-configuration'
Jakub Kicinski [Fri, 18 Aug 2023 02:25:44 +0000 (19:25 -0700)]
Merge branch 'netconsole-enable-compile-time-configuration'

Breno Leitao says:

====================
netconsole: Enable compile time configuration

Enable netconsole features to be set at compilation time. Create two
Kconfig options that allow users to set extended logs and release
prepending features at compilation time.

The first patch de-duplicates the initialization code, and the second
patch adds the support in the de-duplicated code, avoiding touching two
different functions with the same change.
====================

Link: https://lore.kernel.org/r/20230811093158.1678322-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonetconsole: Enable compile time configuration
Breno Leitao [Fri, 11 Aug 2023 09:31:58 +0000 (02:31 -0700)]
netconsole: Enable compile time configuration

Enable netconsole features to be set at compilation time. Create two
Kconfig options that allow users to set extended logs and release
prepending features at compilation time.

Right now, the user needs to pass command line parameters to netconsole,
such as "+"/"r" to enable extended logs and version prepending features.

With these two options, the user could set the default values for the
features at compile time, and don't need to pass it in the command line
to get them enabled, simplifying the command line.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230811093158.1678322-3-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonetconsole: Create a allocation helper
Breno Leitao [Fri, 11 Aug 2023 09:31:57 +0000 (02:31 -0700)]
netconsole: Create a allocation helper

De-duplicate the initialization and allocation code for struct
netconsole_target.

The same allocation and initialization code is duplicated in two
different places in the netconsole subsystem, when the netconsole target
is initialized by command line parameters (alloc_param_target()), and
dynamically by sysfs (make_netconsole_target()).

Create a helper function, and call it from the two different functions.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230811093158.1678322-2-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: mdio: fix -Wvoid-pointer-to-enum-cast warning
Justin Stitt [Tue, 15 Aug 2023 20:35:59 +0000 (20:35 +0000)]
net: mdio: fix -Wvoid-pointer-to-enum-cast warning

When building with clang 18 I see the following warning:
|       drivers/net/mdio/mdio-xgene.c:338:13: warning: cast to smaller integer
|               type 'enum xgene_mdio_id' from 'const void *' [-Wvoid-pointer-to-enum-cast]
|         338 |                 mdio_id = (enum xgene_mdio_id)of_id->data;

This is due to the fact that `of_id->data` is a void* while `enum
xgene_mdio_id` has the size of an int. This leads to truncation and
possible data loss.

Link: https://github.com/ClangBuiltLinux/linux/issues/1910
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20230815-void-drivers-net-mdio-mdio-xgene-v1-1-5304342e0659@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge branch 'netem-use-a-seeded-prng-for-loss-and-corruption-events'
Jakub Kicinski [Fri, 18 Aug 2023 02:15:07 +0000 (19:15 -0700)]
Merge branch 'netem-use-a-seeded-prng-for-loss-and-corruption-events'

François Michel says:

====================
netem: use a seeded PRNG for loss and corruption events

In order to reproduce bugs or performance evaluation of
network protocols and applications, it is useful to have
reproducible test suites and tools. This patch adds
a way to specify a PRNG seed through the
TCA_NETEM_PRNG_SEED attribute for generating netem
loss and corruption events. Initializing the qdisc
with the same seed leads to the exact same loss
and corruption patterns. If no seed is explicitly
specified, the qdisc generates a random seed using
get_random_u64().

This patch can be and has been tested using tc from
the following iproute2-next fork:
https://github.com/francoismichel/iproute2-next

For instance, setting the seed 42424242 on the loopback
with a loss rate of 10% will systematically drop the 5th,
12th and 24th packet when sending 25 packets.
====================

Link: https://lore.kernel.org/r/20230815092348.1449179-1-francois.michel@uclouvain.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonetem: use seeded PRNG for correlated loss events
François Michel [Tue, 15 Aug 2023 09:23:40 +0000 (11:23 +0200)]
netem: use seeded PRNG for correlated loss events

Use prandom_u32_state() instead of get_random_u32() to generate
the correlated loss events of netem.

Signed-off-by: François Michel <francois.michel@uclouvain.be>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/r/20230815092348.1449179-4-francois.michel@uclouvain.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonetem: use a seeded PRNG for generating random losses
François Michel [Tue, 15 Aug 2023 09:23:39 +0000 (11:23 +0200)]
netem: use a seeded PRNG for generating random losses

Use prandom_u32_state() instead of get_random_u32() to generate
the random loss events of netem. The state of the prng is part
of the prng attribute of struct netem_sched_data.

Signed-off-by: François Michel <francois.michel@uclouvain.be>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/r/20230815092348.1449179-3-francois.michel@uclouvain.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonetem: add prng attribute to netem_sched_data
François Michel [Tue, 15 Aug 2023 09:23:38 +0000 (11:23 +0200)]
netem: add prng attribute to netem_sched_data

Add prng attribute to struct netem_sched_data and
allows setting the seed of the PRNG through netlink
using the new TCA_NETEM_PRNG_SEED attribute.
The PRNG attribute is not actually used yet.

Signed-off-by: François Michel <francois.michel@uclouvain.be>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/r/20230815092348.1449179-2-francois.michel@uclouvain.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: ena: Use pci_dev_id() to simplify the code
Jialin Zhang [Tue, 15 Aug 2023 02:42:48 +0000 (10:42 +0800)]
net: ena: Use pci_dev_id() to simplify the code

PCI core API pci_dev_id() can be used to get the BDF number for a pci
device. We don't need to compose it manually. Use pci_dev_id() to
simplify the code a little bit.

Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Shay Agroskin <shayagr@amazon.com>
Link: https://lore.kernel.org/r/20230815024248.3519068-1-zhangjialin11@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agotun: add __exit annotations to module exit func tun_cleanup()
Ziyang Xuan [Mon, 14 Aug 2023 08:30:00 +0000 (16:30 +0800)]
tun: add __exit annotations to module exit func tun_cleanup()

Add missing __exit annotations to module exit func tun_cleanup().

Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230814083000.3893589-1-william.xuanziyang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Jakub Kicinski [Thu, 17 Aug 2023 21:35:34 +0000 (14:35 -0700)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-08-16 (iavf, i40e)

This series contains updates to iavf and i40e drivers.

Piotr adds checks for unsupported Flow Director rules on iavf.

Andrii replaces incorrect 'write' messaging on read operations for i40e.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  i40e: fix misleading debug logs
  iavf: fix FDIR rule fields masks validation
====================

Link: https://lore.kernel.org/r/20230816193308.1307535-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agodrm/qxl: fix UAF on handle creation
Wander Lairson Costa [Mon, 14 Aug 2023 16:51:19 +0000 (13:51 -0300)]
drm/qxl: fix UAF on handle creation

qxl_mode_dumb_create() dereferences the qobj returned by
qxl_gem_object_create_with_handle(), but the handle is the only one
holding a reference to it.

A potential attacker could guess the returned handle value and closes it
between the return of qxl_gem_object_create_with_handle() and the qobj
usage, triggering a use-after-free scenario.

Reproducer:

int dri_fd =-1;
struct drm_mode_create_dumb arg = {0};

void gem_close(int handle);

void* trigger(void* ptr)
{
int ret;
arg.width = arg.height = 0x20;
arg.bpp = 32;
ret = ioctl(dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, &arg);
if(ret)
{
perror("[*] DRM_IOCTL_MODE_CREATE_DUMB Failed");
exit(-1);
}
gem_close(arg.handle);
while(1) {
struct drm_mode_create_dumb args = {0};
args.width = args.height = 0x20;
args.bpp = 32;
ret = ioctl(dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, &args);
if (ret) {
perror("[*] DRM_IOCTL_MODE_CREATE_DUMB Failed");
exit(-1);
}

printf("[*] DRM_IOCTL_MODE_CREATE_DUMB created, %d\n", args.handle);
gem_close(args.handle);
}
return NULL;
}

void gem_close(int handle)
{
struct drm_gem_close args;
args.handle = handle;
int ret = ioctl(dri_fd, DRM_IOCTL_GEM_CLOSE, &args); // gem close handle
if (!ret)
printf("gem close handle %d\n", args.handle);
}

int main(void)
{
dri_fd= open("/dev/dri/card0", O_RDWR);
printf("fd:%d\n", dri_fd);

if(dri_fd == -1)
return -1;

pthread_t tid1;

if(pthread_create(&tid1,NULL,trigger,NULL)){
perror("[*] thread_create tid1\n");
return -1;
}
while (1)
{
gem_close(arg.handle);
}
return 0;
}

This is a KASAN report:

==================================================================
BUG: KASAN: slab-use-after-free in qxl_mode_dumb_create+0x3c2/0x400 linux/drivers/gpu/drm/qxl/qxl_dumb.c:69
Write of size 1 at addr ffff88801136c240 by task poc/515

CPU: 1 PID: 515 Comm: poc Not tainted 6.3.0 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
Call Trace:
<TASK>
__dump_stack linux/lib/dump_stack.c:88
dump_stack_lvl+0x48/0x70 linux/lib/dump_stack.c:106
print_address_description linux/mm/kasan/report.c:319
print_report+0xd2/0x660 linux/mm/kasan/report.c:430
kasan_report+0xd2/0x110 linux/mm/kasan/report.c:536
__asan_report_store1_noabort+0x17/0x30 linux/mm/kasan/report_generic.c:383
qxl_mode_dumb_create+0x3c2/0x400 linux/drivers/gpu/drm/qxl/qxl_dumb.c:69
drm_mode_create_dumb linux/drivers/gpu/drm/drm_dumb_buffers.c:96
drm_mode_create_dumb_ioctl+0x1f5/0x2d0 linux/drivers/gpu/drm/drm_dumb_buffers.c:102
drm_ioctl_kernel+0x21d/0x430 linux/drivers/gpu/drm/drm_ioctl.c:788
drm_ioctl+0x56f/0xcc0 linux/drivers/gpu/drm/drm_ioctl.c:891
vfs_ioctl linux/fs/ioctl.c:51
__do_sys_ioctl linux/fs/ioctl.c:870
__se_sys_ioctl linux/fs/ioctl.c:856
__x64_sys_ioctl+0x13d/0x1c0 linux/fs/ioctl.c:856
do_syscall_x64 linux/arch/x86/entry/common.c:50
do_syscall_64+0x5b/0x90 linux/arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc linux/arch/x86/entry/entry_64.S:120
RIP: 0033:0x7ff5004ff5f7
Code: 00 00 00 48 8b 05 99 c8 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 69 c8 0d 00 f7 d8 64 89 01 48

RSP: 002b:00007ff500408ea8 EFLAGS: 00000286 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff5004ff5f7
RDX: 00007ff500408ec0 RSI: 00000000c02064b2 RDI: 0000000000000003
RBP: 00007ff500408ef0 R08: 0000000000000000 R09: 000000000000002a
R10: 0000000000000000 R11: 0000000000000286 R12: 00007fff1c6cdafe
R13: 00007fff1c6cdaff R14: 00007ff500408fc0 R15: 0000000000802000
</TASK>

Allocated by task 515:
kasan_save_stack+0x38/0x70 linux/mm/kasan/common.c:45
kasan_set_track+0x25/0x40 linux/mm/kasan/common.c:52
kasan_save_alloc_info+0x1e/0x40 linux/mm/kasan/generic.c:510
____kasan_kmalloc linux/mm/kasan/common.c:374
__kasan_kmalloc+0xc3/0xd0 linux/mm/kasan/common.c:383
kasan_kmalloc linux/./include/linux/kasan.h:196
kmalloc_trace+0x48/0xc0 linux/mm/slab_common.c:1066
kmalloc linux/./include/linux/slab.h:580
kzalloc linux/./include/linux/slab.h:720
qxl_bo_create+0x11a/0x610 linux/drivers/gpu/drm/qxl/qxl_object.c:124
qxl_gem_object_create+0xd9/0x360 linux/drivers/gpu/drm/qxl/qxl_gem.c:58
qxl_gem_object_create_with_handle+0xa1/0x180 linux/drivers/gpu/drm/qxl/qxl_gem.c:89
qxl_mode_dumb_create+0x1cd/0x400 linux/drivers/gpu/drm/qxl/qxl_dumb.c:63
drm_mode_create_dumb linux/drivers/gpu/drm/drm_dumb_buffers.c:96
drm_mode_create_dumb_ioctl+0x1f5/0x2d0 linux/drivers/gpu/drm/drm_dumb_buffers.c:102
drm_ioctl_kernel+0x21d/0x430 linux/drivers/gpu/drm/drm_ioctl.c:788
drm_ioctl+0x56f/0xcc0 linux/drivers/gpu/drm/drm_ioctl.c:891
vfs_ioctl linux/fs/ioctl.c:51
__do_sys_ioctl linux/fs/ioctl.c:870
__se_sys_ioctl linux/fs/ioctl.c:856
__x64_sys_ioctl+0x13d/0x1c0 linux/fs/ioctl.c:856
do_syscall_x64 linux/arch/x86/entry/common.c:50
do_syscall_64+0x5b/0x90 linux/arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc linux/arch/x86/entry/entry_64.S:120

Freed by task 515:
kasan_save_stack+0x38/0x70 linux/mm/kasan/common.c:45
kasan_set_track+0x25/0x40 linux/mm/kasan/common.c:52
kasan_save_free_info+0x2e/0x60 linux/mm/kasan/generic.c:521
____kasan_slab_free linux/mm/kasan/common.c:236
____kasan_slab_free+0x180/0x1f0 linux/mm/kasan/common.c:200
__kasan_slab_free+0x12/0x30 linux/mm/kasan/common.c:244
kasan_slab_free linux/./include/linux/kasan.h:162
slab_free_hook linux/mm/slub.c:1781
slab_free_freelist_hook+0xd2/0x1a0 linux/mm/slub.c:1807
slab_free linux/mm/slub.c:3787
__kmem_cache_free+0x196/0x2d0 linux/mm/slub.c:3800
kfree+0x78/0x120 linux/mm/slab_common.c:1019
qxl_ttm_bo_destroy+0x140/0x1a0 linux/drivers/gpu/drm/qxl/qxl_object.c:49
ttm_bo_release+0x678/0xa30 linux/drivers/gpu/drm/ttm/ttm_bo.c:381
kref_put linux/./include/linux/kref.h:65
ttm_bo_put+0x50/0x80 linux/drivers/gpu/drm/ttm/ttm_bo.c:393
qxl_gem_object_free+0x3e/0x60 linux/drivers/gpu/drm/qxl/qxl_gem.c:42
drm_gem_object_free+0x5c/0x90 linux/drivers/gpu/drm/drm_gem.c:974
kref_put linux/./include/linux/kref.h:65
__drm_gem_object_put linux/./include/drm/drm_gem.h:431
drm_gem_object_put linux/./include/drm/drm_gem.h:444
qxl_gem_object_create_with_handle+0x151/0x180 linux/drivers/gpu/drm/qxl/qxl_gem.c:100
qxl_mode_dumb_create+0x1cd/0x400 linux/drivers/gpu/drm/qxl/qxl_dumb.c:63
drm_mode_create_dumb linux/drivers/gpu/drm/drm_dumb_buffers.c:96
drm_mode_create_dumb_ioctl+0x1f5/0x2d0 linux/drivers/gpu/drm/drm_dumb_buffers.c:102
drm_ioctl_kernel+0x21d/0x430 linux/drivers/gpu/drm/drm_ioctl.c:788
drm_ioctl+0x56f/0xcc0 linux/drivers/gpu/drm/drm_ioctl.c:891
vfs_ioctl linux/fs/ioctl.c:51
__do_sys_ioctl linux/fs/ioctl.c:870
__se_sys_ioctl linux/fs/ioctl.c:856
__x64_sys_ioctl+0x13d/0x1c0 linux/fs/ioctl.c:856
do_syscall_x64 linux/arch/x86/entry/common.c:50
do_syscall_64+0x5b/0x90 linux/arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc linux/arch/x86/entry/entry_64.S:120

The buggy address belongs to the object at ffff88801136c000
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 576 bytes inside of
freed 1024-byte region [ffff88801136c000ffff88801136c400)

The buggy address belongs to the physical page:
page:0000000089fc329b refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11368
head:0000000089fc329b order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 000fffffc0010200 ffff888007841dc0 dead000000000122 0000000000000000
raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff88801136c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88801136c180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88801136c200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88801136c280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88801136c300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Disabling lock debugging due to kernel taint

Instead of returning a weak reference to the qxl_bo object, return the
created drm_gem_object and let the caller decrement the reference count
when it no longer needs it. As a convenience, if the caller is not
interested in the gobj object, it can pass NULL to the parameter and the
reference counting is descremented internally.

The bug and the reproducer were originally found by the Zero Day Initiative project (ZDI-CAN-20940).

Link: https://www.zerodayinitiative.com/
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230814165119.90847-1-wander@redhat.com
10 months agoMerge tag 'amd-drm-fixes-6.5-2023-08-16' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Thu, 17 Aug 2023 20:09:21 +0000 (06:09 +1000)]
Merge tag 'amd-drm-fixes-6.5-2023-08-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.5-2023-08-16:

amdgpu:
- SMU 13.x fixes
- Fix mcbp parameter for gfx9
- SMU 11.x fixes
- Temporary fix for large numbers of XCP partitions
- S0ix fixes
- DCN 2.0 fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230816200226.10771-1-alexander.deucher@amd.com
10 months agoMerge tag 'drm-misc-fixes-2023-08-17' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Thu, 17 Aug 2023 20:08:54 +0000 (06:08 +1000)]
Merge tag 'drm-misc-fixes-2023-08-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

One EPROBE_DEFER handling fix for the JDI LT070ME05000, a timing fix for
the AUO G121EAN01 panel, an integer overflow and a memory leak fixes for
the qaic accel, a use-after-free fix for nouveau and a revert for an
alleged fix in EDID parsing.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/3olqt33em5uhxzjbqghwcwnvmw73h7bxkbdxookmnkecymd4vc@7ogm6gewpprq
10 months agoMerge tag 'drm-intel-fixes-2023-08-17' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Thu, 17 Aug 2023 20:07:59 +0000 (06:07 +1000)]
Merge tag 'drm-intel-fixes-2023-08-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix the flow for ignoring GuC SLPC efficient frequency selection (Vinay)
- Fix SDVO panel_type initialization (Jani)
- Fix display probe for IVB Q and IVB D GT2 server (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZN4yduyBU1Ev9dc7@intel.com
10 months agoMerge tag 'mlx5-fixes-2023-08-16' of git://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Thu, 17 Aug 2023 18:56:44 +0000 (11:56 -0700)]
Merge tag 'mlx5-fixes-2023-08-16' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2023-08-16

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2023-08-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Fix mlx5_cmd_update_root_ft() error flow
  net/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT
====================

Link: https://lore.kernel.org/r/20230816204108.53819-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoice: Block switchdev mode when ADQ is active and vice versa
Marcin Szycik [Wed, 16 Aug 2023 19:34:05 +0000 (12:34 -0700)]
ice: Block switchdev mode when ADQ is active and vice versa

ADQ and switchdev are not supported simultaneously. Enabling both at the
same time can result in nullptr dereference.

To prevent this, check if ADQ is active when changing devlink mode to
switchdev mode, and check if switchdev is active when enabling ADQ.

Fixes: fbc7b27af0f9 ("ice: enable ndo_setup_tc support for mqprio_qdisc")
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230816193405.1307580-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoqede: fix firmware halt over suspend and resume
Manish Chopra [Wed, 16 Aug 2023 15:07:11 +0000 (20:37 +0530)]
qede: fix firmware halt over suspend and resume

While performing certain power-off sequences, PCI drivers are
called to suspend and resume their underlying devices through
PCI PM (power management) interface. However this NIC hardware
does not support PCI PM suspend/resume operations so system wide
suspend/resume leads to bad MFW (management firmware) state which
causes various follow-up errors in driver when communicating with
the device/firmware afterwards.

To fix this driver implements PCI PM suspend handler to indicate
unsupported operation to the PCI subsystem explicitly, thus avoiding
system to go into suspended/standby mode.

Without this fix device/firmware does not recover unless system
is power cycled.

Fixes: 2950219d87b0 ("qede: Add basic network device support")
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230816150711.59035-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: do not allow gso_size to be set to GSO_BY_FRAGS
Eric Dumazet [Wed, 16 Aug 2023 14:21:58 +0000 (14:21 +0000)]
net: do not allow gso_size to be set to GSO_BY_FRAGS

One missing check in virtio_net_hdr_to_skb() allowed
syzbot to crash kernels again [1]

Do not allow gso_size to be set to GSO_BY_FRAGS (0xffff),
because this magic value is used by the kernel.

[1]
general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
CPU: 0 PID: 5039 Comm: syz-executor401 Not tainted 6.5.0-rc5-next-20230809-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
RIP: 0010:skb_segment+0x1a52/0x3ef0 net/core/skbuff.c:4500
Code: 00 00 00 e9 ab eb ff ff e8 6b 96 5d f9 48 8b 84 24 00 01 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e ea 21 00 00 48 8b 84 24 00 01
RSP: 0018:ffffc90003d3f1c8 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 000000000001fffe RCX: 0000000000000000
RDX: 000000000000000e RSI: ffffffff882a3115 RDI: 0000000000000070
RBP: ffffc90003d3f378 R08: 0000000000000005 R09: 000000000000ffff
R10: 000000000000ffff R11: 5ee4a93e456187d6 R12: 000000000001ffc6
R13: dffffc0000000000 R14: 0000000000000008 R15: 000000000000ffff
FS: 00005555563f2380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020020000 CR3: 000000001626d000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
udp6_ufo_fragment+0x9d2/0xd50 net/ipv6/udp_offload.c:109
ipv6_gso_segment+0x5c4/0x17b0 net/ipv6/ip6_offload.c:120
skb_mac_gso_segment+0x292/0x610 net/core/gso.c:53
__skb_gso_segment+0x339/0x710 net/core/gso.c:124
skb_gso_segment include/net/gso.h:83 [inline]
validate_xmit_skb+0x3a5/0xf10 net/core/dev.c:3625
__dev_queue_xmit+0x8f0/0x3d60 net/core/dev.c:4329
dev_queue_xmit include/linux/netdevice.h:3082 [inline]
packet_xmit+0x257/0x380 net/packet/af_packet.c:276
packet_snd net/packet/af_packet.c:3087 [inline]
packet_sendmsg+0x24c7/0x5570 net/packet/af_packet.c:3119
sock_sendmsg_nosec net/socket.c:727 [inline]
sock_sendmsg+0xd9/0x180 net/socket.c:750
____sys_sendmsg+0x6ac/0x940 net/socket.c:2496
___sys_sendmsg+0x135/0x1d0 net/socket.c:2550
__sys_sendmsg+0x117/0x1e0 net/socket.c:2579
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7ff27cdb34d9

Fixes: 3953c46c3ac7 ("sk_buff: allow segmenting based on frag sizes")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230816142158.1779798-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agosock: Fix misuse of sk_under_memory_pressure()
Abel Wu [Wed, 16 Aug 2023 09:12:22 +0000 (17:12 +0800)]
sock: Fix misuse of sk_under_memory_pressure()

The status of global socket memory pressure is updated when:

  a) __sk_mem_raise_allocated():

enter: sk_memory_allocated(sk) >  sysctl_mem[1]
leave: sk_memory_allocated(sk) <= sysctl_mem[0]

  b) __sk_mem_reduce_allocated():

leave: sk_under_memory_pressure(sk) &&
sk_memory_allocated(sk) < sysctl_mem[0]

So the conditions of leaving global pressure are inconstant, which
may lead to the situation that one pressured net-memcg prevents the
global pressure from being cleared when there is indeed no global
pressure, thus the global constrains are still in effect unexpectedly
on the other sockets.

This patch fixes this by ignoring the net-memcg's pressure when
deciding whether should leave global memory pressure.

Fixes: e1aab161e013 ("socket: initial cgroup code.")
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Link: https://lore.kernel.org/r/20230816091226.1542-1-wuyun.abel@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agosfc: don't fail probe if MAE/TC setup fails
Edward Cree [Tue, 15 Aug 2023 15:57:28 +0000 (16:57 +0100)]
sfc: don't fail probe if MAE/TC setup fails

Existing comment in the source explains why we don't want efx_init_tc()
 failure to be fatal.  Cited commit erroneously consolidated failure
 paths causing the probe to be failed in this case.

Fixes: 7e056e2360d9 ("sfc: obtain device mac address based on firmware handle for ef100")
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/aa7f589dd6028bd1ad49f0a85f37ab33c09b2b45.1692114888.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agosfc: don't unregister flow_indr if it was never registered
Edward Cree [Tue, 15 Aug 2023 15:57:27 +0000 (16:57 +0100)]
sfc: don't unregister flow_indr if it was never registered

In efx_init_tc(), move the setting of efx->tc->up after the
 flow_indr_dev_register() call, so that if it fails, efx_fini_tc()
 won't call flow_indr_dev_unregister().

Fixes: 5b2e12d51bd8 ("sfc: bind indirect blocks for TC offload on EF100")
Suggested-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/a81284d7013aba74005277bd81104e4cfbea3f6f.1692114888.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge tag 'nfsd-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Linus Torvalds [Thu, 17 Aug 2023 14:38:48 +0000 (16:38 +0200)]
Merge tag 'nfsd-6.5-4' of git://git./linux/kernel/git/cel/linux

Pull nfsd fix from Chuck Lever:

 - Fix new MSG_SPLICE_PAGES support in server's TCP sendmsg helper

* tag 'nfsd-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  sunrpc: set the bv_offset of first bvec in svc_tcp_sendmsg

10 months agoRevert "drm/edid: Fix csync detailed mode parsing"
Jani Nikula [Tue, 15 Aug 2023 10:19:07 +0000 (13:19 +0300)]
Revert "drm/edid: Fix csync detailed mode parsing"

This reverts commit ca62297b2085b5b3168bd891ca24862242c635a1.

Commit ca62297b2085 ("drm/edid: Fix csync detailed mode parsing") fixed
EDID detailed mode sync parsing. Unfortunately, there are quite a few
displays out there that have bogus (zero) sync field that are broken by
the change. Zero means analog composite sync, which is not right for
digital displays, and the modes get rejected. Regardless, it used to
work, and it needs to continue to work. Revert the change.

Rejecting modes with analog composite sync was the part that fixed the
gitlab issue 8146 [1]. We'll need to get back to the drawing board with
that.

[1] https://gitlab.freedesktop.org/drm/intel/-/issues/8146

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8789
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8930
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9044
Fixes: ca62297b2085 ("drm/edid: Fix csync detailed mode parsing")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.4+
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230815101907.2900768-1-jani.nikula@intel.com
10 months agonet: dsa: mv88e6xxx: Wait for EEPROM done before HW reset
Alfred Lee [Tue, 15 Aug 2023 00:13:23 +0000 (17:13 -0700)]
net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset

If the switch is reset during active EEPROM transactions, as in
just after an SoC reset after power up, the I2C bus transaction
may be cut short leaving the EEPROM internal I2C state machine
in the wrong state.  When the switch is reset again, the bad
state machine state may result in data being read from the wrong
memory location causing the switch to enter unexpected mode
rendering it inoperational.

Fixes: a3dcb3e7e70c ("net: dsa: mv88e6xxx: Wait for EEPROM done after HW reset")
Signed-off-by: Alfred Lee <l00g33k@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230815001323.24739-1-l00g33k@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf...
Jakub Kicinski [Thu, 17 Aug 2023 03:09:43 +0000 (20:09 -0700)]
Merge tag 'for-netdev' of https://git./linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-08-16

We've added 17 non-merge commits during the last 6 day(s) which contain
a total of 20 files changed, 1179 insertions(+), 37 deletions(-).

The main changes are:

1) Add a BPF hook in sys_socket() to change the protocol ID
   from IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy
   applications, from Geliang Tang.

2) Follow-up/fallout fix from the SO_REUSEPORT + bpf_sk_assign work
   to fix a splat on non-fullsock sks in inet[6]_steal_sock,
   from Lorenz Bauer.

3) Improvements to struct_ops links to avoid forcing presence of
   update/validate callbacks. Also add bpf_struct_ops fields documentation,
   from David Vernet.

4) Ensure libbpf sets close-on-exec flag on gzopen, from Marco Vedovati.

5) Several new tcx selftest additions and bpftool link show support for
   tcx and xdp links, from Daniel Borkmann.

6) Fix a smatch warning on uninitialized symbol in
   bpf_perf_link_fill_kprobe, from Yafang Shao.

7) BPF selftest fixes e.g. misplaced break in kfunc_call test,
   from Yipeng Zou.

8) Small cleanup to remove unused declaration bpf_link_new_file,
   from Yue Haibing.

9) Small typo fix to bpftool's perf help message, from Daniel T. Lee.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Add mptcpify test
  selftests/bpf: Fix error checks of mptcp open_and_load
  selftests/bpf: Add two mptcp netns helpers
  bpf: Add update_socket_protocol hook
  bpftool: Implement link show support for xdp
  bpftool: Implement link show support for tcx
  selftests/bpf: Add selftest for fill_link_info
  bpf: Fix uninitialized symbol in bpf_perf_link_fill_kprobe()
  net: Fix slab-out-of-bounds in inet[6]_steal_sock
  bpf: Document struct bpf_struct_ops fields
  bpf: Support default .validate() and .update() behavior for struct_ops links
  selftests/bpf: Add various more tcx test cases
  selftests/bpf: Clean up fmod_ret in bench_rename test script
  selftests/bpf: Fix repeat option when kfunc_call verification fails
  libbpf: Set close-on-exec flag on gzopen
  bpftool: fix perf help message
  bpf: Remove unused declaration bpf_link_new_file()
====================

Link: https://lore.kernel.org/r/20230816212840.1539-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoRevert "net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode"
Jakub Kicinski [Thu, 17 Aug 2023 02:33:44 +0000 (19:33 -0700)]
Revert "net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode"

This reverts commit 90bc21aaef4adaefceda2d385756138fc247c0c2.

Patch was merged too hastily, Vladimir requested changes in:
https://lore.kernel.org/all/20230816121305.5dio5tk3chge2ndh@skbuf/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agodrm/nouveau/disp: fix use-after-free in error handling of nouveau_connector_create
Karol Herbst [Mon, 14 Aug 2023 14:49:32 +0000 (16:49 +0200)]
drm/nouveau/disp: fix use-after-free in error handling of nouveau_connector_create

We can't simply free the connector after calling drm_connector_init on it.
We need to clean up the drm side first.

It might not fix all regressions from commit 2b5d1c29f6c4
("drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts"),
but at least it fixes a memory corruption in error handling related to
that commit.

Link: https://lore.kernel.org/lkml/20230806213107.GFZNARG6moWpFuSJ9W@fat_crate.local/
Fixes: 95983aea8003 ("drm/nouveau/disp: add connector class")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230814144933.3956959-1-kherbst@redhat.com
10 months agonet/mlx5: Fix mlx5_cmd_update_root_ft() error flow
Shay Drory [Sun, 25 Jun 2023 07:43:03 +0000 (10:43 +0300)]
net/mlx5: Fix mlx5_cmd_update_root_ft() error flow

The cited patch change mlx5_cmd_update_root_ft() to work with multiple
peer devices. However, it didn't align the error flow as well.
Hence, Fix the error code to work with multiple peer devices.

Fixes: 222dd185833e ("{net/RDMA}/mlx5: introduce lag_for_each_peer")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
10 months agonet/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT
Dragos Tatulea [Tue, 1 Aug 2023 17:41:03 +0000 (20:41 +0300)]
net/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT

Before this fix, running high rate traffic through XDP_REDIRECT
with multibuf could overrun the fifo used to release the
xdp frames after tx completion. This resulted in corrupted data
being consumed on the free side.

The culplirt was a miscalculation of the fifo size: the maximum ratio
between fifo entries / data segments was incorrect. This ratio serves to
calculate the max fifo size for a full sq where each packet uses the
worst case number of entries in the fifo.

This patch fixes the formula and names the constant. It also makes sure
that future values will use a power of 2 number of entries for the fifo
mask to work.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Fixes: 3f734b8c594b ("net/mlx5e: XDP, Use multiple single-entry objects in xdpi_fifo")
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
10 months agoRevert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0""
Alex Deucher [Tue, 15 Aug 2023 21:25:37 +0000 (17:25 -0400)]
Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0""

This reverts commit 27dd79c00aeab36cd7542c7a4481a32549038659.

It appears MPC_SPLIT_DYNAMIC still causes problems with multiple
displays on DCN2.0 hardware.  Switch back to MPC_SPLIT_AVOID_MULT_DISP.
This increases power usage with multiple displays, but avoids hangs.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2475
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.4.x
10 months agodrm/amd: flush any delayed gfxoff on suspend entry
Mario Limonciello [Thu, 18 May 2023 16:52:51 +0000 (11:52 -0500)]
drm/amd: flush any delayed gfxoff on suspend entry

DCN 3.1.4 is reported to hang on s2idle entry if graphics activity
is happening during entry.  This is because GFXOFF was scheduled as
delayed but RLC gets disabled in s2idle entry sequence which will
hang GFX IP if not already in GFXOFF.

To help this problem, flush any delayed work for GFXOFF early in
s2idle entry sequence to ensure that it's off when RLC is changed.

commit 4b31b92b143f ("drm/amdgpu: complete gfxoff allow signal during
suspend without delay") modified power gating flow so that if called
in s0ix that it ensured that GFXOFF wasn't put in work queue but
instead processed immediately.

This is dead code due to commit 10cb67eb8a1b ("drm/amdgpu: skip
CG/PG for gfx during S0ix") because GFXOFF will now not be explicitly
called as part of the suspend entry code.  Remove that dead code.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
10 months agodrm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
Tim Huang [Mon, 14 Aug 2023 07:13:04 +0000 (15:13 +0800)]
drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix

GFX v11.0.1 reported fence fallback timer expired issue on
SDMA and GFX rings after S0ix resume. This is generated by
EOP interrupts are disabled when S0ix suspend but fails to
re-enable when resume because of the GFX is in GFXOFF.

[  203.349571] [drm] Fence fallback timer expired on ring sdma0
[  203.349572] [drm] Fence fallback timer expired on ring gfx_0.0.0
[  203.861635] [drm] Fence fallback timer expired on ring gfx_0.0.0

For S0ix, GFX is in GFXOFF state, avoid to touch the GFX registers
to configure the fence driver interrupts for rings that belong to GFX.
The interrupts configuration will be restored by GFXOFF exit.

Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
10 months agodrm/amdgpu: skip xcp drm device allocation when out of drm resource
James Zhu [Wed, 9 Aug 2023 20:45:04 +0000 (16:45 -0400)]
drm/amdgpu: skip xcp drm device allocation when out of drm resource

Return 0 when drm device alloc failed with -ENOSPC in
order to  allow amdgpu drive loading. But the xcp without
drm device node assigned won't be visiable in user space.
This helps amdgpu driver loading on system which has more
than 64 nodes, the current limitation.

The proposal to add more drm nodes is discussed in public,
which will support up to 2^20 nodes totally.
kernel drm:
https://lore.kernel.org/lkml/20230724211428.3831636-1-michal.winiarski@intel.com/T/
libdrm:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305

Signed-off-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
10 months agodrm/amd/pm: Update pci link width for smu v13.0.6
Asad Kamal [Thu, 10 Aug 2023 12:44:54 +0000 (20:44 +0800)]
drm/amd/pm: Update pci link width for smu v13.0.6

Update addresses of PCIE link width registers,
& link width format used to populate gpu metrics
table for smu v13.0.6

v2:
Removed ESM register update

v3:
Updated patch subject and message

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
10 months agodrm/amd/pm: Fix temperature unit of SMU v13.0.6
Lijo Lazar [Thu, 10 Aug 2023 10:40:03 +0000 (16:10 +0530)]
drm/amd/pm: Fix temperature unit of SMU v13.0.6

Temperature needs to be reported in millidegree Celsius.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
10 months agodrm/amdgpu/pm: fix throttle_status for other than MP1 11.0.7
Umio Yasuno [Tue, 8 Aug 2023 06:40:42 +0000 (06:40 +0000)]
drm/amdgpu/pm: fix throttle_status for other than MP1 11.0.7

Use the right metrics table version based on the firmware.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2720
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Umio Yasuno <coelacanth_dream@protonmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
10 months agodrm/amdgpu: disable mcbp if parameter zero is set
Jiadong Zhu [Tue, 8 Aug 2023 02:59:25 +0000 (10:59 +0800)]
drm/amdgpu: disable mcbp if parameter zero is set

The parameter amdgpu_mcbp shall have priority against the default value
calculated from the chip version.
User could disable mcbp by setting the parameter mcbp as zero.

v2: do not trigger preemption in sw ring muxer when mcbp is disabled.

Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
10 months agoMerge branch 'bpf: Force to MPTCP'
Martin KaFai Lau [Wed, 16 Aug 2023 17:22:17 +0000 (10:22 -0700)]
Merge branch 'bpf: Force to MPTCP'

Geliang Tang says:

====================
As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:

"Your app should create sockets with IPPROTO_MPTCP as the proto:
( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
forced to create and use MPTCP sockets instead of TCP ones via the
mptcpize command bundled with the mptcpd daemon."

But the mptcpize (LD_PRELOAD technique) command has some limitations
[2]:

 - it doesn't work if the application is not using libc (e.g. GoLang
apps)
 - in some envs, it might not be easy to set env vars / change the way
apps are launched, e.g. on Android
 - mptcpize needs to be launched with all apps that want MPTCP: we could
have more control from BPF to enable MPTCP only for some apps or all the
ones of a netns or a cgroup, etc.
 - it is not in BPF, we cannot talk about it at netdev conf.

So this patchset attempts to use BPF to implement functions similer to
mptcpize.

The main idea is to add a hook in sys_socket() to change the protocol id
from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.

[1]
https://github.com/multipath-tcp/mptcp_net-next/wiki
[2]
https://github.com/multipath-tcp/mptcp_net-next/issues/79

v14:
 - Use getsockopt(MPTCP_INFO) to verify mptcp protocol intead of using
nstat command.

v13:
 - drop "Use random netns name for mptcp" patch.

v12:
 - update diag_* log of update_socket_protocol.
 - add 'ip netns show' after 'ip netns del' to check if there is
a test did not clean up its netns.
 - return libbpf_get_error() instead of -EIO for the error from
open_and_load().
 - Use getsockopt(SOL_PROTOCOL) to verify mptcp protocol intead of
using 'ss -tOni'.

v11:
 - add comments about outputs of 'ss' and 'nstat'.
 - use "err = verify_mptcpify()" instead of using =+.

v10:
 - drop "#ifdef CONFIG_BPF_JIT".
 - include vmlinux.h and bpf_tracing_net.h to avoid defining some
macros.
 - drop unneeded checks for mptcp.

v9:
 - update comment for 'update_socket_protocol'.

v8:
 - drop the additional checks on the 'protocol' value after the
'update_socket_protocol()' call.

v7:
 - add __weak and __diag_* for update_socket_protocol.

v6:
 - add update_socket_protocol.

v5:
 - add bpf_mptcpify helper.

v4:
 - use lsm_cgroup/socket_create

v3:
 - patch 8: char cmd[128]; -> char cmd[256];

v2:
 - Fix build selftests errors reported by CI

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
10 months agoselftests/bpf: Add mptcpify test
Geliang Tang [Wed, 16 Aug 2023 01:11:59 +0000 (09:11 +0800)]
selftests/bpf: Add mptcpify test

Implement a new test program mptcpify: if the family is AF_INET or
AF_INET6, the type is SOCK_STREAM, and the protocol ID is 0 or
IPPROTO_TCP, set it to IPPROTO_MPTCP. It will be hooked in
update_socket_protocol().

Extend the MPTCP test base, add a selftest test_mptcpify() for the
mptcpify case. Open and load the mptcpify test prog to mptcpify the
TCP sockets dynamically, then use start_server() and connect_to_fd()
to create a TCP socket, but actually what's created is an MPTCP
socket, which can be verified through 'getsockopt(SOL_PROTOCOL)'
and 'getsockopt(MPTCP_INFO)'.

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/364e72f307e7bb38382ec7442c182d76298a9c41.1692147782.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
10 months agoselftests/bpf: Fix error checks of mptcp open_and_load
Geliang Tang [Wed, 16 Aug 2023 01:11:58 +0000 (09:11 +0800)]
selftests/bpf: Fix error checks of mptcp open_and_load

Return libbpf_get_error(), instead of -EIO, for the error from
mptcp_sock__open_and_load().

Load success means prog_fd and map_fd are always valid. So drop these
unneeded ASSERT_GE checks for them in mptcp run_test().

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/db5fcb93293df9ab173edcbaf8252465b80da6f2.1692147782.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
10 months agoselftests/bpf: Add two mptcp netns helpers
Geliang Tang [Wed, 16 Aug 2023 01:11:57 +0000 (09:11 +0800)]
selftests/bpf: Add two mptcp netns helpers

Add two netns helpers for mptcp tests: create_netns() and
cleanup_netns(). Use them in test_base().

These new helpers will be re-used in the following commits
introducing new tests.

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/7506371fb6c417b401cc9d7365fe455754f4ba3f.1692147782.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
10 months agobpf: Add update_socket_protocol hook
Geliang Tang [Wed, 16 Aug 2023 01:11:56 +0000 (09:11 +0800)]
bpf: Add update_socket_protocol hook

Add a hook named update_socket_protocol in __sys_socket(), for bpf
progs to attach to and update socket protocol. One user case is to
force legacy TCP apps to create and use MPTCP sockets instead of
TCP ones.

Define a fmod_ret set named bpf_mptcp_fmodret_ids, add the hook
update_socket_protocol into this set, and register it in
bpf_mptcp_kfunc_init().

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/ac84be00f97072a46f8a72b4e2be46cbb7fa5053.1692147782.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
10 months agobpftool: Implement link show support for xdp
Daniel Borkmann [Wed, 16 Aug 2023 09:56:51 +0000 (11:56 +0200)]
bpftool: Implement link show support for xdp

Add support to dump XDP link information to bpftool. This reuses the
recently added show_link_ifindex_{plain,json}(). The XDP link info only
exposes the ifindex.

Below shows an example link dump output, and a cgroup link is included
for comparison, too:

  # bpftool link
  [...]
  10: cgroup  prog 2466
        cgroup_id 1  attach_type cgroup_inet6_post_bind
  [...]
  16: xdp  prog 2477
        ifindex enp5s0(3)
  [...]

Equivalent json output:

  # bpftool link --json
  [...]
  {
    "id": 10,
    "type": "cgroup",
    "prog_id": 2466,
    "cgroup_id": 1,
    "attach_type": "cgroup_inet6_post_bind"
  },
  [...]
  {
    "id": 16,
    "type": "xdp",
    "prog_id": 2477,
    "devname": "enp5s0",
    "ifindex": 3
  }
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230816095651.10014-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
10 months agobpftool: Implement link show support for tcx
Daniel Borkmann [Wed, 16 Aug 2023 09:56:50 +0000 (11:56 +0200)]
bpftool: Implement link show support for tcx

Add support to dump tcx link information to bpftool. This adds a
common helper show_link_ifindex_{plain,json}() which can be reused
also for other link types. The plain text and json device output is
the same format as in bpftool net dump.

Below shows an example link dump output along with a cgroup link
for comparison:

  # bpftool link
  [...]
  10: cgroup  prog 1977
        cgroup_id 1  attach_type cgroup_inet6_post_bind
  [...]
  13: tcx  prog 2053
        ifindex enp5s0(3)  attach_type tcx_ingress
  14: tcx  prog 2080
        ifindex enp5s0(3)  attach_type tcx_egress
  [...]

Equivalent json output:

  # bpftool link --json
  [...]
  {
    "id": 10,
    "type": "cgroup",
    "prog_id": 1977,
    "cgroup_id": 1,
    "attach_type": "cgroup_inet6_post_bind"
  },
  [...]
  {
    "id": 13,
    "type": "tcx",
    "prog_id": 2053,
    "devname": "enp5s0",
    "ifindex": 3,
    "attach_type": "tcx_ingress"
  },
  {
    "id": 14,
    "type": "tcx",
    "prog_id": 2080,
    "devname": "enp5s0",
    "ifindex": 3,
    "attach_type": "tcx_egress"
  }
  [...]

Suggested-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230816095651.10014-1-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
10 months agodrm/amd/pm: disallow the fan setting if there is no fan on smu 13.0.0
Kenneth Feng [Wed, 9 Aug 2023 10:06:05 +0000 (18:06 +0800)]
drm/amd/pm: disallow the fan setting if there is no fan on smu 13.0.0

drm/amd/pm: disallow the fan setting if there is no fan on smu 13.0.0
V2: depend on pm.no_fan to check

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
10 months agoi40e: fix misleading debug logs
Andrii Staikov [Wed, 2 Aug 2023 07:47:32 +0000 (09:47 +0200)]
i40e: fix misleading debug logs

Change "write" into the actual "read" word.
Change parameters description.

Fixes: 7073f46e443e ("i40e: Add AQ commands for NVM Update for X722")
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Andrii Staikov <andrii.staikov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
10 months agoiavf: fix FDIR rule fields masks validation
Piotr Gardocki [Mon, 7 Aug 2023 14:46:04 +0000 (16:46 +0200)]
iavf: fix FDIR rule fields masks validation

Return an error if a field's mask is neither full nor empty. When a mask
is only partial the field is not being used for rule programming but it
gives a wrong impression it is used. Fix by returning an error on any
partial mask to make it clear they are not supported.
The ip_ver assignment is moved earlier in code to allow using it in
iavf_validate_fdir_fltr_masks.

Fixes: 527691bf0682 ("iavf: Support IPv4 Flow Director filters")
Fixes: e90cbc257a6f ("iavf: Support IPv6 Flow Director filters")
Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
10 months agoselftests/bpf: Add selftest for fill_link_info
Yafang Shao [Sun, 13 Aug 2023 14:19:00 +0000 (14:19 +0000)]
selftests/bpf: Add selftest for fill_link_info

Add selftest for the fill_link_info of uprobe, kprobe and tracepoint.
The result:

  $ tools/testing/selftests/bpf/test_progs --name=fill_link_info
  #79/1    fill_link_info/kprobe_link_info:OK
  #79/2    fill_link_info/kretprobe_link_info:OK
  #79/3    fill_link_info/kprobe_invalid_ubuff:OK
  #79/4    fill_link_info/tracepoint_link_info:OK
  #79/5    fill_link_info/uprobe_link_info:OK
  #79/6    fill_link_info/uretprobe_link_info:OK
  #79/7    fill_link_info/kprobe_multi_link_info:OK
  #79/8    fill_link_info/kretprobe_multi_link_info:OK
  #79/9    fill_link_info/kprobe_multi_invalid_ubuff:OK
  #79      fill_link_info:OK
  Summary: 1/9 PASSED, 0 SKIPPED, 0 FAILED

The test case for kprobe_multi won't be run on aarch64, as it is not
supported.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230813141900.1268-3-laoar.shao@gmail.com
10 months agobpf: Fix uninitialized symbol in bpf_perf_link_fill_kprobe()
Yafang Shao [Sun, 13 Aug 2023 14:18:59 +0000 (14:18 +0000)]
bpf: Fix uninitialized symbol in bpf_perf_link_fill_kprobe()

The commit 1b715e1b0ec5 ("bpf: Support ->fill_link_info for perf_event") leads
to the following Smatch static checker warning:

    kernel/bpf/syscall.c:3416 bpf_perf_link_fill_kprobe()
    error: uninitialized symbol 'type'.

That can happens when uname is NULL. So fix it by verifying the uname when we
really need to fill it.

Fixes: 1b715e1b0ec5 ("bpf: Support ->fill_link_info for perf_event")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Closes: https://lore.kernel.org/bpf/85697a7e-f897-4f74-8b43-82721bebc462@kili.mountain
Link: https://lore.kernel.org/bpf/20230813141900.1268-2-laoar.shao@gmail.com
10 months agoMerge branch 'ipv6-expired-routes'
David S. Miller [Wed, 16 Aug 2023 11:26:44 +0000 (12:26 +0100)]
Merge branch 'ipv6-expired-routes'

Kui-Feng Lee says:

====================
Remove expired routes with a separated list of routes.

FIB6 GC walks trees of fib6_tables to remove expired routes. Walking a tree
can be expensive if the number of routes in a table is big, even if most of
them are permanent. Checking routes in a separated list of routes having
expiration will avoid this potential issue.

Background
==========

The size of a Linux IPv6 routing table can become a big problem if not
managed appropriately.  Now, Linux has a garbage collector to remove
expired routes periodically.  However, this may lead to a situation in
which the routing path is blocked for a long period due to an
excessive number of routes.

For example, years ago, there is a commit c7bb4b89033b ("ipv6: tcp:
drop silly ICMPv6 packet too big messages").  The root cause is that
malicious ICMPv6 packets were sent back for every small packet sent to
them. These packets add routes with an expiration time that prompts
the GC to periodically check all routes in the tables, including
permanent ones.

Why Route Expires
=================

Users can add IPv6 routes with an expiration time manually. However,
the Neighbor Discovery protocol may also generate routes that can
expire.  For example, Router Advertisement (RA) messages may create a
default route with an expiration time. [RFC 4861] For IPv4, it is not
possible to set an expiration time for a route, and there is no RA, so
there is no need to worry about such issues.

Create Routes with Expires
==========================

You can create routes with expires with the  command.

For example,

    ip -6 route add 2001:b000:591::3 via fe80::5054:ff:fe12:3457 \
        dev enp0s3 expires 30

The route that has been generated will be deleted automatically in 30
seconds.

GC of FIB6
==========

The function called fib6_run_gc() is responsible for performing
garbage collection (GC) for the Linux IPv6 stack. It checks for the
expiration of every route by traversing the trees of routing
tables. The time taken to traverse a routing table increases with its
size. Holding the routing table lock during traversal is particularly
undesirable. Therefore, it is preferable to keep the lock for the
shortest possible duration.

Solution
========

The cause of the issue is keeping the routing table locked during the
traversal of large trees. To solve this problem, we can create a separate
list of routes that have expiration. This will prevent GC from checking
permanent routes.

Result
======

We conducted a test to measure the execution times of fib6_gc_timer_cb()
and observed that it enhances the GC of FIB6. During the test, we added
permanent routes with the following numbers: 1000, 3000, 6000, and
9000. Additionally, we added a route with an expiration time.

Here are the average execution times for the kernel without the patch.
 - 120020 ns with 1000 permanent routes
 - 308920 ns with 3000 ...
 - 581470 ns with 6000 ...
 - 855310 ns with 9000 ...

The kernel with the patch consistently takes around 14000 ns to execute,
regardless of the number of permanent routes that are installed.

Major changes from v7:

 - Fix warings raised by the patchwork.

Major changes from v6:

 - Remove unnecessary check of tb6 in fib6_clean_expires_locked().

 - Use ib6_clean_expires_locked() instead in fib6_purge_rt().

Major changes from v5:

 - Change the order of adding new routes to the GC list and starting
   GC timer.

 - Remove time measurements from the test case.

 - Stop forcing GC flush.

Major changes from v4:

 - Detect existence of 'strace' in the test case.

Major changes from v3:

 - Fix the type of arg according to feedback.

 - Add 1k temporary routes and 5K permanent routes in the test case.
   Measure time spending on GC with strace.

Major changes from v2:

 - Remove unnecessary and incorrect sysctl restoring in the test case.

Major changes from v1:

 - Moved gc_link to avoid creating a hole in fib6_info.

 - Moved fib6_set_expires*() and fib6_clean_expires*() to the header
   file and inlined. And removed duplicated lines.

 - Added a test case.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoselftests: fib_tests: Add a test case for IPv6 garbage collection
Kui-Feng Lee [Tue, 15 Aug 2023 18:07:06 +0000 (11:07 -0700)]
selftests: fib_tests: Add a test case for IPv6 garbage collection

Add 1000 IPv6 routes with expiration time (w/ and w/o additional 5000
permanet routes in the background.)  Wait for a few seconds to make sure
they are removed correctly.

The expected output of the test looks like the following example.

> Fib6 garbage collection test
>     TEST: ipv6 route garbage collection [ OK ]

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet/ipv6: Remove expired routes with a separated list of routes.
Kui-Feng Lee [Tue, 15 Aug 2023 18:07:05 +0000 (11:07 -0700)]
net/ipv6: Remove expired routes with a separated list of routes.

FIB6 GC walks trees of fib6_tables to remove expired routes. Walking a tree
can be expensive if the number of routes in a table is big, even if most of
them are permanent. Checking routes in a separated list of routes having
expiration will avoid this potential issue.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoe1000e: Use PME poll to circumvent unreliable ACPI wake
Kai-Heng Feng [Tue, 15 Aug 2023 17:01:11 +0000 (10:01 -0700)]
e1000e: Use PME poll to circumvent unreliable ACPI wake

On some I219 devices, ethernet cable plugging detection only works once
from PCI D3 state. Subsequent cable plugging does set PME bit correctly,
but device still doesn't get woken up.

Since I219 connects to the root complex directly, it relies on platform
firmware (ACPI) to wake it up. In this case, the GPE from _PRW only
works for first cable plugging but fails to notify the driver for
subsequent plugging events.

The issue was originally found on CNP, but the same issue can be found
on ADL too. So workaround the issue by continuing use PME poll after
first ACPI wake. As PME poll is always used, the runtime suspend
restriction for CNP can also be removed.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet-memcg: Fix scope of sockmem pressure indicators
Abel Wu [Mon, 14 Aug 2023 07:09:11 +0000 (15:09 +0800)]
net-memcg: Fix scope of sockmem pressure indicators

Now there are two indicators of socket memory pressure sit inside
struct mem_cgroup, socket_pressure and tcpmem_pressure, indicating
memory reclaim pressure in memcg->memory and ->tcpmem respectively.

When in legacy mode (cgroupv1), the socket memory is charged into
->tcpmem which is independent of ->memory, so socket_pressure has
nothing to do with socket's pressure at all. Things could be worse
by taking socket_pressure into consideration in legacy mode, as a
pressure in ->memory can lead to premature reclamation/throttling
in socket.

While for the default mode (cgroupv2), the socket memory is charged
into ->memory, and ->tcpmem/->tcpmem_pressure are simply not used.

So {socket,tcpmem}_pressure are only used in default/legacy mode
respectively for indicating socket memory pressure. This patch fixes
the pieces of code that make mixed use of both.

Fixes: 8e8ae645249b ("mm: memcontrol: hook up vmpressure to socket pressure")
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonfp: update maintainer
Louis Peens [Tue, 15 Aug 2023 12:43:25 +0000 (14:43 +0200)]
nfp: update maintainer

Take over maintainership of the nfp driver from Simon as he
is moving away from Corigine.

Signed-off-by: Louis Peens <louis.peens@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode
Grygorii Strashko [Tue, 15 Aug 2023 08:21:05 +0000 (11:21 +0300)]
net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode

This patch adds MQPRIO Qdisc offload in full 'channel' mode which allows
not only setting up pri:tc mapping, but also configuring TX shapers on
external port FIFOs. The K3 CPSW MQPRIO Qdisc offload is expected to work
with VLAN/priority tagged packets. Non-tagged packets have to be mapped
only to TC0.

- TX traffic classes must be rated starting from TC that has highest
priority and with no gaps
- Traffic classes are used starting from 0, that has highest priority
- min_rate defines Committed Information Rate (guaranteed)
- max_rate defines Excess Information Rate (non guaranteed) and offloaded
as (max_rate[i] - tcX_min_rate[i])
- VLAN/priority tagged packets mapped to TC0 will exit switch with VLAN tag
priority 0

The configuration example:
 ethtool -L eth1 tx 5
 ethtool --set-priv-flags eth1 p0-rx-ptype-rrobin off

 tc qdisc add dev eth1 parent root handle 100: mqprio num_tc 3 \
 map 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 \
 queues 1@0 1@1 1@2 hw 1 mode channel \
 shaper bw_rlimit min_rate 0 100mbit 200mbit max_rate 0 101mbit 202mbit

 tc qdisc replace dev eth2 handle 100: parent root mqprio num_tc 1 \
 map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 hw 1

 ip link add link eth1 name eth1.100 type vlan id 100
 ip link set eth1.100 type vlan egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7

In the above example two ports share the same TX CPPI queue 0 for low
priority traffic. 3 traffic classes are defined for eth1 and mapped to:
TC0 - low priority, TX CPPI queue 0 -> ext Port 1 fifo0, no rate limit
TC1 - prio 2, TX CPPI queue 1 -> ext Port 1 fifo1, CIR=100Mbit/s, EIR=1Mbit/s
TC2 - prio 3, TX CPPI queue 2 -> ext Port 1 fifo2, CIR=200Mbit/s, EIR=2Mbit/s

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge tag 'nf-23-08-16' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
David S. Miller [Wed, 16 Aug 2023 10:11:24 +0000 (11:11 +0100)]
Merge tag 'nf-23-08-16' of https://git./linux/kernel/git/netfilter/nf

Florisn Westphal says:

====================
These are netfilter fixes for the *net* tree.

First patch resolves a false-positive lockdep splat:
rcu_dereference is used outside of rcu read lock.  Let lockdep
validate that the transaction mutex is locked.

Second patch fixes a kdoc warning added in previous PR.

Third patch fixes a memory leak:
The catchall element isn't disabled correctly, this allows
userspace to deactivate the element again. This results in refcount
underflow which in turn prevents memory release. This was always
broken since the feature was added in 5.13.

Patch 4 fixes an incorrect change in the previous pull request:
Adding a duplicate key to a set should work if the duplicate key
has expired, restore this behaviour. All from myself.

Patch #5 resolves an old historic artifact in sctp conntrack:
a 300ms timeout for shutdown_ack. Increase this to 3s.  From Xin Long.

Patch #6 fixes a sysctl data race in ipvs, two threads can clobber the
sysctl value, from Sishuai Gong. This is a day-0 bug that predates git
history.

Patches 7, 8 and 9, from Pablo Neira Ayuso, are also followups
for the previous GC rework in nf_tables: The netlink notifier and the
netns exit path must both increment the gc worker seqcount, else worker
may encounter stale (free'd) pointers.
================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge branch 'inet-data-races'
David S. Miller [Wed, 16 Aug 2023 10:09:18 +0000 (11:09 +0100)]
Merge branch 'inet-data-races'

Eric Dumazet says:

====================
inet: socket lock and data-races avoidance

In this series, I converted 20 bits in "struct inet_sock" and made
them truly atomic.

This allows to implement many IP_ socket options in a lockless
fashion (no need to acquire socket lock), and fixes data-races
that were showing up in various KCSAN reports.

I also took care of IP_TTL/IP_MINTTL, but left few other options
for another series.

v4: Rebased after recent mptcp changes.
  Added Reviewed-by: tags from Simon (thanks !)

v3: fixed patch 7, feedback from build bot about ipvs set_mcast_loop()

v2: addressed a feedback from a build bot in patch 9 by removing
 unused issk variable in mptcp_setsockopt_sol_ip_set_transparent()
 Added Acked-by: tags from Soheil (thanks !)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: implement lockless IP_MINTTL
Eric Dumazet [Wed, 16 Aug 2023 08:15:47 +0000 (08:15 +0000)]
inet: implement lockless IP_MINTTL

inet->min_ttl is already read with READ_ONCE().

Implementing IP_MINTTL socket option set/read
without holding the socket lock is easy.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: implement lockless IP_TTL
Eric Dumazet [Wed, 16 Aug 2023 08:15:46 +0000 (08:15 +0000)]
inet: implement lockless IP_TTL

ip_select_ttl() is racy, because it reads inet->uc_ttl
without proper locking.

Add READ_ONCE()/WRITE_ONCE() annotations while
allowing IP_TTL socket option to be set/read without
holding the socket lock.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->defer_connect to inet->inet_flags
Eric Dumazet [Wed, 16 Aug 2023 08:15:45 +0000 (08:15 +0000)]
inet: move inet->defer_connect to inet->inet_flags

Make room in struct inet_sock by removing this bit field,
using one available bit in inet_flags instead.

Also move local_port_range to fill the resulting hole,
saving 8 bytes on 64bit arches.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->bind_address_no_port to inet->inet_flags
Eric Dumazet [Wed, 16 Aug 2023 08:15:44 +0000 (08:15 +0000)]
inet: move inet->bind_address_no_port to inet->inet_flags

IP_BIND_ADDRESS_NO_PORT socket option can now be set/read
without locking the socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->nodefrag to inet->inet_flags
Eric Dumazet [Wed, 16 Aug 2023 08:15:43 +0000 (08:15 +0000)]
inet: move inet->nodefrag to inet->inet_flags

IP_NODEFRAG socket option can now be set/read
without locking the socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->is_icsk to inet->inet_flags
Eric Dumazet [Wed, 16 Aug 2023 08:15:42 +0000 (08:15 +0000)]
inet: move inet->is_icsk to inet->inet_flags

We move single bit fields to inet->inet_flags to avoid races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->transparent to inet->inet_flags
Eric Dumazet [Wed, 16 Aug 2023 08:15:41 +0000 (08:15 +0000)]
inet: move inet->transparent to inet->inet_flags

IP_TRANSPARENT socket option can now be set/read
without locking the socket.

v2: removed unused issk variable in mptcp_setsockopt_sol_ip_set_transparent()
v4: rebased after commit 3f326a821b99 ("mptcp: change the mpc check helper to return a sk")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->mc_all to inet->inet_frags
Eric Dumazet [Wed, 16 Aug 2023 08:15:40 +0000 (08:15 +0000)]
inet: move inet->mc_all to inet->inet_frags

IP_MULTICAST_ALL socket option can now be set/read
without locking the socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->mc_loop to inet->inet_frags
Eric Dumazet [Wed, 16 Aug 2023 08:15:39 +0000 (08:15 +0000)]
inet: move inet->mc_loop to inet->inet_frags

IP_MULTICAST_LOOP socket option can now be set/read
without locking the socket.

v3: fix build bot error reported in ipvs set_mcast_loop()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->hdrincl to inet->inet_flags
Eric Dumazet [Wed, 16 Aug 2023 08:15:38 +0000 (08:15 +0000)]
inet: move inet->hdrincl to inet->inet_flags

IP_HDRINCL socket option can now be set/read
without locking the socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->freebind to inet->inet_flags
Eric Dumazet [Wed, 16 Aug 2023 08:15:37 +0000 (08:15 +0000)]
inet: move inet->freebind to inet->inet_flags

IP_FREEBIND socket option can now be set/read
without locking the socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->recverr_rfc4884 to inet->inet_flags
Eric Dumazet [Wed, 16 Aug 2023 08:15:36 +0000 (08:15 +0000)]
inet: move inet->recverr_rfc4884 to inet->inet_flags

IP_RECVERR_RFC4884 socket option can now be set/read
without locking the socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: move inet->recverr to inet->inet_flags
Eric Dumazet [Wed, 16 Aug 2023 08:15:35 +0000 (08:15 +0000)]
inet: move inet->recverr to inet->inet_flags

IP_RECVERR socket option can now be set/get without locking the socket.

This patch potentially avoid data-races around inet->recverr.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: set/get simple options locklessly
Eric Dumazet [Wed, 16 Aug 2023 08:15:34 +0000 (08:15 +0000)]
inet: set/get simple options locklessly

Now we have inet->inet_flags, we can set following options
without having to hold the socket lock:

IP_PKTINFO, IP_RECVTTL, IP_RECVTOS, IP_RECVOPTS, IP_RETOPTS,
IP_PASSSEC, IP_RECVORIGDSTADDR, IP_RECVFRAGSIZE.

ip_sock_set_pktinfo() no longer hold the socket lock.

Similarly we can get the following options whithout holding
the socket lock:

IP_PKTINFO, IP_RECVTTL, IP_RECVTOS, IP_RECVOPTS, IP_RETOPTS,
IP_PASSSEC, IP_RECVORIGDSTADDR, IP_CHECKSUM, IP_RECVFRAGSIZE.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoinet: introduce inet->inet_flags
Eric Dumazet [Wed, 16 Aug 2023 08:15:33 +0000 (08:15 +0000)]
inet: introduce inet->inet_flags

Various inet fields are currently racy.

do_ip_setsockopt() and do_ip_getsockopt() are mostly holding
the socket lock, but some (fast) paths do not.

Use a new inet->inet_flags to hold atomic bits in the series.

Remove inet->cmsg_flags, and use instead 9 bits from inet_flags.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoipv6: fix indentation of a config attribute
Prasad Pandit [Wed, 16 Aug 2023 07:56:06 +0000 (13:26 +0530)]
ipv6: fix indentation of a config attribute

Fix indentation of a type attribute of IPV6_VTI config entry.

Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge branch 'redundant-of_match_ptr'
David S. Miller [Wed, 16 Aug 2023 08:59:40 +0000 (09:59 +0100)]
Merge branch 'redundant-of_match_ptr'

Ruan Jinjie says:

====================
net: Remove redundant of_match_ptr() macro

Since these net drivers depend on CONFIG_OF, there is
no need to wrap the macro of_match_ptr() here.

Changes in v3:
- Collect responses from v1 and v2.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agowlcore: spi: Remove redundant of_match_ptr()
Ruan Jinjie [Mon, 14 Aug 2023 02:55:19 +0000 (10:55 +0800)]
wlcore: spi: Remove redundant of_match_ptr()

The driver depends on CONFIG_OF, it is not necessary to use
of_match_ptr() here.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: qualcomm: Remove redundant of_match_ptr()
Ruan Jinjie [Mon, 14 Aug 2023 02:55:18 +0000 (10:55 +0800)]
net: qualcomm: Remove redundant of_match_ptr()

The driver depends on CONFIG_OF, it is not necessary to use
of_match_ptr() here.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: gemini: Remove redundant of_match_ptr()
Ruan Jinjie [Mon, 14 Aug 2023 02:55:17 +0000 (10:55 +0800)]
net: gemini: Remove redundant of_match_ptr()

The driver depends on CONFIG_OF, it is not necessary to use
of_match_ptr() here.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: dsa: rzn1-a5psw: Remove redundant of_match_ptr()
Ruan Jinjie [Mon, 14 Aug 2023 02:55:16 +0000 (10:55 +0800)]
net: dsa: rzn1-a5psw: Remove redundant of_match_ptr()

The driver depends on CONFIG_OF, it is not necessary to use
of_match_ptr() here.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: dsa: realtek: Remove redundant of_match_ptr()
Ruan Jinjie [Mon, 14 Aug 2023 02:55:15 +0000 (10:55 +0800)]
net: dsa: realtek: Remove redundant of_match_ptr()

The driver depends on CONFIG_OF, it is not necessary to use
of_match_ptr() here.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonfc: virtual_ncidev: Use module_misc_device macro to simplify the code
Li Zetao [Tue, 15 Aug 2023 07:49:27 +0000 (15:49 +0800)]
nfc: virtual_ncidev: Use module_misc_device macro to simplify the code

Use the module_misc_device macro to simplify the code, which is the
same as declaring with module_init() and module_exit().

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agomailmap: add entries for Simon Horman
Simon Horman [Tue, 15 Aug 2023 15:27:49 +0000 (17:27 +0200)]
mailmap: add entries for Simon Horman

Retire some of my email addresses from Kernel activities.

Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge tag 'ipsec-2023-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/klasser...
David S. Miller [Wed, 16 Aug 2023 07:57:41 +0000 (08:57 +0100)]
Merge tag 'ipsec-2023-08-15' of git://git./linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
1) Fix a slab-out-of-bounds read in xfrm_address_filter.
   From Lin Ma.

2) Fix the pfkey sadb_x_filter validation.
   From Lin Ma.

3) Use the correct nla_policy structure for XFRMA_SEC_CTX.
   From Lin Ma.

4) Fix warnings triggerable by bad packets in the encap functions.
   From Herbert Xu.

5) Fix some slab-use-after-free in decode_session6.
   From Zhengchao Shao.

6) Fix a possible NULL piointer dereference in xfrm_update_ae_params.
   Lin Ma.

7) Add a forgotten nla_policy for XFRMA_MTIMER_THRESH.
   From Lin Ma.

8) Don't leak offloaded policies.
   From Leon Romanovsky.

9) Delete also the offloading part of an acquire state.
   From Leon Romanovsky.

Please pull or let me know if there are problems.

10 months agoMerge branch 'hns3-ethtool'
David S. Miller [Wed, 16 Aug 2023 07:56:38 +0000 (08:56 +0100)]
Merge branch 'hns3-ethtool'

Jijie Shao says:

====================
hns3: refactor registers information for ethtool -d

refactor registers information for ethtool -d
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: hns3: fix wrong rpu tln reg issue
Jijie Shao [Tue, 15 Aug 2023 06:06:41 +0000 (14:06 +0800)]
net: hns3: fix wrong rpu tln reg issue

In the original RPU query command, the status register values of
multiple RPU tunnels are accumulated by default, which is unreasonable.
This patch Fix it by querying the specified tunnel ID.
The tunnel number of the device can be obtained from firmware
during initialization.

Fixes: ddb54554fa51 ("net: hns3: add DFX registers information for ethtool -d")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: hns3: Support tlv in regs data for HNS3 VF driver
Jijie Shao [Tue, 15 Aug 2023 06:06:40 +0000 (14:06 +0800)]
net: hns3: Support tlv in regs data for HNS3 VF driver

The dump register function is being refactored.
The third step in refactoring is to support tlv info in regs data for
HNS3 PF driver.

Currently, if we use "ethtool -d" to dump regs value,
the output is as follows:
  offset1: 00 01 02 03 04 05 ...
  offset2:10 11 12 13 14 15 ...
  ......

We can't get the value of a register directly.

This patch deletes the original separator information and
add tag_len_value information in regs data.
ethtool can parse register data in key-value format by -d command.

a patch will be added to the ethtool to parse regs data
in the following format:
  reg1 : value2
  reg2 : value2
  ......

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: hns3: Support tlv in regs data for HNS3 PF driver
Jijie Shao [Tue, 15 Aug 2023 06:06:39 +0000 (14:06 +0800)]
net: hns3: Support tlv in regs data for HNS3 PF driver

The dump register function is being refactored.
The second step in refactoring is to support tlv info in regs data for
HNS3 PF driver.

Currently, if we use "ethtool -d" to dump regs value,
the output is as follows:
  offset1: 00 01 02 03 04 05 ...
  offset2:10 11 12 13 14 15 ...
  ......

We can't get the value of a register directly.

This patch deletes the original separator information and
add tag_len_value information in regs data.
ethtool can parse register data in key-value format by -d command.

a patch will be added to the ethtool to parse regs data
in the following format:
  reg1 : value2
  reg2 : value2
  ......

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: hns3: move dump regs function to a separate file
Jijie Shao [Tue, 15 Aug 2023 06:06:38 +0000 (14:06 +0800)]
net: hns3: move dump regs function to a separate file

The dump register function is being refactored.
The first step in refactoring is put the dump regs function
into a separate file.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge branch 'fec-XDP_TX'
David S. Miller [Wed, 16 Aug 2023 06:12:40 +0000 (07:12 +0100)]
Merge branch 'fec-XDP_TX'

Wei Fang says:

====================
net: fec: add XDP_TX feature support

This patch set is to support the XDP_TX feature of FEC driver, the first
patch is add initial XDP_TX support, and the second patch improves the
performance of XDP_TX by not using xdp_convert_buff_to_frame(). Please
refer to the commit message of each patch for more details.
====================

Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: fec: improve XDP_TX performance
Wei Fang [Tue, 15 Aug 2023 05:19:55 +0000 (13:19 +0800)]
net: fec: improve XDP_TX performance

As suggested by Jesper and Alexander, we can avoid converting xdp_buff
to xdp_frame in case of XDP_TX to save a bunch of CPU cycles, so that
we can further improve the XDP_TX performance.

Before this patch on i.MX8MP-EVK board, the performance shows as follows.
root@imx8mpevk:~# ./xdp2 eth0
proto 17:     353918 pkt/s
proto 17:     352923 pkt/s
proto 17:     353900 pkt/s
proto 17:     352672 pkt/s
proto 17:     353912 pkt/s
proto 17:     354219 pkt/s

After applying this patch, the performance is improved.
root@imx8mpevk:~# ./xdp2 eth0
proto 17:     369261 pkt/s
proto 17:     369267 pkt/s
proto 17:     369206 pkt/s
proto 17:     369214 pkt/s
proto 17:     369126 pkt/s
proto 17:     369272 pkt/s

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Suggested-by: Jesper Dangaard Brouer <hawk@kernel.org>
Reviewed-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: fec: add XDP_TX feature support
Wei Fang [Tue, 15 Aug 2023 05:19:54 +0000 (13:19 +0800)]
net: fec: add XDP_TX feature support

The XDP_TX feature is not supported before, and all the frames
which are deemed to do XDP_TX action actually do the XDP_DROP
action. So this patch adds the XDP_TX support to FEC driver.

I tested the performance of XDP_TX in XDP_DRV mode and XDP_SKB
mode respectively on i.MX8MP-EVK platform, and as suggested by
Jesper, I also tested the performance of XDP_REDIRECT on the
same platform. And the test steps and results are as follows.

XDP_TX test:
Step 1: One board is used as generator and connects to switch,and
the FEC port of DUT also connects to the switch. Both boards with
flow control off. Then the generator runs the
pktgen_sample03_burst_single_flow.sh script to generate and send
burst traffic to DUT. Note that the size of packet was set to 64
bytes and the procotol of packet was UDP in my test scenario. In
addition, the SMAC of the packet need to be different from the MAC
of the generator, because the xdp2 program will swap the DMAC and
SMAC of the packet and send it back to the generator. If the SMAC
of the generated packet is the MAC of the generator, the generator
will receive the returned traffic which increase the CPU loading
and significantly degrade the transmit speed of the generator, and
finally it affects the test of XDP_TX performance.

Step 2: The DUT runs the xdp2 program to transmit received UDP
packets back out on the same port where they were received.

root@imx8mpevk:~# ./xdp2 eth0
proto 17:     353918 pkt/s
proto 17:     352923 pkt/s
proto 17:     353900 pkt/s
proto 17:     352672 pkt/s
proto 17:     353912 pkt/s
proto 17:     354219 pkt/s

root@imx8mpevk:~# ./xdp2 -S eth0
proto 17:     160604 pkt/s
proto 17:     160708 pkt/s
proto 17:     160564 pkt/s
proto 17:     160684 pkt/s
proto 17:     160640 pkt/s
proto 17:     160720 pkt/s

The above results show that the XDP_TX performance of XDP_DRV mode
is much better than XDP_SKB mode, more than twice that of XDP_SKB
mode, which is in line with our expectation.

XDP_REDIRECT test:
Step1: Both the generator and the FEC port of the DUT connet to the
switch port. All the ports with flow control off, then the generator
runs the pktgen script to generate and send burst traffic to DUT.
Note that the size of packet was set to 64 bytes and the procotol of
packet was UDP in my test scenario.

Step2: The DUT runs the xdp_redirect program to redirect the traffic
from the FEC port to the FEC port itself.

root@imx8mpevk:~# ./xdp_redirect eth0 eth0
Redirecting from eth0 (ifindex 2; driver fec) to eth0
(ifindex 2; driver fec)
Summary        232,302 rx/s        0 err,drop/s      232,344 xmit/s
Summary        234,579 rx/s        0 err,drop/s      234,577 xmit/s
Summary        235,548 rx/s        0 err,drop/s      235,549 xmit/s
Summary        234,704 rx/s        0 err,drop/s      234,703 xmit/s
Summary        235,504 rx/s        0 err,drop/s      235,504 xmit/s
Summary        235,223 rx/s        0 err,drop/s      235,224 xmit/s
Summary        234,509 rx/s        0 err,drop/s      234,507 xmit/s
Summary        235,481 rx/s        0 err,drop/s      235,482 xmit/s
Summary        234,684 rx/s        0 err,drop/s      234,683 xmit/s
Summary        235,520 rx/s        0 err,drop/s      235,520 xmit/s
Summary        235,461 rx/s        0 err,drop/s      235,461 xmit/s
Summary        234,627 rx/s        0 err,drop/s      234,627 xmit/s
Summary        235,611 rx/s        0 err,drop/s      235,611 xmit/s
  Packets received    : 3,053,753
  Average packets/s   : 234,904
  Packets transmitted : 3,053,792
  Average transmit/s  : 234,907

Compared the performance of XDP_TX with XDP_REDIRECT, XDP_TX is also
much better than XDP_REDIRECT. It's also in line with our expectation.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Suggested-by: Jesper Dangaard Brouer <hawk@kernel.org>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agobroadcom: b44: Use b44_writephy() return value
Artem Chernyshev [Mon, 14 Aug 2023 21:00:30 +0000 (00:00 +0300)]
broadcom: b44: Use b44_writephy() return value

Return result of b44_writephy() instead of zero to
deal with possible error.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoselftests: bonding: remove redundant delete action of device link1_1
Zhengchao Shao [Sat, 12 Aug 2023 08:40:36 +0000 (16:40 +0800)]
selftests: bonding: remove redundant delete action of device link1_1

When run command "ip netns delete client", device link1_1 has been
deleted. So, it is no need to delete link1_1 again. Remove it.

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge tag 'mlx5-updates-2023-08-14' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Wed, 16 Aug 2023 02:21:49 +0000 (19:21 -0700)]
Merge tag 'mlx5-updates-2023-08-14' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-08-14

1) Handle PTP out of order CQEs issue
2) Check FW status before determining reset successful
3) Expose maximum supported SFs via devlink resource
4) MISC cleanups

* tag 'mlx5-updates-2023-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Don't query MAX caps twice
  net/mlx5: Remove unused MAX HCA capabilities
  net/mlx5: Remove unused CAPs
  net/mlx5: Fix error message in mlx5_sf_dev_state_change_handler()
  net/mlx5: Remove redundant check of mlx5_vhca_event_supported()
  net/mlx5: Use mlx5_sf_start_function_id() helper instead of directly calling MLX5_CAP_GEN()
  net/mlx5: Remove redundant SF supported check from mlx5_sf_hw_table_init()
  net/mlx5: Use auxiliary_device_uninit() instead of device_put()
  net/mlx5: E-switch, Add checking for flow rule destinations
  net/mlx5: Check with FW that sync reset completed successfully
  net/mlx5: Expose max possible SFs via devlink resource
  net/mlx5e: Add recovery flow for tx devlink health reporter for unhealthy PTP SQ
  net/mlx5e: Make tx_port_ts logic resilient to out-of-order CQEs
  net/mlx5: Consolidate devlink documentation in devlink/mlx5.rst
====================

Link: https://lore.kernel.org/r/20230814214144.159464-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge branch 'net-warn-about-attempts-to-register-negative-ifindex'
Jakub Kicinski [Wed, 16 Aug 2023 02:18:36 +0000 (19:18 -0700)]
Merge branch 'net-warn-about-attempts-to-register-negative-ifindex'

Jakub Kicinski says:

====================
net: warn about attempts to register negative ifindex

Follow up to the recently posted fix for OvS lacking input
validation:
https://lore.kernel.org/all/20230814203840.2908710-1-kuba@kernel.org/

Warn about negative ifindex more explicitly and misc YNL updates.
====================

Link: https://lore.kernel.org/r/20230814205627.2914583-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agotools: ynl: add more info to KeyErrors on missing attrs
Jakub Kicinski [Mon, 14 Aug 2023 20:56:27 +0000 (13:56 -0700)]
tools: ynl: add more info to KeyErrors on missing attrs

When developing specs its useful to know which attr space
YNL was trying to find an attribute in on key error.

Instead of printing:
 KeyError: 0
add info about the space:
 Exception: Space 'vport' has no attribute with value '0'

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230814205627.2914583-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>