platform/kernel/linux-starfive.git
16 months agoscsi: target: core: Fix error path in target_setup_session()
Bob Pearson [Tue, 13 Jun 2023 14:43:00 +0000 (09:43 -0500)]
scsi: target: core: Fix error path in target_setup_session()

In the error exits in target_setup_session(), if a branch is taken to
free_sess: transport_free_session() may call to target_free_cmd_counter()
and then fall through to call target_free_cmd_counter() a second time.
This can, and does, sometimes cause seg faults since the data field in
cmd_cnt->refcnt has been freed in the first call.

Fix this problem by simply returning after the call to
transport_free_session(). The second call is redundant for those cases.

Fixes: 4edba7e4a8f3 ("scsi: target: Move cmd counter allocation")
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Link: https://lore.kernel.org/r/20230613144259.12890-1-rpearsonhpe@gmail.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
16 months agoscsi: storvsc: Always set no_report_opcodes
Michael Kelley [Fri, 9 Jun 2023 20:38:21 +0000 (13:38 -0700)]
scsi: storvsc: Always set no_report_opcodes

Hyper-V synthetic SCSI devices do not support the MAINTENANCE_IN SCSI
command, so scsi_report_opcode() always fails, resulting in messages like
this:

hv_storvsc <guid>: tag#205 cmd 0xa3 status: scsi 0x2 srb 0x86 hv 0xc0000001

The recently added support for command duration limits calls
scsi_report_opcode() four times as each device comes online, which
significantly increases the number of messages logged in a system with many
disks.

Fix the problem by always marking Hyper-V synthetic SCSI devices as not
supporting scsi_report_opcode(). With this setting, the MAINTENANCE_IN SCSI
command is not issued and no messages are logged.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1686343101-18930-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
16 months agoscsi: aacraid: Reply queue mapping to CPUs based on IRQ affinity
Sagar Biradar [Fri, 19 May 2023 23:08:34 +0000 (16:08 -0700)]
scsi: aacraid: Reply queue mapping to CPUs based on IRQ affinity

Fix the I/O hang that arises because of the MSIx vector not having a mapped
online CPU upon receiving completion.

SCSI cmds take the blk_mq route, which is setup during init. Reserved cmds
fetch the vector_no from mq_map after init is complete. Before init, they
have to use 0 - as per the norm.

Reviewed-by: Gilbert Wu <gilbert.wu@microchip.com>
Signed-off-by: Sagar Biradar <Sagar.Biradar@microchip.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230519230834.27436-1-sagar.biradar@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
16 months agoclk: pxa: fix NULL pointer dereference in pxa3xx_clk_update_accr
Arnd Bergmann [Thu, 11 May 2023 10:58:33 +0000 (12:58 +0200)]
clk: pxa: fix NULL pointer dereference in pxa3xx_clk_update_accr

sparse points out an embarrasing bug in an older patch of mine,
which uses the register offset instead of an __iomem pointer:

drivers/clk/pxa/clk-pxa3xx.c:167:9: sparse: sparse: Using plain integer as NULL pointer

Unlike sparse, gcc and clang ignore this bug and fail to warn
because a literal '0' is considered a valid representation of
a NULL pointer.

Fixes: 3c816d950a49 ("ARM: pxa: move clk register definitions to driver")
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202305111301.RAHohdob-lkp@intel.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230511105845.299859-1-arnd@kernel.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
16 months agoRevert "media: dvb-core: Fix use-after-free on race condition at dvb_frontend"
Mauro Carvalho Chehab [Fri, 9 Jun 2023 08:16:21 +0000 (09:16 +0100)]
Revert "media: dvb-core: Fix use-after-free on race condition at dvb_frontend"

As reported by Thomas Voegtle <tv@lio96.de>, sometimes a DVB card does
not initialize properly booting Linux 6.4-rc4. This is not always, maybe
in 3 out of 4 attempts.

After double-checking, the root cause seems to be related to the
UAF fix, which is causing a race issue:

[   26.332149] tda10071 7-0005: found a 'NXP TDA10071' in cold state, will try to load a firmware
[   26.340779] tda10071 7-0005: downloading firmware from file 'dvb-fe-tda10071.fw'
[  989.277402] INFO: task vdr:743 blocked for more than 491 seconds.
[  989.283504]       Not tainted 6.4.0-rc5-i5 #249
[  989.288036] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  989.295860] task:vdr             state:D stack:0     pid:743   ppid:711    flags:0x00004002
[  989.295865] Call Trace:
[  989.295867]  <TASK>
[  989.295869]  __schedule+0x2ea/0x12d0
[  989.295877]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
[  989.295881]  schedule+0x57/0xc0
[  989.295884]  schedule_preempt_disabled+0xc/0x20
[  989.295887]  __mutex_lock.isra.16+0x237/0x480
[  989.295891]  ? dvb_get_property.isra.10+0x1bc/0xa50
[  989.295898]  ? dvb_frontend_stop+0x36/0x180
[  989.338777]  dvb_frontend_stop+0x36/0x180
[  989.338781]  dvb_frontend_open+0x2f1/0x470
[  989.338784]  dvb_device_open+0x81/0xf0
[  989.338804]  ? exact_lock+0x20/0x20
[  989.338808]  chrdev_open+0x7f/0x1c0
[  989.338811]  ? generic_permission+0x1a2/0x230
[  989.338813]  ? link_path_walk.part.63+0x340/0x380
[  989.338815]  ? exact_lock+0x20/0x20
[  989.338817]  do_dentry_open+0x18e/0x450
[  989.374030]  path_openat+0xca5/0xe00
[  989.374031]  ? terminate_walk+0xec/0x100
[  989.374034]  ? path_lookupat+0x93/0x140
[  989.374036]  do_filp_open+0xc0/0x140
[  989.374038]  ? __call_rcu_common.constprop.91+0x92/0x240
[  989.374041]  ? __check_object_size+0x147/0x260
[  989.374043]  ? __check_object_size+0x147/0x260
[  989.374045]  ? alloc_fd+0xbb/0x180
[  989.374048]  ? do_sys_openat2+0x243/0x310
[  989.374050]  do_sys_openat2+0x243/0x310
[  989.374052]  do_sys_open+0x52/0x80
[  989.374055]  do_syscall_64+0x5b/0x80
[  989.421335]  ? __task_pid_nr_ns+0x92/0xa0
[  989.421337]  ? syscall_exit_to_user_mode+0x20/0x40
[  989.421339]  ? do_syscall_64+0x67/0x80
[  989.421341]  ? syscall_exit_to_user_mode+0x20/0x40
[  989.421343]  ? do_syscall_64+0x67/0x80
[  989.421345]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  989.421348] RIP: 0033:0x7fe895d067e3
[  989.421349] RSP: 002b:00007fff933c2ba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[  989.421351] RAX: ffffffffffffffda RBX: 00007fff933c2c10 RCX: 00007fe895d067e3
[  989.421352] RDX: 0000000000000802 RSI: 00005594acdce160 RDI: 00000000ffffff9c
[  989.421353] RBP: 0000000000000802 R08: 0000000000000000 R09: 0000000000000000
[  989.421353] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000001
[  989.421354] R13: 00007fff933c2ca0 R14: 00000000ffffffff R15: 00007fff933c2c90
[  989.421355]  </TASK>

This reverts commit 6769a0b7ee0c3b31e1b22c3fadff2bfb642de23f.

Fixes: 6769a0b7ee0c ("media: dvb-core: Fix use-after-free on race condition at dvb_frontend")
Link: https://lore.kernel.org/all/da5382ad-09d6-20ac-0d53-611594b30861@lio96.de/
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
16 months agoio_uring/io-wq: clear current->worker_private on exit
Jens Axboe [Wed, 14 Jun 2023 01:26:55 +0000 (19:26 -0600)]
io_uring/io-wq: clear current->worker_private on exit

A recent fix stopped clearing PF_IO_WORKER from current->flags on exit,
which meant that we can now call inc/dec running on the worker after it
has been removed if it ends up scheduling in/out as part of exit.

If this happens after an RCU grace period has passed, then the struct
pointed to by current->worker_private may have been freed, and we can
now be accessing memory that is freed.

Ensure this doesn't happen by clearing the task worker_private field.
Both io_wq_worker_running() and io_wq_worker_sleeping() check this
field before going any further, and we don't need any accounting etc
done after this worker has exited.

Fixes: fd37b884003c ("io_uring/io-wq: don't clear PF_IO_WORKER on exit")
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
16 months agotracing/user_events: Document auto-cleanup and remove dyn_event refs
Beau Belgrave [Wed, 14 Jun 2023 16:33:36 +0000 (09:33 -0700)]
tracing/user_events: Document auto-cleanup and remove dyn_event refs

Now user_events auto-cleanup upon the last reference by default. This
makes it not possible to use the dynamics event file via tracefs.

Document that auto-cleanup is enabled by default and remove the refernce
to /sys/kernel/tracing/dynamic_events file to make this clear.

Link: https://lkml.kernel.org/r/20230614163336.5797-7-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agoselftests/user_events: Adapt dyn_test to non-persist events
Beau Belgrave [Wed, 14 Jun 2023 16:33:35 +0000 (09:33 -0700)]
selftests/user_events: Adapt dyn_test to non-persist events

Now that user_events does not honor persist events the dynamic_events
file cannot be easily used to test parsing and matching cases.

Update dyn_test to use the direct ABI file instead of dynamic_events so
that we still have testing coverage until persist events and
dynamic_events file integration has been decided.

Link: https://lkml.kernel.org/r/20230614163336.5797-6-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agoselftests/user_events: Ensure auto cleanup works as expected
Beau Belgrave [Wed, 14 Jun 2023 16:33:34 +0000 (09:33 -0700)]
selftests/user_events: Ensure auto cleanup works as expected

User events now auto cleanup upon the last reference put. Update
ftrace_test to ensure this works as expected. Ensure EBUSY delays
while event is being deleted do not cause transient failures by
waiting and re-attempting.

Link: https://lkml.kernel.org/r/20230614163336.5797-5-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agotracing/user_events: Add auto cleanup and future persist flag
Beau Belgrave [Wed, 14 Jun 2023 16:33:33 +0000 (09:33 -0700)]
tracing/user_events: Add auto cleanup and future persist flag

Currently user events need to be manually deleted via the delete IOCTL
call or via the dynamic_events file. Most operators and processes wish
to have these events auto cleanup when they are no longer used by
anything to prevent them piling without manual maintenance. However,
some operators may not want this, such as pre-registering events via the
dynamic_events tracefs file.

Update user_event_put() to attempt an auto delete of the event if it's
the last reference. The auto delete must run in a work queue to ensure
proper behavior of class->reg() invocations that don't expect the call
to go away from underneath them during the unregister. Add work_struct
to user_event struct to ensure we can do this reliably.

Add a persist flag, that is not yet exposed, to ensure we can toggle
between auto-cleanup and leaving the events existing in the future. When
a non-zero flag is seen during register, return -EINVAL to ensure ABI
is clear for the user processes while we work out the best approach for
persistent events.

Link: https://lkml.kernel.org/r/20230614163336.5797-4-beaub@linux.microsoft.com
Link: https://lore.kernel.org/linux-trace-kernel/20230518093600.3f119d68@rorschach.local.home/
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agotracing/user_events: Track refcount consistently via put/get
Beau Belgrave [Wed, 14 Jun 2023 16:33:32 +0000 (09:33 -0700)]
tracing/user_events: Track refcount consistently via put/get

Various parts of the code today track user_event's refcnt field directly
via a refcount_add/dec. This makes it hard to modify the behavior of the
last reference decrement in all code paths consistently. For example, in
the future we will auto-delete events upon the last reference going
away. This last reference could happen in many places, but we want it to
be consistently handled.

Add user_event_get() and user_event_put() for the add/dec. Update all
places where direct refcounts are being used to utilize these new
functions. In each location pass if event_mutex is locked or not. This
allows us to drop events automatically in future patches clearly. Ensure
when caller states the lock is held, it really is (or is not) held.

Link: https://lkml.kernel.org/r/20230614163336.5797-3-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agotracing/user_events: Store register flags on events
Beau Belgrave [Wed, 14 Jun 2023 16:33:31 +0000 (09:33 -0700)]
tracing/user_events: Store register flags on events

Currently we don't have any available flags for user processes to use to
indicate options for user_events. We will soon have a flag to indicate
the event should or should not auto-delete once it's not being used by
anyone.

Add a reg_flags field to user_events and parameters to existing
functions to allow for this in future patches.

Link: https://lkml.kernel.org/r/20230614163336.5797-2-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agoRDMA/rxe: Fix rxe_cq_post
Bob Pearson [Mon, 12 Jun 2023 15:50:33 +0000 (10:50 -0500)]
RDMA/rxe: Fix rxe_cq_post

A recent patch replaced a tasklet execution of cq->comp_handler by a
direct call. While this made sense it let changes to cq->notify state be
unprotected and assumed that the cq completion machinery and the ulp done
callbacks were reentrant. The result is that in some cases completion
events can be lost. This patch moves the cq->comp_handler call inside of
the spinlock in rxe_cq_post which solves both issues. This is compatible
with the matching code in the request notify verb.

Fixes: 78b26a335310 ("RDMA/rxe: Remove tasklet call from rxe_cq.c")
Link: https://lore.kernel.org/r/20230612155032.17036-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
16 months agotracing/user_events: Remove user_ns walk for groups
Beau Belgrave [Thu, 1 Jun 2023 22:49:28 +0000 (15:49 -0700)]
tracing/user_events: Remove user_ns walk for groups

During discussions it was suggested that user_ns is not a good place to
try to attach a tracing namespace. The current code has stubs to enable
that work that are very likely to change and incur a performance cost.

Remove the user_ns walk when creating a group and determining the system
name to use, since it's unlikely user_ns will be used in the future.

Link: https://lore.kernel.org/all/20230601-urenkel-holzofen-cd9403b9cadd@brauner/
Link: https://lore.kernel.org/linux-trace-kernel/20230601224928.301-1-beaub@linux.microsoft.com
Suggested-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agoselftests/user_events: Add perf self-test for empty arguments events
sunliming [Tue, 6 Jun 2023 06:20:27 +0000 (14:20 +0800)]
selftests/user_events: Add perf self-test for empty arguments events

Tests to ensure events that has empty arguments can input trace record
correctly when using perf.

Link: https://lkml.kernel.org/r/20230606062027.1008398-5-sunliming@kylinos.cn
Acked-by: Beau Belgrave <beaub@linux.microsoft.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agoselftests/user_events: Clear the events after perf self-test
sunliming [Tue, 6 Jun 2023 06:20:26 +0000 (14:20 +0800)]
selftests/user_events: Clear the events after perf self-test

When the self test is completed, perf self-test left the user events not to
be cleared. Clear the events by unregister and delete the event.

Link: https://lkml.kernel.org/r/20230606062027.1008398-4-sunliming@kylinos.cn
Acked-by: Beau Belgrave <beaub@linux.microsoft.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agoselftests/user_events: Add ftrace self-test for empty arguments events
sunliming [Tue, 6 Jun 2023 06:20:25 +0000 (14:20 +0800)]
selftests/user_events: Add ftrace self-test for empty arguments events

Tests to ensure events that has empty arguments can input trace record
correctly when using ftrace.

Link: https://lkml.kernel.org/r/20230606062027.1008398-3-sunliming@kylinos.cn
Acked-by: Beau Belgrave <beaub@linux.microsoft.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agotracing/user_events: Fix the incorrect trace record for empty arguments events
sunliming [Tue, 6 Jun 2023 06:20:24 +0000 (14:20 +0800)]
tracing/user_events: Fix the incorrect trace record for empty arguments events

The user_events support events that has empty arguments. But the trace event
is discarded and not really committed when the arguments is empty. Fix this
by not attempting to copy in zero-length data.

Link: https://lkml.kernel.org/r/20230606062027.1008398-2-sunliming@kylinos.cn
Acked-by: Beau Belgrave <beaub@linux.microsoft.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agotracing: Modify print_fields() for fields output order
sunliming [Thu, 25 May 2023 08:52:32 +0000 (16:52 +0800)]
tracing: Modify print_fields() for fields output order

Now the print_fields() print trace event fields in reverse order. Modify
it to the positive sequence.

Example outputs for a user event:
test0 u32 count1; u32 count2

Output before:
example-2547    [000] .....   325.666387: test0: count2=0x2 (2) count1=0x1 (1)

Output after:
example-2742    [002] .....   429.769370: test0: count1=0x1 (1) count2=0x2 (2)

Link: https://lore.kernel.org/linux-trace-kernel/20230525085232.5096-1-sunliming@kylinos.cn
Fixes: 80a76994b2d88 ("tracing: Add "fields" option to show raw trace event fields")
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agotracing/user_events: Handle matching arguments that is null from dyn_events
sunliming [Mon, 29 May 2023 06:51:10 +0000 (14:51 +0800)]
tracing/user_events: Handle matching arguments that is null from dyn_events

When A registering user event from dyn_events has no argments, it will pass the
matching check, regardless of whether there is a user event with the same name
and arguments. Add the matching check when the arguments of registering user
event is null.

Link: https://lore.kernel.org/linux-trace-kernel/20230529065110.303440-1-sunliming@kylinos.cn
Signed-off-by: sunliming <sunliming@kylinos.cn>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agotracing/user_events: Prevent same name but different args event
sunliming [Mon, 29 May 2023 03:21:00 +0000 (11:21 +0800)]
tracing/user_events: Prevent same name but different args event

User processes register name_args for events. If the same name but different
args event are registered. The trace outputs of second event are printed
as the first event. This is incorrect.

Return EADDRINUSE back to the user process if the same name but different args
event has being registered.

Link: https://lore.kernel.org/linux-trace-kernel/20230529032100.286534-1-sunliming@kylinos.cn
Signed-off-by: sunliming <sunliming@kylinos.cn>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
16 months agobtrfs: scrub: fix a return value overwrite in scrub_stripe()
Qu Wenruo [Wed, 14 Jun 2023 06:49:35 +0000 (14:49 +0800)]
btrfs: scrub: fix a return value overwrite in scrub_stripe()

[RETURN VALUE OVERWRITE]
Inside scrub_stripe(), we would submit all the remaining stripes after
iterating all extents.

But since flush_scrub_stripes() can return error, we need to avoid
overwriting the existing @ret if there is any error.

However the existing check is doing the wrong check:

ret2 = flush_scrub_stripes();
if (!ret2)
ret = ret2;

This would overwrite the existing @ret to 0 as long as the final flush
detects no critical errors.

[FIX]
We should check @ret other than @ret2 in that case.

Fixes: 8eb3dd17eadd ("btrfs: dev-replace: error out if we have unrepaired metadata error during")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
16 months agocifs: add a warning when the in-flight count goes negative
Shyam Prasad N [Fri, 9 Jun 2023 17:46:56 +0000 (17:46 +0000)]
cifs: add a warning when the in-flight count goes negative

We've seen the in-flight count go into negative with some
internal stress testing in Microsoft.

Adding a WARN when this happens, in hope of understanding
why this happens when it happens.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
16 months agocifs: fix lease break oops in xfstest generic/098
Steve French [Sun, 11 Jun 2023 16:23:32 +0000 (11:23 -0500)]
cifs: fix lease break oops in xfstest generic/098

umount can race with lease break so need to check if
tcon->ses->server is still valid to send the lease
break response.

Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Fixes: 59a556aebc43 ("SMB3: drop reference to cfile before sending oplock break")
Signed-off-by: Steve French <stfrench@microsoft.com>
16 months agoDocumentation: RISC-V: patch-acceptance: mention patchwork's role
Conor Dooley [Tue, 6 Jun 2023 06:59:19 +0000 (07:59 +0100)]
Documentation: RISC-V: patch-acceptance: mention patchwork's role

Palmer suggested at some point, not sure if it was in one of the
weekly linux-riscv syncs, or a conversation at FOSDEM, that we
should document the role of the automation running on our patchwork
instance plays in patch acceptance.

Add a short note to the patch-acceptance document to that end.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20230606-rehab-monsoon-12c17bbe08e3@wendy
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
16 months agoselftests: forwarding: hw_stats_l3: Set addrgenmode in a separate step
Danielle Ratson [Mon, 12 Jun 2023 14:34:58 +0000 (16:34 +0200)]
selftests: forwarding: hw_stats_l3: Set addrgenmode in a separate step

Setting the IPv6 address generation mode of a net device during its
creation never worked, but after commit b0ad3c179059 ("rtnetlink: call
validate_linkmsg in rtnl_create_link") it explicitly fails [1]. The
failure is caused by the fact that validate_linkmsg() is called before
the net device is registered, when it still does not have an 'inet6_dev'.

Likewise, raising the net device before setting the address generation
mode is meaningless, because by the time the mode is set, the address
has already been generated.

Therefore, fix the test to first create the net device, then set its
IPv6 address generation mode and finally bring it up.

[1]
 # ip link add name mydev addrgenmode eui64 type dummy
 RTNETLINK answers: Address family not supported by protocol

Fixes: ba95e7930957 ("selftests: forwarding: hw_stats_l3: Add a new test")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/f3b05d85b2bc0c3d6168fe8f7207c6c8365703db.1686580046.git.petrm@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agoMerge branch 'net-sched-fix-race-conditions-in-mini_qdisc_pair_swap'
Paolo Abeni [Wed, 14 Jun 2023 08:31:42 +0000 (10:31 +0200)]
Merge branch 'net-sched-fix-race-conditions-in-mini_qdisc_pair_swap'

Peilin Ye says:

====================
net/sched: Fix race conditions in mini_qdisc_pair_swap()

These 2 patches fix race conditions for ingress and clsact Qdiscs as
reported [1] by syzbot, split out from another [2] series (last 2 patches
of it).  Per-patch changelog omitted.

Patch 1 hasn't been touched since last version; I just included
everybody's tag.

Patch 2 bases on patch 6 v1 of [2], with comments and commit log slightly
changed.  We also need rtnl_dereference() to load ->qdisc_sleeping since
commit d636fc5dd692 ("net: sched: add rcu annotations around
qdisc->qdisc_sleeping"), so I changed that; please take yet another look,
thanks!

Patch 2 has been tested with the new reproducer Pedro posted [3].

[1] https://syzkaller.appspot.com/bug?extid=b53a9c0d1ea4ad62da8b
[2] https://lore.kernel.org/r/cover.1684887977.git.peilin.ye@bytedance.com/
[3] https://lore.kernel.org/r/7879f218-c712-e9cc-57ba-665990f5f4c9@mojatatu.com/
====================

Link: https://lore.kernel.org/r/cover.1686355297.git.peilin.ye@bytedance.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agonet/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting
Peilin Ye [Sun, 11 Jun 2023 03:30:25 +0000 (20:30 -0700)]
net/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting

mini_Qdisc_pair::p_miniq is a double pointer to mini_Qdisc, initialized
in ingress_init() to point to net_device::miniq_ingress.  ingress Qdiscs
access this per-net_device pointer in mini_qdisc_pair_swap().  Similar
for clsact Qdiscs and miniq_egress.

Unfortunately, after introducing RTNL-unlocked RTM_{NEW,DEL,GET}TFILTER
requests (thanks Hillf Danton for the hint), when replacing ingress or
clsact Qdiscs, for example, the old Qdisc ("@old") could access the same
miniq_{in,e}gress pointer(s) concurrently with the new Qdisc ("@new"),
causing race conditions [1] including a use-after-free bug in
mini_qdisc_pair_swap() reported by syzbot:

 BUG: KASAN: slab-use-after-free in mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573
 Write of size 8 at addr ffff888045b31308 by task syz-executor690/14901
...
 Call Trace:
  <TASK>
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
  print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:319
  print_report mm/kasan/report.c:430 [inline]
  kasan_report+0x11c/0x130 mm/kasan/report.c:536
  mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573
  tcf_chain_head_change_item net/sched/cls_api.c:495 [inline]
  tcf_chain0_head_change.isra.0+0xb9/0x120 net/sched/cls_api.c:509
  tcf_chain_tp_insert net/sched/cls_api.c:1826 [inline]
  tcf_chain_tp_insert_unique net/sched/cls_api.c:1875 [inline]
  tc_new_tfilter+0x1de6/0x2290 net/sched/cls_api.c:2266
...

@old and @new should not affect each other.  In other words, @old should
never modify miniq_{in,e}gress after @new, and @new should not update
@old's RCU state.

Fixing without changing sch_api.c turned out to be difficult (please
refer to Closes: for discussions).  Instead, make sure @new's first call
always happen after @old's last call (in {ingress,clsact}_destroy()) has
finished:

In qdisc_graft(), return -EBUSY if @old has any ongoing filter requests,
and call qdisc_destroy() for @old before grafting @new.

Introduce qdisc_refcount_dec_if_one() as the counterpart of
qdisc_refcount_inc_nz() used for filter requests.  Introduce a
non-static version of qdisc_destroy() that does a TCQ_F_BUILTIN check,
just like qdisc_put() etc.

Depends on patch "net/sched: Refactor qdisc_graft() for ingress and
clsact Qdiscs".

[1] To illustrate, the syzkaller reproducer adds ingress Qdiscs under
TC_H_ROOT (no longer possible after commit c7cfbd115001 ("net/sched:
sch_ingress: Only create under TC_H_INGRESS")) on eth0 that has 8
transmission queues:

  Thread 1 creates ingress Qdisc A (containing mini Qdisc a1 and a2),
  then adds a flower filter X to A.

  Thread 2 creates another ingress Qdisc B (containing mini Qdisc b1 and
  b2) to replace A, then adds a flower filter Y to B.

 Thread 1               A's refcnt   Thread 2
  RTM_NEWQDISC (A, RTNL-locked)
   qdisc_create(A)               1
   qdisc_graft(A)                9

  RTM_NEWTFILTER (X, RTNL-unlocked)
   __tcf_qdisc_find(A)          10
   tcf_chain0_head_change(A)
   mini_qdisc_pair_swap(A) (1st)
            |
            |                         RTM_NEWQDISC (B, RTNL-locked)
         RCU sync                2     qdisc_graft(B)
            |                    1     notify_and_destroy(A)
            |
   tcf_block_release(A)          0    RTM_NEWTFILTER (Y, RTNL-unlocked)
   qdisc_destroy(A)                    tcf_chain0_head_change(B)
   tcf_chain0_head_change_cb_del(A)    mini_qdisc_pair_swap(B) (2nd)
   mini_qdisc_pair_swap(A) (3rd)                |
           ...                                 ...

Here, B calls mini_qdisc_pair_swap(), pointing eth0->miniq_ingress to
its mini Qdisc, b1.  Then, A calls mini_qdisc_pair_swap() again during
ingress_destroy(), setting eth0->miniq_ingress to NULL, so ingress
packets on eth0 will not find filter Y in sch_handle_ingress().

This is just one of the possible consequences of concurrently accessing
miniq_{in,e}gress pointers.

Fixes: 7a096d579e8e ("net: sched: ingress: set 'unlocked' flag for Qdisc ops")
Fixes: 87f373921c4e ("net: sched: ingress: set 'unlocked' flag for clsact Qdisc ops")
Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/
Cc: Hillf Danton <hdanton@sina.com>
Cc: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agonet/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs
Peilin Ye [Sun, 11 Jun 2023 03:30:15 +0000 (20:30 -0700)]
net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs

Grafting ingress and clsact Qdiscs does not need a for-loop in
qdisc_graft().  Refactor it.  No functional changes intended.

Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agonet/sched: act_ct: Fix promotion of offloaded unreplied tuple
Paul Blakey [Fri, 9 Jun 2023 12:22:59 +0000 (15:22 +0300)]
net/sched: act_ct: Fix promotion of offloaded unreplied tuple

Currently UNREPLIED and UNASSURED connections are added to the nf flow
table. This causes the following connection packets to be processed
by the flow table which then skips conntrack_in(), and thus such the
connections will remain UNREPLIED and UNASSURED even if reply traffic
is then seen. Even still, the unoffloaded reply packets are the ones
triggering hardware update from new to established state, and if
there aren't any to triger an update and/or previous update was
missed, hardware can get out of sync with sw and still mark
packets as new.

Fix the above by:
1) Not skipping conntrack_in() for UNASSURED packets, but still
   refresh for hardware, as before the cited patch.
2) Try and force a refresh by reply-direction packets that update
   the hardware rules from new to established state.
3) Remove any bidirectional flows that didn't failed to update in
   hardware for re-insertion as bidrectional once any new packet
   arrives.

Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections")
Co-developed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/1686313379-117663-1-git-send-email-paulb@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agowifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regression
Hugh Dickins [Fri, 9 Jun 2023 21:29:39 +0000 (14:29 -0700)]
wifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regression

Lockdep on 6.4-rc on ThinkPad X1 Carbon 5th says
=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
6.4.0-rc5 #1 Not tainted
-----------------------------------------------------
kworker/3:1/49 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire:
ffff8881066fa368 (&mvm_sta->deflink.lq_sta.rs_drv.pers.lock){+.+.}-{2:2}, at: rs_drv_get_rate+0x46/0xe7

and this task is already holding:
ffff8881066f80a8 (&sta->rate_ctrl_lock){+.-.}-{2:2}, at: rate_control_get_rate+0xbd/0x126
which would create a new lock dependency:
 (&sta->rate_ctrl_lock){+.-.}-{2:2} -> (&mvm_sta->deflink.lq_sta.rs_drv.pers.lock){+.+.}-{2:2}

but this new dependency connects a SOFTIRQ-irq-safe lock:
 (&sta->rate_ctrl_lock){+.-.}-{2:2}
etc. etc. etc.

Changing the spin_lock() in rs_drv_get_rate() to spin_lock_bh() was not
enough to pacify lockdep, but changing them all on pers.lock has worked.

Fixes: a8938bc881d2 ("wifi: iwlwifi: mvm: Add locking to the rate read flow")
Signed-off-by: Hugh Dickins <hughd@google.com>
Link: https://lore.kernel.org/r/79ffcc22-9775-cb6d-3ffd-1a517c40beef@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
16 months agoMerge branch 'fix-small-bugs-and-annoyances-in-tc-testing'
Jakub Kicinski [Wed, 14 Jun 2023 03:49:16 +0000 (20:49 -0700)]
Merge branch 'fix-small-bugs-and-annoyances-in-tc-testing'

Vlad Buslov says:

====================
Fix small bugs and annoyances in tc-testing
====================

Link: https://lore.kernel.org/r/20230612075712.2861848-1-vladbu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests/tc-testing: Remove configs that no longer exist
Vlad Buslov [Mon, 12 Jun 2023 07:57:12 +0000 (09:57 +0200)]
selftests/tc-testing: Remove configs that no longer exist

Some qdiscs and classifiers have recently been retired from kernel.
However, tc-testing config is still cluttered with them which causes noise
when using merge_config.sh script to update existing config for tc-testing
compatibility. Remove the config settings for affected qdiscs and
classifiers.

Fixes: fb38306ceb9e ("net/sched: Retire ATM qdisc")
Fixes: 051d44209842 ("net/sched: Retire CBQ qdisc")
Fixes: bbe77c14ee61 ("net/sched: Retire dsmark qdisc")
Fixes: 265b4da82dbf ("net/sched: Retire rsvp classifier")
Fixes: 8c710f75256b ("net/sched: Retire tcindex classifier")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests/tc-testing: Fix SFB db test
Vlad Buslov [Mon, 12 Jun 2023 07:57:11 +0000 (09:57 +0200)]
selftests/tc-testing: Fix SFB db test

Setting very small value of db like 10ms introduces rounding errors when
converting to/from jiffies on some kernel configs. For example, on 250hz
the actual value will be set to 12ms which causes the test to fail:

 # $ sudo ./tdc.py  -d eth2 -e 3410
 #  -- ns/SubPlugin.__init__
 # Test 3410: Create SFB with db setting
 #
 # All test results:
 #
 # 1..1
 # not ok 1 3410 - Create SFB with db setting
 #         Could not match regex pattern. Verify command output:
 # qdisc sfb 1: root refcnt 2 rehash 600s db 12ms limit 1000p max 25p target 20p increment 0.000503548 decrement 4.57771e-05 penalty_rate 10pps penalty_burst 20p

Set the value to 100ms instead which currently seem to work on 100hz,
250hz, 300hz and 1000hz kernel configs.

Fixes: 6ad92dc56fca ("selftests/tc-testing: add selftests for sfb qdisc")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests/tc-testing: Fix Error: failed to find target LOG
Vlad Buslov [Mon, 12 Jun 2023 07:57:10 +0000 (09:57 +0200)]
selftests/tc-testing: Fix Error: failed to find target LOG

Add missing netfilter config dependency.

Fixes following example error when running tests via tdc.sh for all XT
tests:

 # $ sudo ./tdc.py -d eth2 -e 2029
 # Test 2029: Add xt action with log-prefix
 # exit: 255
 # exit: 0
 #  failed to find target LOG
 #
 # bad action parsing
 # parse_action: bad value (7:xt)!
 # Illegal "action"
 #
 # -----> teardown stage *** Could not execute: "$TC actions flush action xt"
 #
 # -----> teardown stage *** Error message: "Error: Cannot flush unknown TC action.
 # We have an error flushing
 # "
 # returncode 1; expected [0]
 #
 # -----> teardown stage *** Aborting test run.
 #
 # <_io.BufferedReader name=3> *** stdout ***
 #
 # <_io.BufferedReader name=5> *** stderr ***
 # "-----> teardown stage" did not complete successfully
 # Exception <class '__main__.PluginMgrTestFail'> ('teardown', ' failed to find target LOG\n\nbad action parsing\nparse_action: bad value (7:xt)!\nIllegal "action"\n', '"-----> teardown stage" did not complete successfully') (caught in test_runner, running test 2 2029 Add xt action with log-prefix stage teardown)
 # ---------------
 # traceback
 #   File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 495, in test_runner
 #     res = run_one_test(pm, args, index, tidx)
 #   File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 434, in run_one_test
 #     prepare_env(args, pm, 'teardown', '-----> teardown stage', tidx['teardown'], procout)
 #   File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 245, in prepare_env
 #     raise PluginMgrTestFail(
 # ---------------
 # accumulated output for this test:
 #  failed to find target LOG
 #
 # bad action parsing
 # parse_action: bad value (7:xt)!
 # Illegal "action"
 #
 # ---------------
 #
 # All test results:
 #
 # 1..1
 # ok 1 2029 - Add xt action with log-prefix # skipped - "-----> teardown stage" did not complete successfully

Fixes: 910d504bc187 ("selftests/tc-testings: add selftests for xt action")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests/tc-testing: Fix Error: Specified qdisc kind is unknown.
Vlad Buslov [Mon, 12 Jun 2023 07:57:09 +0000 (09:57 +0200)]
selftests/tc-testing: Fix Error: Specified qdisc kind is unknown.

All TEQL tests assume that sch_teql module is loaded. Load module in tdc.sh
before running qdisc tests.

Fixes following example error when running tests via tdc.sh for all TEQL
tests:

 # $ sudo ./tdc.py -d eth2 -e 84a0
 #  -- ns/SubPlugin.__init__
 # Test 84a0: Create TEQL with default setting
 # exit: 2
 # exit: 0
 # Error: Specified qdisc kind is unknown.
 #
 # -----> teardown stage *** Could not execute: "$TC qdisc del dev $DUMMY handle 1: root"
 #
 # -----> teardown stage *** Error message: "Error: Invalid handle.
 # "
 # returncode 2; expected [0]
 #
 # -----> teardown stage *** Aborting test run.
 #
 # <_io.BufferedReader name=3> *** stdout ***
 #
 # <_io.BufferedReader name=5> *** stderr ***
 # "-----> teardown stage" did not complete successfully
 # Exception <class '__main__.PluginMgrTestFail'> ('teardown', 'Error: Specified qdisc kind is unknown.\n', '"-----> teardown stage" did not complete successfully') (caught in test_runner, running test 2 84a0 Create TEQL with default setting stage teardown)
 # ---------------
 # traceback
 #   File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 495, in test_runner
 #     res = run_one_test(pm, args, index, tidx)
 #   File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 434, in run_one_test
 #     prepare_env(args, pm, 'teardown', '-----> teardown stage', tidx['teardown'], procout)
 #   File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 245, in prepare_env
 #     raise PluginMgrTestFail(
 # ---------------
 # accumulated output for this test:
 # Error: Specified qdisc kind is unknown.
 #
 # ---------------
 #
 # All test results:
 #
 # 1..1
 # ok 1 84a0 - Create TEQL with default setting # skipped - "-----> teardown stage" did not complete successfully

Fixes: cc62fbe114c9 ("selftests/tc-testing: add selftests for teql qdisc")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet: ethernet: ti: am65-cpsw: Call of_node_put() on error path
Dan Carpenter [Mon, 12 Jun 2023 07:18:50 +0000 (10:18 +0300)]
net: ethernet: ti: am65-cpsw: Call of_node_put() on error path

This code returns directly but it should instead call of_node_put()
to drop some reference counts.

Fixes: dab2b265dd23 ("net: ethernet: ti: am65-cpsw: Add support for SERDES configuration")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Link: https://lore.kernel.org/r/e3012f0c-1621-40e6-bf7d-03c276f6e07f@kili.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoio_uring/net: save msghdr->msg_control for retries
Jens Axboe [Mon, 12 Jun 2023 19:51:36 +0000 (13:51 -0600)]
io_uring/net: save msghdr->msg_control for retries

If the application sets ->msg_control and we have to later retry this
command, or if it got queued with IOSQE_ASYNC to begin with, then we
need to retain the original msg_control value. This is due to the net
stack overwriting this field with an in-kernel pointer, to copy it
in. Hitting that path for the second time will now fail the copy from
user, as it's attempting to copy from a non-user address.

Cc: stable@vger.kernel.org # 5.10+
Link: https://github.com/axboe/liburing/issues/880
Reported-and-tested-by: Marek Majkowski <marek@cloudflare.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
16 months agoMerge tag 'nios2_fix_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen...
Linus Torvalds [Wed, 14 Jun 2023 00:00:33 +0000 (17:00 -0700)]
Merge tag 'nios2_fix_v6.4' of git://git./linux/kernel/git/dinguyen/linux

Pull NIOS2 dts fix from Dinh Nguyen:

 - Fix tse_mac "max-frame-size" property

* tag 'nios2_fix_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  nios2: dts: Fix tse_mac "max-frame-size" property

16 months agonios2: dts: Fix tse_mac "max-frame-size" property
Janne Grunau [Sun, 12 Feb 2023 12:16:32 +0000 (13:16 +0100)]
nios2: dts: Fix tse_mac "max-frame-size" property

The given value of 1518 seems to refer to the layer 2 ethernet frame
size without 802.1Q tag. Actual use of the "max-frame-size" including in
the consumer of the "altr,tse-1.0" compatible is the MTU.

Fixes: 95acd4c7b69c ("nios2: Device tree support")
Fixes: 61c610ec61bb ("nios2: Add Max10 device tree")
Cc: <stable@vger.kernel.org>
Signed-off-by: Janne Grunau <j@jannau.net>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
16 months agodrm/amd/display: limit DPIA link rate to HBR3
Peichen Huang [Wed, 31 May 2023 05:36:14 +0000 (13:36 +0800)]
drm/amd/display: limit DPIA link rate to HBR3

[Why]
DPIA doesn't support UHBR, driver should not enable UHBR
for dp tunneling

[How]
limit DPIA link rate to HBR3

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Acked-by: Stylon Wang <stylon.wang@amd.com>
Signed-off-by: Peichen Huang <peichen.huang@amd.com>
Reviewed-by: Mustapha Ghaddar <Mustapha.Ghaddar@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
16 months agodrm/amd/display: fix the system hang while disable PSR
Tom Chung [Mon, 29 May 2023 10:00:09 +0000 (18:00 +0800)]
drm/amd/display: fix the system hang while disable PSR

[Why]
When the PSR enabled. If you try to adjust the timing parameters,
it may cause system hang. Because the timing mismatch with the
DMCUB settings.

[How]
Disable the PSR before adjusting timing parameters.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Acked-by: Stylon Wang <stylon.wang@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
16 months agodrm/amd/display: edp do not add non-edid timings
Hersen Wu [Thu, 25 May 2023 12:37:40 +0000 (08:37 -0400)]
drm/amd/display: edp do not add non-edid timings

[Why] most edp support only timings from edid. applying
non-edid timings, especially those timings out of edp
bandwidth, may damage edp.

[How] do not add non-edid timings for edp.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Acked-by: Stylon Wang <stylon.wang@amd.com>
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
16 months agoRevert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar system"
Arunpravin Paneer Selvam [Fri, 9 Jun 2023 13:18:41 +0000 (06:18 -0700)]
Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar system"

This reverts commit c105518679b6e87232874ffc989ec403bee59664.

This patch disables the TOPDOWN flag for APU and few dGPU cards
which has the VRAM size equal to the BAR size.

When we enable the TOPDOWN flag, we get the free blocks at
the highest available memory region and we don't split the
lower order blocks. This change is required to keep off
the fragmentation related issues particularly in ASIC
which has VRAM space <= 500MiB

Hence, we are reverting this patch.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2270
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
16 months agodrm/amdgpu: vcn_4_0 set instance 0 init sched score to 1
Sonny Jiang [Tue, 6 Jun 2023 21:18:52 +0000 (17:18 -0400)]
drm/amdgpu: vcn_4_0 set instance 0 init sched score to 1

Only vcn0 can process AV1 codecx. In order to use both vcn0 and
vcn1 in h264/265 transcode to AV1 cases, set vcn0 sched score to 1
at initialization time.

Signed-off-by: Sonny Jiang <sonjiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
16 months agodrm/radeon: Disable outputs when releasing fbdev client
Thomas Zimmermann [Fri, 9 Jun 2023 14:03:56 +0000 (16:03 +0200)]
drm/radeon: Disable outputs when releasing fbdev client

Disable the modesetting pipeline before release the radeon's fbdev
client. Fixes the following error:

[   17.217408] WARNING: CPU: 5 PID: 1464 at drivers/gpu/drm/ttm/ttm_bo.c:326 ttm_bo_release+0x27e/0x2d0 [ttm]
[   17.217418] Modules linked in: edac_mce_amd radeon(+) drm_ttm_helper ttm video drm_suballoc_helper drm_display_helper kvm irqbypass drm_kms_helper syscopyarea crc32_pclmul sysfillrect sha512_ssse3 sysimgblt sha512_generic cfbfillrect cfbimgblt wmi_bmof aesni_intel cfbcopyarea crypto_simd cryptd k10temp acpi_cpufreq wmi dm_mod
[   17.217432] CPU: 5 PID: 1464 Comm: systemd-udevd Not tainted 6.4.0-rc4+ #1
[   17.217436] Hardware name: Micro-Star International Co., Ltd. MS-7A38/B450M PRO-VDH MAX (MS-7A38), BIOS B.G0 07/26/2022
[   17.217438] RIP: 0010:ttm_bo_release+0x27e/0x2d0 [ttm]
[   17.217444] Code: 48 89 43 38 48 89 43 40 48 8b 5c 24 30 48 8b b5 40 08 00 00 48 8b 6c 24 38 48 83 c4 58 e9 7a 49 f7 e0 48 89 ef e9 6c fe ff ff <0f> 0b 48 83 7b 20 00 0f 84 b7 fd ff ff 0f 0b 0f 1f 00 e9 ad fd ff
[   17.217448] RSP: 0018:ffffc9000095fbb0 EFLAGS: 00010202
[   17.217451] RAX: 0000000000000001 RBX: ffff8881052c8de0 RCX: 0000000000000000
[   17.217453] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881052c8de0
[   17.217455] RBP: ffff888104a66e00 R08: ffff8881052c8de0 R09: ffff888104a7cf08
[   17.217457] R10: ffffc9000095fbe0 R11: ffffc9000095fbe8 R12: ffff8881052c8c78
[   17.217458] R13: ffff8881052c8c78 R14: dead000000000100 R15: ffff88810528b108
[   17.217460] FS:  00007f319fcbb8c0(0000) GS:ffff88881a540000(0000) knlGS:0000000000000000
[   17.217463] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   17.217464] CR2: 000055dc8b0224a0 CR3: 000000010373d000 CR4: 0000000000750ee0
[   17.217466] PKRU: 55555554
[   17.217468] Call Trace:
[   17.217470]  <TASK>
[   17.217472]  ? __warn+0x97/0x160
[   17.217476]  ? ttm_bo_release+0x27e/0x2d0 [ttm]
[   17.217481]  ? report_bug+0x1ec/0x200
[   17.217487]  ? handle_bug+0x3c/0x70
[   17.217490]  ? exc_invalid_op+0x1f/0x90
[   17.217493]  ? preempt_count_sub+0xb5/0x100
[   17.217496]  ? asm_exc_invalid_op+0x16/0x20
[   17.217500]  ? ttm_bo_release+0x27e/0x2d0 [ttm]
[   17.217505]  ? ttm_resource_move_to_lru_tail+0x1ab/0x1d0 [ttm]
[   17.217511]  radeon_bo_unref+0x1a/0x30 [radeon]
[   17.217547]  radeon_gem_object_free+0x20/0x30 [radeon]
[   17.217579]  radeon_fbdev_fb_destroy+0x57/0x90 [radeon]
[   17.217616]  unregister_framebuffer+0x72/0x110
[   17.217620]  drm_client_dev_unregister+0x6d/0xe0
[   17.217623]  drm_dev_unregister+0x2e/0x90
[   17.217626]  drm_put_dev+0x26/0x90
[   17.217628]  pci_device_remove+0x44/0xc0
[   17.217631]  really_probe+0x257/0x340
[   17.217635]  __driver_probe_device+0x73/0x120
[   17.217638]  driver_probe_device+0x2c/0xb0
[   17.217641]  __driver_attach+0xa0/0x150
[   17.217643]  ? __pfx___driver_attach+0x10/0x10
[   17.217646]  bus_for_each_dev+0x67/0xa0
[   17.217649]  bus_add_driver+0x10e/0x210
[   17.217651]  driver_register+0x5c/0x120
[   17.217653]  ? __pfx_radeon_module_init+0x10/0x10 [radeon]
[   17.217681]  do_one_initcall+0x44/0x220
[   17.217684]  ? kmalloc_trace+0x37/0xc0
[   17.217688]  do_init_module+0x64/0x240
[   17.217691]  __do_sys_finit_module+0xb2/0x100
[   17.217694]  do_syscall_64+0x3b/0x90
[   17.217697]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[   17.217700] RIP: 0033:0x7f319feaa5a9
[   17.217702] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 08 0d 00 f7 d8 64 89 01 48
[   17.217706] RSP: 002b:00007ffc6bf3e7f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[   17.217709] RAX: ffffffffffffffda RBX: 00005607204f3170 RCX: 00007f319feaa5a9
[   17.217710] RDX: 0000000000000000 RSI: 00007f31a002eefd RDI: 0000000000000018
[   17.217712] RBP: 00007f31a002eefd R08: 0000000000000000 R09: 00005607204f1860
[   17.217714] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000020000
[   17.217716] R13: 0000000000000000 R14: 0000560720522450 R15: 0000560720255899
[   17.217718]  </TASK>
[   17.217719] ---[ end trace 0000000000000000 ]---

The buffer object backing the fbdev emulation got pinned twice: by the
fb_probe helper radeon_fbdev_create_pinned_object() and the modesetting
code when the framebuffer got displayed. It only got unpinned once by
the fbdev helper radeon_fbdev_destroy_pinned_object(). Hence TTM's BO-
release function complains about the pin counter. Forcing the outputs
off also undoes the modesettings pin increment.

Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Reported-by: Borislav Petkov <bp@alien8.de>
Closes: https://lore.kernel.org/dri-devel/20230603174814.GCZHt83pN+wNjf63sC@fat_crate.local/
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: e317a69fe891 ("drm/radeon: Implement client-based fbdev emulation")
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
16 months agodrm/amd/pm: workaround for compute workload type on some skus
Kenneth Feng [Thu, 8 Jun 2023 14:07:11 +0000 (22:07 +0800)]
drm/amd/pm: workaround for compute workload type on some skus

On smu 13.0.0, the compute workload type cannot be set on all the skus
due to some other problems. This workaround is to make sure compute workload type
can also run on some specific skus.

v2: keep the variable consistent

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
16 months agodrm/amd: Tighten permissions on VBIOS flashing attributes
Mario Limonciello [Wed, 7 Jun 2023 06:45:20 +0000 (01:45 -0500)]
drm/amd: Tighten permissions on VBIOS flashing attributes

Non-root users shouldn't be able to try to trigger a VBIOS flash
or query the flashing status.  This should be reserved for users with the
appropriate permissions.

Cc: stable@vger.kernel.org
Fixes: 8424f2ccb3c0 ("drm/amdgpu/psp: Add vbflash sysfs interface support")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
16 months agodrm/amd: Make sure image is written to trigger VBIOS image update flow
Mario Limonciello [Wed, 7 Jun 2023 06:41:22 +0000 (01:41 -0500)]
drm/amd: Make sure image is written to trigger VBIOS image update flow

The VBIOS image update flow requires userspace to:
1) Write the image to `psp_vbflash`
2) Read `psp_vbflash`
3) Poll `psp_vbflash_status` to check for completion

If userspace reads `psp_vbflash` before writing an image, it's
possible that it causes problems that can put the dGPU into an invalid
state.

Explicitly check that an image has been written before letting a read
succeed.

Cc: stable@vger.kernel.org
Fixes: 8424f2ccb3c0 ("drm/amdgpu/psp: Add vbflash sysfs interface support")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
16 months agodrm/amdgpu: add missing radeon secondary PCI ID
Alex Deucher [Tue, 6 Jun 2023 15:14:04 +0000 (11:14 -0400)]
drm/amdgpu: add missing radeon secondary PCI ID

0x5b70 is a missing RV370 secondary id.  Add it so
we don't try and probe it with amdgpu.

Cc: michel@daenzer.net
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
16 months agodrm/amdgpu: Implement gfx9 patch functions for resubmission
Jiadong Zhu [Thu, 25 May 2023 10:42:15 +0000 (18:42 +0800)]
drm/amdgpu: Implement gfx9 patch functions for resubmission

Patch the packages including CONTEXT_CONTROL and WRITE_DATA for gfx9
during the resubmission scenario.

Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.3.x
16 months agodrm/amdgpu: Modify indirect buffer packages for resubmission
Jiadong Zhu [Thu, 25 May 2023 08:52:55 +0000 (16:52 +0800)]
drm/amdgpu: Modify indirect buffer packages for resubmission

When the preempted IB frame resubmitted to cp, we need to modify the frame
data including:
1. set PRE_RESUME 1 in CONTEXT_CONTROL.
2. use meta data(DE and CE) read from CSA in WRITE_DATA.

Add functions to save the location the first time IBs emitted and callback
to patch the package when resubmission happens.

Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.3.x
16 months agodrm/amdgpu: Program gds backup address as zero if no gds allocated
Jiadong Zhu [Wed, 24 May 2023 08:51:32 +0000 (16:51 +0800)]
drm/amdgpu: Program gds backup address as zero if no gds allocated

It is firmware requirement to set gds_backup_addrlo and gds_backup_addrhi
of DE meta both zero if no gds partition is allocated for the frame.

Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.3.x
16 months agodrm/nouveau: add nv_encoder pointer check for NULL
Natalia Petrova [Fri, 12 May 2023 10:33:20 +0000 (13:33 +0300)]
drm/nouveau: add nv_encoder pointer check for NULL

Pointer nv_encoder could be dereferenced at nouveau_connector.c
in case it's equal to NULL by jumping to goto label.
This patch adds a NULL-check to avoid it.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 3195c5f9784a ("drm/nouveau: set encoder for lvds")
Signed-off-by: Natalia Petrova <n.petrova@fintech.ru>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[Fixed patch title]
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230512103320.82234-1-n.petrova@fintech.ru
16 months agodrm/amdgpu: Reset CP_VMID_PREEMPT after trailing fence signaled
Jiadong Zhu [Wed, 24 May 2023 03:42:19 +0000 (11:42 +0800)]
drm/amdgpu: Reset CP_VMID_PREEMPT after trailing fence signaled

When MEC executes unmap_queue for mid command buffer preemption, it will
kick the write pointer of the gfx ring, set CP_VMID_PREEMPT to trigger the
preemption and wait for CP_VMID_PREEMPT becomes zero after the preemption
done. There is a race condition that PFP may excute the resetting command
before MEC set CP_VMID_PREEMPT. As a result, hang happens as
CP_VMID_PREEMPT is always 0xffff.

To avoid this, we send resetting CP_VMID_PREEMPT command after the trailing
fence is siganled and update gfx write pointer explicitly.

Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.3.x
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2535
16 months agodrm/nouveau/dp: check for NULL nv_connector->native_mode
Natalia Petrova [Fri, 12 May 2023 11:15:26 +0000 (14:15 +0300)]
drm/nouveau/dp: check for NULL nv_connector->native_mode

Add checking for NULL before calling nouveau_connector_detect_depth() in
nouveau_connector_get_modes() function because nv_connector->native_mode
could be dereferenced there since connector pointer passed to
nouveau_connector_detect_depth() and the same value of
nv_connector->native_mode is used there.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: d4c2c99bdc83 ("drm/nouveau/dp: remove broken display depth function, use the improved one")

Signed-off-by: Natalia Petrova <n.petrova@fintech.ru>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230512111526.82408-1-n.petrova@fintech.ru
16 months agospi: dw: Replace incorrect spi_get_chipselect with set
Abe Kohandel [Tue, 13 Jun 2023 16:21:03 +0000 (09:21 -0700)]
spi: dw: Replace incorrect spi_get_chipselect with set

Commit 445164e8c136 ("spi: dw: Replace spi->chip_select references with
function calls") replaced direct access to spi.chip_select with
spi_*_chipselect calls but incorrectly replaced a set instance with a
get instance, replace the incorrect instance.

Fixes: 445164e8c136 ("spi: dw: Replace spi->chip_select references with function calls")
Signed-off-by: Abe Kohandel <abe.kohandel@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20230613162103.569812-1-abe.kohandel@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
16 months agodrm/bridge: ti-sn65dsi86: Avoid possible buffer overflow
Su Hui [Thu, 8 Jun 2023 01:24:43 +0000 (09:24 +0800)]
drm/bridge: ti-sn65dsi86: Avoid possible buffer overflow

Smatch error:buffer overflow 'ti_sn_bridge_refclk_lut' 5 <= 5.

Fixes: cea86c5bb442 ("drm/bridge: ti-sn65dsi86: Implement the pwm_chip")
Signed-off-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230608012443.839372-1-suhui@nfschina.com
16 months agoMerge tag 'devicetree-fixes-for-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 13 Jun 2023 17:27:56 +0000 (10:27 -0700)]
Merge tag 'devicetree-fixes-for-6.4-3' of git://git./linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:

 - Fix missing of_node_put() in init_overlay_changeset()

 - Fix schema for qcom,pmic-mpp "qcom,paired" property

 - Fix 'additionalProperties' in silvaco,i3c-master binding

 - usage-model.rst: Use documented "arm,primecell" compatible string

 - Update Damien Le Moal's email address

 - Fixes in Realtek Bluetooth binding

* tag 'devicetree-fixes-for-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: pinctrl: qcom,pmic-mpp: Fix schema for "qcom,paired"
  dt-bindings: i3c: silvaco,i3c-master: fix missing schema restriction
  of: overlay: Fix missing of_node_put() in error case of init_overlay_changeset()
  docs: zh_CN/devicetree: sync usage-model fix
  docs: dt: fix documented Primecell compatible string
  dt-bindings: Change Damien Le Moal's contact email
  dt-bindings: net: realtek-bluetooth: Fix double RTL8723CS in desc
  dt-bindings: net: realtek-bluetooth: Fix RTL8821CS binding

16 months agodt-bindings: pinctrl: qcom,pmic-mpp: Fix schema for "qcom,paired"
Rob Herring [Tue, 18 Apr 2023 15:06:06 +0000 (10:06 -0500)]
dt-bindings: pinctrl: qcom,pmic-mpp: Fix schema for "qcom,paired"

The "qcom,paired" schema is all wrong. First, it's a list rather than an
object(dictionary). Second, it is missing a required type. The meta-schema
normally catches this, but schemas under "$defs" was not getting checked.
A fix for that is pending.

Fixes: f9a06b810951 ("dt-bindings: pinctrl: qcom,pmic-mpp: Convert qcom pmic mpp bindings to YAML")
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230418150606.1528107-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
16 months agoregmap: regcache: Don't sync read-only registers
Takashi Iwai [Tue, 13 Jun 2023 11:22:40 +0000 (13:22 +0200)]
regmap: regcache: Don't sync read-only registers

regcache_maple_sync() tries to sync all cached values no matter
whether it's writable or not.  OTOH, regache_sync_val() does care the
wrtability and returns -EIO for a read-only register.  This results in
an error message like:
  snd_hda_codec_realtek hdaudioC0D0: Unable to sync register 0x2f0009. -5
and the sync loop is aborted incompletely.

This patch adds the writable register check to regcache_sync_val() for
addressing the bug above.

Note that, although we may add the check in the caller side
(regcache_maple_sync()), here we put in regcache_sync_val(), so that a
similar case like this can be avoided in future.

Fixes: f033c26de5a5 ("regmap: Add maple tree based register cache")
Link: https://lore.kernel.org/r/877cs7g6f1.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20230613112240.3361-1-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
16 months agoASoC: tegra: Fix Master Volume Control
Jon Hunter [Tue, 13 Jun 2023 09:34:53 +0000 (10:34 +0100)]
ASoC: tegra: Fix Master Volume Control

Commit 3ed2b549b39f ("ALSA: pcm: fix wait_time calculations") corrected
the PCM wait_time calculations and in doing so reduced the calculated
wait_time. This exposed an issue with the Tegra Master Volume Control
(MVC) device where the reduced wait_time caused the MVC to fail. For now
fix this by setting the default wait_time for Tegra to be 500ms.

Fixes: 3ed2b549b39f ("ALSA: pcm: fix wait_time calculations")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20230613093453.13927-1-jonathanh@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
16 months agotty: serial: fsl_lpuart: reduce RX watermark to 0 on LS1028A
Robert Hodaszi [Fri, 9 Jun 2023 12:13:34 +0000 (14:13 +0200)]
tty: serial: fsl_lpuart: reduce RX watermark to 0 on LS1028A

LS1028A is using DMA with LPUART. Having RX watermark set to 1, means
DMA transactions are started only after receiving the second character.

On other platforms with newer LPUART IP, Receiver Idle Empty function
initiates the DMA request after the receiver is idling for 4 characters.
But this feature is missing on LS1028A, which is causing a 1-character
delay in the RX direction on this platform.

Set RX watermark to 0 to initiate RX DMA after each character.

Link: https://lore.kernel.org/linux-serial/20230607103459.1222426-1-robert.hodaszi@digi.com/
Fixes: 9ad9df844754 ("tty: serial: fsl_lpuart: Fix the wrong RXWATER setting for rx dma case")
Cc: stable <stable@kernel.org>
Signed-off-by: Robert Hodaszi <robert.hodaszi@digi.com>
Message-ID: <20230609121334.1878626-1-robert.hodaszi@digi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agodrm/nouveau: don't detect DSM for non-NVIDIA device
Ratchanan Srirattanamet [Tue, 23 May 2023 21:11:56 +0000 (04:11 +0700)]
drm/nouveau: don't detect DSM for non-NVIDIA device

The call site of nouveau_dsm_pci_probe() uses single set of output
variables for all invocations. So, we must not write anything to them
unless it's an NVIDIA device. Otherwise, if we are called with another
device after the NVIDIA device, we'll clober the result of the NVIDIA
device.

For example, if the other device doesn't have _PR3 resources, the
detection later would miss the presence of power resource support, and
the rest of the code will keep using Optimus DSM, breaking power
management for that machine.

Also, because we're detecting NVIDIA's DSM, it doesn't make sense to run
this detection on a non-NVIDIA device anyway. Thus, check at the
beginning of the detection code if this is an NVIDIA card, and just
return if it isn't.

This, together with commit d22915d22ded ("drm/nouveau/devinit/tu102-:
wait for GFW_BOOT_PROGRESS == COMPLETED") developed independently and
landed earlier, fixes runtime power management of the NVIDIA card in
Lenovo Legion 5-15ARH05. Without this patch, the GPU resumption code
will "timeout", sometimes hanging userspace.

As a bonus, we'll also stop preventing _PR3 usage from the bridge for
unrelated devices, which is always nice, I guess.

Fixes: ccfc2d5cdb02 ("drm/nouveau: Use generic helper to check _PR3 presence")
Signed-off-by: Ratchanan Srirattanamet <peathot@hotmail.com>
Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/79
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/DM6PR19MB2780805D4BE1E3F9B3AC96D0BC409@DM6PR19MB2780.namprd19.prod.outlook.com
16 months agousb: gadget: udc: core: Prevent soft_connect_store() race
Badhri Jagan Sridharan [Fri, 9 Jun 2023 01:02:27 +0000 (01:02 +0000)]
usb: gadget: udc: core: Prevent soft_connect_store() race

usb_udc_connect_control(), soft_connect_store() and
usb_gadget_deactivate() can potentially race against each other to invoke
usb_gadget_connect()/usb_gadget_disconnect(). To prevent this, guard
udc->started, gadget->allow_connect, gadget->deactivate and
gadget->connect with connect_lock so that ->pullup() is only invoked when
the gadget is bound, started and not deactivated. The routines
usb_gadget_connect_locked(), usb_gadget_disconnect_locked(),
usb_udc_connect_control_locked(), usb_gadget_udc_start_locked(),
usb_gadget_udc_stop_locked() are called with this lock held.

An earlier version of this commit was reverted due to the crash reported in
https://lore.kernel.org/all/ZF4BvgsOyoKxdPFF@francesco-nb.int.toradex.com/.
commit 16737e78d190 ("usb: gadget: udc: core: Offload usb_udc_vbus_handler processing")
addresses the crash reported.

Cc: stable@vger.kernel.org
Fixes: 628ef0d273a6 ("usb: udc: add usb_udc_vbus_handler")
Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Message-ID: <20230609010227.978661-2-badhri@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agousb: gadget: udc: core: Offload usb_udc_vbus_handler processing
Badhri Jagan Sridharan [Fri, 9 Jun 2023 01:02:26 +0000 (01:02 +0000)]
usb: gadget: udc: core: Offload usb_udc_vbus_handler processing

usb_udc_vbus_handler() can be invoked from interrupt context by irq
handlers of the gadget drivers, however, usb_udc_connect_control() has
to run in non-atomic context due to the following:
a. Some of the gadget driver implementations expect the ->pullup
   callback to be invoked in non-atomic context.
b. usb_gadget_disconnect() acquires udc_lock which is a mutex.

Hence offload invocation of usb_udc_connect_control()
to workqueue.

UDC should not be pulled up unless gadget driver is bound. The new flag
"allow_connect" is now set by gadget_bind_driver() and cleared by
gadget_unbind_driver(). This prevents work item to pull up the gadget
even if queued when the gadget driver is already unbound.

Cc: stable@vger.kernel.org
Fixes: 1016fc0c096c ("USB: gadget: Fix obscure lockdep violation for udc_mutex")
Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Message-ID: <20230609010227.978661-1-badhri@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agousb: typec: Fix fast_role_swap_current show function
Pavan Holla [Wed, 7 Jun 2023 19:33:26 +0000 (19:33 +0000)]
usb: typec: Fix fast_role_swap_current show function

The current implementation mistakenly performs a & operation on
the output of sysfs_emit. This patch performs the & operation before
calling sysfs_emit.

Fixes: 662a60102c12 ("usb: typec: Separate USB Power Delivery from USB Type-C")
Cc: stable <stable@kernel.org>
Reported-by: Benson Leung <bleung@chromium.org>
Signed-off-by: Pavan Holla <pholla@chromium.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Message-ID: <20230607193328.3359487-1-pholla@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agousb: typec: ucsi: Fix command cancellation
Heikki Krogerus [Tue, 6 Jun 2023 11:58:02 +0000 (14:58 +0300)]
usb: typec: ucsi: Fix command cancellation

The Cancel command was passed to the write callback as the
offset instead of as the actual command which caused NULL
pointer dereference.

Reported-by: Stephan Bolten <stephan.bolten@gmx.net>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217517
Fixes: 094902bc6a3c ("usb: typec: ucsi: Always cancel the command if PPM reports BUSY condition")
Cc: stable@vger.kernel.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Message-ID: <20230606115802.79339-1-heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agoUSB: dwc3: fix use-after-free on core driver unbind
Johan Hovold [Wed, 7 Jun 2023 10:05:40 +0000 (12:05 +0200)]
USB: dwc3: fix use-after-free on core driver unbind

Some dwc3 glue drivers are currently accessing the driver data of the
child core device directly, which is clearly a bad idea as the child may
not have probed yet or may have been unbound from its driver.

As a workaround until the glue drivers have been fixed, clear the driver
data pointer before allowing the glue parent device to runtime suspend
to prevent its driver from accessing data that has been freed during
unbind.

Fixes: 6dd2565989b4 ("usb: dwc3: add imx8mp dwc3 glue layer driver")
Fixes: 6895ea55c385 ("usb: dwc3: qcom: Configure wakeup interrupts during suspend")
Cc: stable@vger.kernel.org # 5.12
Cc: Li Jun <jun.li@nxp.com>
Cc: Sandeep Maheswaram <quic_c_sanm@quicinc.com>
Cc: Krishna Kurapati <quic_kriskura@quicinc.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Message-ID: <20230607100540.31045-3-johan+linaro@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agoUSB: dwc3: qcom: fix NULL-deref on suspend
Johan Hovold [Wed, 7 Jun 2023 10:05:39 +0000 (12:05 +0200)]
USB: dwc3: qcom: fix NULL-deref on suspend

The Qualcomm dwc3 glue driver is currently accessing the driver data of
the child core device during suspend and on wakeup interrupts. This is
clearly a bad idea as the child may not have probed yet or could have
been unbound from its driver.

The first such layering violation was part of the initial version of the
driver, but this was later made worse when the hack that accesses the
driver data of the grand child xhci device to configure the wakeup
interrupts was added.

Fixing this properly is not that easily done, so add a sanity check to
make sure that the child driver data is non-NULL before dereferencing it
for now.

Note that this relies on subtleties like the fact that driver core is
making sure that the parent is not suspended while the child is probing.

Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/all/20230325165217.31069-4-manivannan.sadhasivam@linaro.org/
Fixes: d9152161b4bf ("usb: dwc3: Add Qualcomm DWC3 glue layer driver")
Fixes: 6895ea55c385 ("usb: dwc3: qcom: Configure wakeup interrupts during suspend")
Cc: stable@vger.kernel.org # 3.18: a872ab303d5d: "usb: dwc3: qcom: fix use-after-free on runtime-PM wakeup"
Cc: Sandeep Maheswaram <quic_c_sanm@quicinc.com>
Cc: Krishna Kurapati <quic_kriskura@quicinc.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Message-ID: <20230607100540.31045-2-johan+linaro@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agousb: dwc3: gadget: Reset num TRBs before giving back the request
Elson Roy Serrao [Thu, 1 Jun 2023 21:27:30 +0000 (14:27 -0700)]
usb: dwc3: gadget: Reset num TRBs before giving back the request

Consider a scenario where cable disconnect happens when there is an active
usb reqest queued to the UDC. As part of the disconnect we would issue an
end transfer with no interrupt-on-completion before giving back this
request. Since we are giving back the request without skipping TRBs the
num_trbs field of dwc3_request still holds the stale value previously used.
Function drivers re-use same request for a given bind-unbind session and
hence their dwc3_request context gets preserved across cable
disconnect/connect. When such a request gets re-queued after cable connect,
we would increase the num_trbs field on top of the previous stale value
thus incorrectly representing the number of TRBs used. Fix this by
resetting num_trbs field before giving back the request.

Fixes: 09fe1f8d7e2f ("usb: dwc3: gadget: track number of TRBs per request")
Cc: stable <stable@kernel.org>
Signed-off-by: Elson Roy Serrao <quic_eserrao@quicinc.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Message-ID: <1685654850-8468-1-git-send-email-quic_eserrao@quicinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agousb: gadget: udc: renesas_usb3: Fix RZ/V2M {modprobe,bind} error
Biju Das [Tue, 30 May 2023 16:17:20 +0000 (17:17 +0100)]
usb: gadget: udc: renesas_usb3: Fix RZ/V2M {modprobe,bind} error

Currently {modprobe, bind} after {rmmod, unbind} results in probe failure.

genirq: Flags mismatch irq 22. 00000004 (85070400.usb3drd) vs. 00000004 (85070400.usb3drd)
renesas_usb3: probe of 85070000.usb3peri failed with error -16

The reason is, it is trying to register an interrupt handler for the same
IRQ twice. The devm_request_irq() was called with the parent device.
So the interrupt handler won't be unregistered when the usb3-peri device
is unbound.

Fix this issue by replacing "parent dev"->"dev" as the irq resource
is managed by this driver.

Fixes: 9cad72dfc556 ("usb: gadget: Add support for RZ/V2M USB3DRD driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Message-ID: <20230530161720.179927-1-biju.das.jz@bp.renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agoclk: clk-loongson2: Zero init clk_init_data
Binbin Zhou [Wed, 24 May 2023 01:49:24 +0000 (09:49 +0800)]
clk: clk-loongson2: Zero init clk_init_data

As clk_core_populate_parent_map() checks clk_init_data.num_parents
first, and checks clk_init_data.parent_names[] before
clk_init_data.parent_data[] and clk_init_data.parent_hws[].

Therefore the clk_init_data structure needs to be explicitly initialised
to prevent an unexpected crash if clk_init_data.parent_names[] is a
random value.

 CPU 0 Unable to handle kernel paging request at virtual address 0000000000000dc0, era == 9000000002986290, ra == 900000000298624c
 Oops[#1]:
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.4.0-rc2+ #4582
 pc 9000000002986290 ra 900000000298624c tp 9000000100094000 sp 9000000100097a60
 a0 9000000104541e00 a1 0000000000000000 a2 0000000000000dc0 a3 0000000000000001
 a4 90000001000979f0 a5 90000001800977d7 a6 0000000000000000 a7 900000000362a000
 t0 90000000034f3548 t1 6f8c2a9cb5ab5f64 t2 0000000000011340 t3 90000000031cf5b0
 t4 0000000000000dc0 t5 0000000000000004 t6 0000000000011300 t7 9000000104541e40
 t8 000000000005a4f8 u0 9000000104541e00 s9 9000000104541e00 s0 9000000104bc4700
 s1 9000000104541da8 s2 0000000000000001 s3 900000000356f9d8 s4 ffffffffffffffff
 s5 0000000000000000 s6 0000000000000dc0 s7 90000000030d0a88 s8 0000000000000000
    ra: 900000000298624c __clk_register+0x228/0x84c
   ERA: 9000000002986290 __clk_register+0x26c/0x84c
  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
  PRMD: 00000004 (PPLV0 +PIE -PWE)
  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
 ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
  BADV: 0000000000000dc0
  PRID: 0014a000 (Loongson-64bit, )
 Modules linked in:
 Process swapper/0 (pid: 1, threadinfo=(____ptrval____), task=(____ptrval____))
 Stack : 90000000031c1810 90000000030d0a88 900000000325bac0 90000000034f3548
         90000001002ab410 9000000104541e00 0000000000000dc0 9000000003150098
         90000000031c1810 90000000031a0460 900000000362a000 90000001002ab410
         900000000362a000 9000000104541da8 9000000104541de8 90000001002ab410
         900000000362a000 9000000002986a68 90000000034f3ed8 90000000030d0aa8
         9000000104541da8 900000000298d3b8 90000000031c1810 0000000000000000
         90000000034f3ed8 90000000030d0aa8 0000000000000dc0 90000000030d0a88
         90000001002ab410 900000000298d401 0000000000000000 6f8c2a9cb5ab5f64
         90000000034f4000 90000000030d0a88 9000000003a48a58 90000001002ab410
         9000000104bd81a8 900000000298d484 9000000100020260 0000000000000000
         ...
 Call Trace:
 [<9000000002986290>] __clk_register+0x26c/0x84c
 [<9000000002986a68>] devm_clk_hw_register+0x5c/0xe0
 [<900000000298d3b8>] loongson2_clk_register.constprop.0+0xdc/0x10c
 [<900000000298d484>] loongson2_clk_probe+0x9c/0x4ac
 [<9000000002a4eba4>] platform_probe+0x68/0xc8
 [<9000000002a4bf80>] really_probe+0xbc/0x2f0
 [<9000000002a4c23c>] __driver_probe_device+0x88/0x128
 [<9000000002a4c318>] driver_probe_device+0x3c/0x11c
 [<9000000002a4c5dc>] __driver_attach+0x98/0x18c
 [<9000000002a49ca0>] bus_for_each_dev+0x80/0xe0
 [<9000000002a4b0dc>] bus_add_driver+0xfc/0x1ec
 [<9000000002a4d4a8>] driver_register+0x68/0x134
 [<90000000020f0110>] do_one_initcall+0x50/0x188
 [<9000000003150f00>] kernel_init_freeable+0x224/0x294
 [<90000000030240fc>] kernel_init+0x20/0x110
 [<90000000020f1568>] ret_from_kernel_thread+0xc/0xa4

Fixes: acc0ccffec50 ("clk: clk-loongson2: add clock controller driver support")
Cc: stable@vger.kernel.org
Cc: Yinbo Zhu <zhuyinbo@loongson.cn>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Link: https://lore.kernel.org/r/20230524014924.2869051-1-zhoubinbin@loongson.cn
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
16 months agoclk: mediatek: mt8365: Fix inverted topclk operations
Markus Schneider-Pargmann [Thu, 11 May 2023 13:32:26 +0000 (15:32 +0200)]
clk: mediatek: mt8365: Fix inverted topclk operations

The given operations are inverted for the wrong registers which makes
multiple of the mt8365 hardware units unusable. In my setup at least usb
did not work.

Fixed by swapping the operations with the inverted ones.

Reported-by: Alexandre Mergnat <amergnat@baylibre.com>
Fixes: 905b7430d3cc ("clk: mediatek: mt8365: Convert simple_gate to mtk_gate clocks")
Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/r/20230511133226.913600-1-msp@baylibre.com
Tested-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
16 months agoclk: composite: Fix handling of high clock rates
Sebastian Reichel [Fri, 26 May 2023 17:10:56 +0000 (19:10 +0200)]
clk: composite: Fix handling of high clock rates

ULONG_MAX is used by a few drivers to figure out the highest available
clock rate via clk_round_rate(clk, ULONG_MAX). Since abs() takes a
signed value as input, the current logic effectively calculates with
ULONG_MAX = -1, which results in the worst parent clock being chosen
instead of the best one.

For example on Rockchip RK3588 the eMMC driver tries to figure out
the highest available clock rate. There are three parent clocks
available resulting in the following rate diffs with the existing
logic:

GPLL:   abs(18446744073709551615 - 1188000000) = 1188000001
CPLL:   abs(18446744073709551615 - 1500000000) = 1500000001
XIN24M: abs(18446744073709551615 -   24000000) =   24000001

As a result the clock framework will promote a maximum supported
clock rate of 24 MHz, even though 1.5GHz are possible. With the
updated logic any casting between signed and unsigned is avoided
and the numbers look like this instead:

GPLL:   18446744073709551615 - 1188000000 = 18446744072521551615
CPLL:   18446744073709551615 - 1500000000 = 18446744072209551615
XIN24M: 18446744073709551615 -   24000000 = 18446744073685551615

As a result the parent with the highest acceptable rate is chosen
instead of the parent clock with the lowest one.

Cc: stable@vger.kernel.org
Fixes: 49502408007b ("mmc: sdhci-of-dwcmshc: properly determine max clock on Rockchip")
Tested-by: Christopher Obbard <chris.obbard@collabora.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Link: https://lore.kernel.org/r/20230526171057.66876-2-sebastian.reichel@collabora.com
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
16 months agoMerge branch 'mptcp-fixes'
Jakub Kicinski [Mon, 12 Jun 2023 23:55:45 +0000 (16:55 -0700)]
Merge branch 'mptcp-fixes'

Matthieu Baerts says:

====================
selftests: mptcp: skip tests not supported by old kernels (part 3)

After a few years of increasing test coverage in the MPTCP selftests, we
realised [1] the last version of the selftests is supposed to run on old
kernels without issues.

Supporting older versions is not that easy for this MPTCP case: these
selftests are often validating the internals by checking packets that
are exchanged, when some MIB counters are incremented after some
actions, how connections are getting opened and closed in some cases,
etc. In other words, it is not limited to the socket interface between
the userspace and the kernelspace.

In addition to that, the current MPTCP selftests run a lot of different
sub-tests but the TAP13 protocol used in the selftests don't support
sub-tests: one failure in sub-tests implies that the whole selftest is
seen as failed at the end because sub-tests are not tracked. It is then
important to skip sub-tests not supported by old kernels.

To minimise the modifications and reduce the complexity to support old
versions, the idea is to look at external signs and skip the whole
selftest or just some sub-tests before starting them. This cannot be
applied in all cases.

Similar to the second part, this third one focuses on marking different
sub-tests as skipped if some MPTCP features are not supported. This
time, only in "mptcp_join.sh" selftest, the remaining one, is modified.
Several techniques are used here to achieve this task:

- Before starting some tests:

  - Check if a file (sysctl knob) is present: that's what patch 12/17 is
    doing for the userspace PM feature.

  - Check if a required kernel symbol is present in /proc/kallsyms:
    patches 9, 10, 14 and 15/17 are using this technique.

  - Check if it is possible to setup a particular network environment
    requiring Netfilter or TC: if the preparation step fail, the linked
    sub-test is marked as skipped. Patch 5/17 is doing that.

  - Check if a MIB counter is available: patches 7 and 13/17 do that.

  - Check if the kernel version is newer than a specific one: patch 1/17
    adds some helpers in mptcp_lib.sh to ease its use. That's not ideal
    and it is only used as last resort but as mentioned above, it is
    important to skip tests if they are not supported not to have the
    whole selftest always being marked as failed on old kernels. Patches
    11 and 17/17 are checking the kernel version. An alternative would
    be to ignore the results for some sub-tests but that's not ideal
    too. Note that SELFTESTS_MPTCP_LIB_NO_KVERSION_CHECK env var can be
    set to 1 not to skip these tests if the running kernel doesn't have
    a supported version.

- After having launched the tests:

  - Adapt the expectations depending on the presence of a kernel symbol
    (patch 6/17) or a kernel version (patch 8/17).

  - Check is a MIB counter is available and skip the verification if
    not. Patch 4/17 is using this technique.

Before skipping tests, SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var
value is checked: if it is set to 1, the test is marked as "failed"
instead of "skipped". MPTCP public CI expects to have all features
supported and it sets this env var to 1 to catch regressions in these
new checks.

Patch 2/17 uses 'iptables-legacy' if available because it might be
needed when using an older kernel not supporting iptables-nft.

Patch 3/17 adds some helpers used in the other patches mentioned to
easily mark sub-tests as skipped.

Patch 16/17 uniforms MPTCP Join "listener" tests: it was imported code
from userspace_pm.sh but without using the "code style" and ways of
using tools and printing messages from MPTCP Join selftest.

Link: https://lore.kernel.org/stable/CA+G9fYtDGpgT4dckXD-y-N92nqUxuvue_7AtDdBcHrbOMsDZLg@mail.gmail.com/
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
====================

Link: https://lore.kernel.org/r/20230609-upstream-net-20230610-mptcp-selftests-support-old-kernels-part-3-v1-0-2896fe2ee8a3@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip mixed tests if not supported
Matthieu Baerts [Sat, 10 Jun 2023 16:11:52 +0000 (18:11 +0200)]
selftests: mptcp: join: skip mixed tests if not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

One of them is the support of a mix of subflows in v4 and v6 by the
in-kernel PM introduced by commit b9d69db87fb7 ("mptcp: let the
in-kernel PM use mixed IPv4 and IPv6 addresses").

It looks like there is no external sign we can use to predict the
expected behaviour. Instead of accepting different behaviours and thus
not really checking for the expected behaviour, we are looking here for
a specific kernel version. That's not ideal but it looks better than
removing the test because it cannot support older kernel versions.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: ad3493746ebe ("selftests: mptcp: add test-cases for mixed v4/v6 subflows")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: uniform listener tests
Matthieu Baerts [Sat, 10 Jun 2023 16:11:51 +0000 (18:11 +0200)]
selftests: mptcp: join: uniform listener tests

The alignment was different from the other tests because tabs were used
instead of spaces.

While at it, also use 'echo' instead of 'printf' to print the result to
keep the same style as done in the other sub-tests. And, even if it
should be better with, also remove 'stdbuf' and sed's '--unbuffered'
option because they are not used in the other subtests and they are not
available when using a minimal environment with busybox.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 178d023208eb ("selftests: mptcp: listener test for in-kernel PM")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip PM listener tests if not supported
Matthieu Baerts [Sat, 10 Jun 2023 16:11:50 +0000 (18:11 +0200)]
selftests: mptcp: join: skip PM listener tests if not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

One of them is the support of PM listener events introduced by commit
f8c9dfbd875b ("mptcp: add pm listener events").

It is possible to look for "mptcp_event_pm_listener" in kallsyms to know
in advance if the kernel supports this feature.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 178d023208eb ("selftests: mptcp: listener test for in-kernel PM")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip MPC backups tests if not supported
Matthieu Baerts [Sat, 10 Jun 2023 16:11:49 +0000 (18:11 +0200)]
selftests: mptcp: join: skip MPC backups tests if not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

One of them is the support of sending an MP_PRIO signal for the initial
subflow, introduced by commit c157bbe776b7 ("mptcp: allow the in kernel
PM to set MPC subflow priority").

It is possible to look for "mptcp_subflow_send_ack" in kallsyms because
it was needed to introduce the mentioned feature. So we can know in
advance if the feature is supported instead of trying and accepting any
results.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 914f6a59b10f ("selftests: mptcp: add MPC backup tests")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip fail tests if not supported
Matthieu Baerts [Sat, 10 Jun 2023 16:11:48 +0000 (18:11 +0200)]
selftests: mptcp: join: skip fail tests if not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

One of them is the support of the MP_FAIL / infinite mapping introduced
by commit 1e39e5a32ad7 ("mptcp: infinite mapping sending") and the
following ones.

It is possible to look for one of the infinite mapping counters to know
in advance if the this feature is available.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: b6e074e171bc ("selftests: mptcp: add infinite map testcase")
Cc: stable@vger.kernel.org
Fixes: 2ba18161d407 ("selftests: mptcp: add MP_FAIL reset testcase")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip userspace PM tests if not supported
Matthieu Baerts [Sat, 10 Jun 2023 16:11:47 +0000 (18:11 +0200)]
selftests: mptcp: join: skip userspace PM tests if not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

One of them is the support of the userspace PM introduced by commit
4638de5aefe5 ("mptcp: handle local addrs announced by userspace PMs")
and the following ones.

It is possible to look for the MPTCP pm_type's sysctl knob to know in
advance if the userspace PM is available.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 5ac1d2d63451 ("selftests: mptcp: Add tests for userspace PM type")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip fullmesh flag tests if not supported
Matthieu Baerts [Sat, 10 Jun 2023 16:11:46 +0000 (18:11 +0200)]
selftests: mptcp: join: skip fullmesh flag tests if not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

One of them is the support of the fullmesh flag for the in-kernel PM
introduced by commit 2843ff6f36db ("mptcp: remote addresses fullmesh")
and commit 1a0d6136c5f0 ("mptcp: local addresses fullmesh").

It looks like there is no easy external sign we can use to predict the
expected behaviour. We could add the flag and then check if it has been
added but for that, and for each fullmesh test, we would need to setup a
new environment, do the checks, clean it and then only start the test
from yet another clean environment. To keep it simple and avoid
introducing new issues, we look for a specific kernel version. That's
not ideal but an acceptable solution for this case.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 6a0653b96f5d ("selftests: mptcp: add fullmesh setting tests")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip backup if set flag on ID not supported
Matthieu Baerts [Sat, 10 Jun 2023 16:11:45 +0000 (18:11 +0200)]
selftests: mptcp: join: skip backup if set flag on ID not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

Commit bccefb762439 ("selftests: mptcp: simplify pm_nl_change_endpoint")
has simplified the way the backup flag is set on an endpoint. Instead of
doing:

  ./pm_nl_ctl set 10.0.2.1 flags backup

Now we do:

  ./pm_nl_ctl set id 1 flags backup

The new way is easier to maintain but it is also incompatible with older
kernels not supporting the implicit endpoints putting in place the
infrastructure to set flags per ID, hence the second Fixes tag.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: bccefb762439 ("selftests: mptcp: simplify pm_nl_change_endpoint")
Cc: stable@vger.kernel.org
Fixes: 4cf86ae84c71 ("mptcp: strict local address ID selection")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip implicit tests if not supported
Matthieu Baerts [Sat, 10 Jun 2023 16:11:44 +0000 (18:11 +0200)]
selftests: mptcp: join: skip implicit tests if not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

One of them is the support of the implicit endpoints introduced by
commit d045b9eb95a9 ("mptcp: introduce implicit endpoints").

It is possible to look for "mptcp_subflow_send_ack" in kallsyms because
it was needed to introduce the mentioned feature. So we can know in
advance if the feature is supported instead of trying and accepting any
results.

Note that here and in the following commits, we re-do the same check for
each sub-test of the same function for a few reasons. The main one is
not to break the ID assign to each test in order to be able to easily
compare results between different kernel versions. Also, we can still
run a specific test even if it is skipped. Another reason is that it
makes it clear during the review that a specific subtest will be skipped
or not under certain conditions. At the end, it looks OK to call the
exact same helper multiple times: it is not a critical path and it is
the same code that is executed, not really more cases to maintain.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 69c6ce7b6eca ("selftests: mptcp: add implicit endpoint test case")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: support RM_ADDR for used endpoints or not
Matthieu Baerts [Sat, 10 Jun 2023 16:11:43 +0000 (18:11 +0200)]
selftests: mptcp: join: support RM_ADDR for used endpoints or not

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

At some points, a new feature caused internal behaviour changes we are
verifying in the selftests, see the Fixes tag below. It was not a UAPI
change but because in these selftests, we check some internal
behaviours, it is normal we have to adapt them from time to time after
having added some features.

It looks like there is no external sign we can use to predict the
expected behaviour. Instead of accepting different behaviours and thus
not really checking for the expected behaviour, we are looking here for
a specific kernel version. That's not ideal but it looks better than
removing the test because it cannot support older kernel versions.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 6fa0174a7c86 ("mptcp: more careful RM_ADDR generation")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip Fastclose tests if not supported
Matthieu Baerts [Sat, 10 Jun 2023 16:11:42 +0000 (18:11 +0200)]
selftests: mptcp: join: skip Fastclose tests if not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

One of them is the support of MP_FASTCLOSE introduced in commit
f284c0c77321 ("mptcp: implement fastclose xmit path").

If the MIB counter is not available, the test cannot be verified and the
behaviour will not be the expected one. So we can skip the test if the
counter is missing.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 01542c9bf9ab ("selftests: mptcp: add fastclose testcase")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: support local endpoint being tracked or not
Matthieu Baerts [Sat, 10 Jun 2023 16:11:41 +0000 (18:11 +0200)]
selftests: mptcp: join: support local endpoint being tracked or not

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

At some points, a new feature caused internal behaviour changes we are
verifying in the selftests, see the Fixes tag below. It was not a uAPI
change but because in these selftests, we check some internal
behaviours, it is normal we have to adapt them from time to time after
having added some features.

It is possible to look for "mptcp_pm_subflow_check_next" in kallsyms
because it was needed to introduce the mentioned feature. So we can know
in advance what the behaviour we are expecting here instead of
supporting the two behaviours.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip test if iptables/tc cmds fail
Matthieu Baerts [Sat, 10 Jun 2023 16:11:40 +0000 (18:11 +0200)]
selftests: mptcp: join: skip test if iptables/tc cmds fail

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

Some tests are using IPTables and/or TC commands to force some
behaviours. If one of these commands fails -- likely because some
features are not available due to missing kernel config -- we should
intercept the error and skip the tests requiring these features.

Note that if we expect to have these features available and if
SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var is set to 1, the tests
will be marked as failed instead of skipped.

This patch also replaces the 'exit 1' by 'return 1' not to stop the
selftest in the middle without the conclusion if there is an issue with
NF or TC.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 8d014eaa9254 ("selftests: mptcp: add ADD_ADDR timeout test case")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: skip check if MIB counter not supported
Matthieu Baerts [Sat, 10 Jun 2023 16:11:39 +0000 (18:11 +0200)]
selftests: mptcp: join: skip check if MIB counter not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

One of them is the MPTCP MIB counters introduced in commit fc518953bc9c
("mptcp: add and use MIB counter infrastructure") and more later. The
MPTCP Join selftest heavily relies on these counters.

If a counter is not supported by the kernel, it is not displayed when
using 'nstat -z'. We can then detect that and skip the verification. A
new helper (get_counter()) has been added to do the required checks and
return an error if the counter is not available.

Note that if we expect to have these features available and if
SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var is set to 1, the tests
will be marked as failed instead of skipped.

This new helper also makes sure we get the exact counter we want to
avoid issues we had in the past, e.g. with MPTcpExtRmAddr and
MPTcpExtRmAddrDrop sharing the same prefix. While at it, we uniform the
way we fetch a MIB counter.

Note for the backports: we rarely change these modified blocks so if
there is are conflicts, it is very likely because a counter is not used
in the older kernels and we don't need that chunk.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: b08fbf241064 ("selftests: add test-cases for MPTCP MP_JOIN")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: helpers to skip tests
Matthieu Baerts [Sat, 10 Jun 2023 16:11:38 +0000 (18:11 +0200)]
selftests: mptcp: join: helpers to skip tests

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

Here are some helpers that will be used to mark subtests as skipped if a
feature is not supported. Marking as a fix for the commit introducing
this selftest to help with the backports.

While at it, also check if kallsyms feature is available as it will also
be used in the following commits to check if MPTCP features are
available before starting a test.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: b08fbf241064 ("selftests: add test-cases for MPTCP MP_JOIN")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: join: use 'iptables-legacy' if available
Matthieu Baerts [Sat, 10 Jun 2023 16:11:37 +0000 (18:11 +0200)]
selftests: mptcp: join: use 'iptables-legacy' if available

IPTables commands using 'iptables-nft' fail on old kernels, at least
5.15 because it doesn't see the default IPTables chains:

  $ iptables -L
  iptables/1.8.2 Failed to initialize nft: Protocol not supported

As a first step before switching to NFTables, we can use iptables-legacy
if available.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 8d014eaa9254 ("selftests: mptcp: add ADD_ADDR timeout test case")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests: mptcp: lib: skip if not below kernel version
Matthieu Baerts [Sat, 10 Jun 2023 16:11:36 +0000 (18:11 +0200)]
selftests: mptcp: lib: skip if not below kernel version

Selftests are supposed to run on any kernels, including the old ones not
supporting all MPTCP features.

A new function is now available to easily detect if a feature is
missing by looking at the kernel version. That's clearly not ideal and
this kind of check should be avoided as soon as possible. But sometimes,
there are no external sign that a "feature" is available or not:
internal behaviours can change without modifying the uAPI and these
selftests are verifying the internal behaviours. Sometimes, the only
(easy) way to verify if the feature is present is to run the test but
then the validation cannot determine if there is a failure with the
feature or if the feature is missing. Then it looks better to check the
kernel version instead of having tests that can never fail. In any case,
we need a solution not to have a whole selftest being marked as failed
just because one sub-test has failed.

Note that this env var car be set to 1 not to do such check and run the
linked sub-test: SELFTESTS_MPTCP_LIB_NO_KVERSION_CHECK.

This new helper is going to be used in the following commits. In order
to ease the backport of such future patches, it would be good if this
patch is backported up to the introduction of MPTCP selftests, hence the
Fixes tag below: this type of check was supposed to be done from the
beginning.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoMerge branch 'fixes-for-q-usgmii-speeds-and-autoneg'
Jakub Kicinski [Mon, 12 Jun 2023 23:51:22 +0000 (16:51 -0700)]
Merge branch 'fixes-for-q-usgmii-speeds-and-autoneg'

Maxime Chevallier says:

====================
fixes for Q-USGMII speeds and autoneg

This is the second version of a small changeset for QUSGMII support,
fixing inconsistencies in reported max speed and control word parsing.

As reported here [1], there are some inconsistencies for the Q-USGMII
mode speeds and configuration. The first patch in this fixup series
makes so that we correctly report the max speed of 1Gbps for this mode.

The second patch uses a dedicated helper to decode the control word.
This is necessary as although USGMII control words are close to USXGMII,
they don't support the same speeds.

[1] : https://lore.kernel.org/netdev/ZHnd+6FUO77XFJvQ@shell.armlinux.org.uk/
====================

Link: https://lore.kernel.org/r/20230609080305.546028-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet: phylink: use a dedicated helper to parse usgmii control word
Maxime Chevallier [Fri, 9 Jun 2023 08:03:05 +0000 (10:03 +0200)]
net: phylink: use a dedicated helper to parse usgmii control word

Q-USGMII is a derivative of USGMII, that uses a specific formatting for
the control word. The layout is close to the USXGMII control word, but
doesn't support speeds over 1Gbps. Use a dedicated decoding logic for
the USGMII control word, re-using USXGMII definitions but only considering
10/100/1000Mbps speeds

Fixes: 5e61fe157a27 ("net: phy: Introduce QUSGMII PHY mode")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet: phylink: report correct max speed for QUSGMII
Maxime Chevallier [Fri, 9 Jun 2023 08:03:04 +0000 (10:03 +0200)]
net: phylink: report correct max speed for QUSGMII

Q-USGMII is the quad port version of USGMII, and supports a max speed of
1Gbps on each line. Make so that phylink_interface_max_speed() reports
this information correctly.

Fixes: ae0e4bb2a0e0 ("net: phylink: Adjust link settings based on rate matching")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agobtrfs: do not ASSERT() on duplicated global roots
Qu Wenruo [Sun, 11 Jun 2023 00:09:13 +0000 (08:09 +0800)]
btrfs: do not ASSERT() on duplicated global roots

[BUG]
Syzbot reports a reproducible ASSERT() when using rescue=usebackuproot
mount option on a corrupted fs.

The full report can be found here:
https://syzkaller.appspot.com/bug?extid=c4614eae20a166c25bf0

  BTRFS error (device loop0: state C): failed to load root csum
  assertion failed: !tmp, in fs/btrfs/disk-io.c:1103
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/ctree.h:3664!
  invalid opcode: 0000 [#1] PREEMPT SMP KASAN
  CPU: 1 PID: 3608 Comm: syz-executor356 Not tainted 6.0.0-rc7-syzkaller-00029-g3800a713b607 #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
  RIP: 0010:assertfail+0x1a/0x1c fs/btrfs/ctree.h:3663
  RSP: 0018:ffffc90003aaf250 EFLAGS: 00010246
  RAX: 0000000000000032 RBX: 0000000000000000 RCX: f21c13f886638400
  RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
  RBP: ffff888021c640a0 R08: ffffffff816bd38d R09: ffffed10173667f1
  R10: ffffed10173667f1 R11: 1ffff110173667f0 R12: dffffc0000000000
  R13: ffff8880229c21f7 R14: ffff888021c64060 R15: ffff8880226c0000
  FS:  0000555556a73300(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000055a2637d7a00 CR3: 00000000709c4000 CR4: 00000000003506e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   btrfs_global_root_insert+0x1a7/0x1b0 fs/btrfs/disk-io.c:1103
   load_global_roots_objectid+0x482/0x8c0 fs/btrfs/disk-io.c:2467
   load_global_roots fs/btrfs/disk-io.c:2501 [inline]
   btrfs_read_roots fs/btrfs/disk-io.c:2528 [inline]
   init_tree_roots+0xccb/0x203c fs/btrfs/disk-io.c:2939
   open_ctree+0x1e53/0x33df fs/btrfs/disk-io.c:3574
   btrfs_fill_super+0x1c6/0x2d0 fs/btrfs/super.c:1456
   btrfs_mount_root+0x885/0x9a0 fs/btrfs/super.c:1824
   legacy_get_tree+0xea/0x180 fs/fs_context.c:610
   vfs_get_tree+0x88/0x270 fs/super.c:1530
   fc_mount fs/namespace.c:1043 [inline]
   vfs_kern_mount+0xc9/0x160 fs/namespace.c:1073
   btrfs_mount+0x3d3/0xbb0 fs/btrfs/super.c:1884

[CAUSE]
Since the introduction of global roots, we handle
csum/extent/free-space-tree roots as global roots, even if no
extent-tree-v2 feature is enabled.

So for regular csum/extent/fst roots, we load them into
fs_info::global_root_tree rb tree.

And we should not expect any conflicts in that rb tree, thus we have an
ASSERT() inside btrfs_global_root_insert().

But rescue=usebackuproot can break the assumption, as we will try to
load those trees again and again as long as we have bad roots and have
backup roots slot remaining.

So in that case we can have conflicting roots in the rb tree, and
triggering the ASSERT() crash.

[FIX]
We can safely remove that ASSERT(), as the caller will properly put the
offending root.

To make further debugging easier, also add two explicit error messages:

- Error message for conflicting global roots
- Error message when using backup roots slot

Reported-by: syzbot+a694851c6ab28cbcfb9c@syzkaller.appspotmail.com
Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rb tree")
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
16 months agoMerge tag 'mm-hotfixes-stable-2023-06-12-12-22' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Mon, 12 Jun 2023 23:14:34 +0000 (16:14 -0700)]
Merge tag 'mm-hotfixes-stable-2023-06-12-12-22' of git://git./linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "19 hotfixes. 14 are cc:stable and the remainder address issues which
  were introduced during this development cycle or which were considered
  inappropriate for a backport"

* tag 'mm-hotfixes-stable-2023-06-12-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  zswap: do not shrink if cgroup may not zswap
  page cache: fix page_cache_next/prev_miss off by one
  ocfs2: check new file size on fallocate call
  mailmap: add entry for John Keeping
  mm/damon/core: fix divide error in damon_nr_accesses_to_accesses_bp()
  epoll: ep_autoremove_wake_function should use list_del_init_careful
  mm/gup_test: fix ioctl fail for compat task
  nilfs2: reject devices with insufficient block count
  ocfs2: fix use-after-free when unmounting read-only filesystem
  lib/test_vmalloc.c: avoid garbage in page array
  nilfs2: fix possible out-of-bounds segment allocation in resize ioctl
  riscv/purgatory: remove PGO flags
  powerpc/purgatory: remove PGO flags
  x86/purgatory: remove PGO flags
  kexec: support purgatories with .text.hot sections
  mm/uffd: allow vma to merge as much as possible
  mm/uffd: fix vma operation where start addr cuts part of vma
  radix-tree: move declarations to header
  nilfs2: fix incomplete buffer cleanup in nilfs_btnode_abort_change_key()

16 months agobtrfs: can_nocow_file_extent should pass down args->strict from callers
Chris Mason [Fri, 9 Jun 2023 17:53:41 +0000 (10:53 -0700)]
btrfs: can_nocow_file_extent should pass down args->strict from callers

Commit 619104ba453ad0 ("btrfs: move common NOCOW checks against a file
extent into a helper") changed our call to btrfs_cross_ref_exist() to
always pass false for the 'strict' parameter.  We're passing this down
through the stack so that we can do a full check for cross references
during swapfile activation.

With strict always false, this test fails:

  btrfs subvol create swappy
  chattr +C swappy
  fallocate -l1G swappy/swapfile
  chmod 600 swappy/swapfile
  mkswap swappy/swapfile

  btrfs subvol snap swappy swapsnap
  btrfs subvol del -C swapsnap

  btrfs fi sync /
  sync;sync;sync

  swapon swappy/swapfile

The fix is to just use args->strict, and everyone except swapfile
activation is passing false.

Fixes: 619104ba453ad0 ("btrfs: move common NOCOW checks against a file extent into a helper")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
16 months agobtrfs: fix iomap_begin length for nocow writes
Christoph Hellwig [Thu, 8 Jun 2023 09:10:25 +0000 (11:10 +0200)]
btrfs: fix iomap_begin length for nocow writes

can_nocow_extent can reduce the len passed in, which needs to be
propagated to btrfs_dio_iomap_begin so that iomap does not submit
more data then is mapped.

This problems exists since the btrfs_get_blocks_direct helper was added
in commit c5794e51784a ("btrfs: Factor out write portion of
btrfs_get_blocks_direct"), but the ordered_extent splitting added in
commit b73a6fd1b1ef ("btrfs: split partial dio bios before submit")
added a WARN_ON that made a syzkaller test fail.

Reported-by: syzbot+ee90502d5c8fd1d0dd93@syzkaller.appspotmail.com
Fixes: c5794e51784a ("btrfs: Factor out write portion of btrfs_get_blocks_direct")
CC: stable@vger.kernel.org # 6.1+
Tested-by: syzbot+ee90502d5c8fd1d0dd93@syzkaller.appspotmail.com
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>