platform/kernel/linux-starfive.git
2 years agoext4: fix null-ptr-deref in ext4_write_info
Baokun Li [Fri, 5 Aug 2022 12:39:47 +0000 (20:39 +0800)]
ext4: fix null-ptr-deref in ext4_write_info

I caught a null-ptr-deref bug as follows:
==================================================================
KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f]
CPU: 1 PID: 1589 Comm: umount Not tainted 5.10.0-02219-dirty #339
RIP: 0010:ext4_write_info+0x53/0x1b0
[...]
Call Trace:
 dquot_writeback_dquots+0x341/0x9a0
 ext4_sync_fs+0x19e/0x800
 __sync_filesystem+0x83/0x100
 sync_filesystem+0x89/0xf0
 generic_shutdown_super+0x79/0x3e0
 kill_block_super+0xa1/0x110
 deactivate_locked_super+0xac/0x130
 deactivate_super+0xb6/0xd0
 cleanup_mnt+0x289/0x400
 __cleanup_mnt+0x16/0x20
 task_work_run+0x11c/0x1c0
 exit_to_user_mode_prepare+0x203/0x210
 syscall_exit_to_user_mode+0x5b/0x3a0
 do_syscall_64+0x59/0x70
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
 ==================================================================

Above issue may happen as follows:
-------------------------------------
exit_to_user_mode_prepare
 task_work_run
  __cleanup_mnt
   cleanup_mnt
    deactivate_super
     deactivate_locked_super
      kill_block_super
       generic_shutdown_super
        shrink_dcache_for_umount
         dentry = sb->s_root
         sb->s_root = NULL              <--- Here set NULL
        sync_filesystem
         __sync_filesystem
          sb->s_op->sync_fs > ext4_sync_fs
           dquot_writeback_dquots
            sb->dq_op->write_info > ext4_write_info
             ext4_journal_start(d_inode(sb->s_root), EXT4_HT_QUOTA, 2)
              d_inode(sb->s_root)
               s_root->d_inode          <--- Null pointer dereference

To solve this problem, we use ext4_journal_start_sb directly
to avoid s_root being used.

Cc: stable@kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220805123947.565152-1-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: don't run ext4lazyinit for read-only filesystems
Josh Triplett [Mon, 1 Aug 2022 03:24:53 +0000 (20:24 -0700)]
ext4: don't run ext4lazyinit for read-only filesystems

On a read-only filesystem, we won't invoke the block allocator, so we
don't need to prefetch the block bitmaps.

This avoids starting and running the ext4lazyinit thread at all on a
system with no read-write ext4 filesystems (for instance, a container VM
with read-only filesystems underneath an overlayfs).

Fixes: 21175ca434c5 ("ext4: make prefetch_block_bitmaps default")
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/48b41da1498fcac3287e2e06b660680646c1c050.1659323972.git.josh@joshtriplett.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: remove deprecated noacl/nouser_xattr options
Yang Xu [Thu, 28 Jul 2022 03:02:49 +0000 (11:02 +0800)]
ext4: remove deprecated noacl/nouser_xattr options

These two options should have been removed since 3.5, but none notices it.
Recently, I and Darrick found this. Also, have some discussion for this[1][2][3].

So now, let's remove them.

Link: https://lore.kernel.org/linux-ext4/6258F7BB.8010104@fujitsu.com/T/#u[1]
Link: https://lore.kernel.org/linux-ext4/20220602110421.ymoug3rwfspmryqg@fedora/T/#t[2]
Link: https://lore.kernel.org/linux-ext4/08e2ca4c8f6344bdcd76d75b821116c6147fd57a.camel@kernel.org/T/#t[3]
Signed-off-by: Yang Xu <xuyang2018.jy@fujitsu.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/1658977369-2478-1-git-send-email-xuyang2018.jy@fujitsu.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: avoid crash when inline data creation follows DIO write
Jan Kara [Wed, 27 Jul 2022 15:57:53 +0000 (17:57 +0200)]
ext4: avoid crash when inline data creation follows DIO write

When inode is created and written to using direct IO, there is nothing
to clear the EXT4_STATE_MAY_INLINE_DATA flag. Thus when inode gets
truncated later to say 1 byte and written using normal write, we will
try to store the data as inline data. This confuses the code later
because the inode now has both normal block and inline data allocated
and the confusion manifests for example as:

kernel BUG at fs/ext4/inode.c:2721!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 359 Comm: repro Not tainted 5.19.0-rc8-00001-g31ba1e3b8305-dirty #15
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
RIP: 0010:ext4_writepages+0x363d/0x3660
RSP: 0018:ffffc90000ccf260 EFLAGS: 00010293
RAX: ffffffff81e1abcd RBX: 0000008000000000 RCX: ffff88810842a180
RDX: 0000000000000000 RSI: 0000008000000000 RDI: 0000000000000000
RBP: ffffc90000ccf650 R08: ffffffff81e17d58 R09: ffffed10222c680b
R10: dfffe910222c680c R11: 1ffff110222c680a R12: ffff888111634128
R13: ffffc90000ccf880 R14: 0000008410000000 R15: 0000000000000001
FS:  00007f72635d2640(0000) GS:ffff88811b000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000565243379180 CR3: 000000010aa74000 CR4: 0000000000150eb0
Call Trace:
 <TASK>
 do_writepages+0x397/0x640
 filemap_fdatawrite_wbc+0x151/0x1b0
 file_write_and_wait_range+0x1c9/0x2b0
 ext4_sync_file+0x19e/0xa00
 vfs_fsync_range+0x17b/0x190
 ext4_buffered_write_iter+0x488/0x530
 ext4_file_write_iter+0x449/0x1b90
 vfs_write+0xbcd/0xf40
 ksys_write+0x198/0x2c0
 __x64_sys_write+0x7b/0x90
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
 </TASK>

Fix the problem by clearing EXT4_STATE_MAY_INLINE_DATA when we are doing
direct IO write to a file.

Cc: stable@kernel.org
Reported-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Reported-by: syzbot+bd13648a53ed6933ca49@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=a1e89d09bbbcbd5c4cb45db230ee28c822953984
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Tested-by: Tadeusz Struk<tadeusz.struk@linaro.org>
Link: https://lore.kernel.org/r/20220727155753.13969-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: minor defrag code improvements
Eric Whitney [Fri, 22 Jul 2022 16:39:10 +0000 (12:39 -0400)]
ext4: minor defrag code improvements

Modify the error returns for two file types that can't be defragged to
more clearly communicate those restrictions to a caller.  When the
defrag code is applied to swap files, return -ETXTBSY, and when applied
to quota files, return -EOPNOTSUPP.  Move an extent tree search whose
results are only occasionally required to the site always requiring them
for improved efficiency.  Address a few typos.

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20220722163910.268564-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: continue to expand file system when the target size doesn't reach
Jerry Lee 李修賢 [Mon, 18 Jul 2022 10:25:19 +0000 (10:25 +0000)]
ext4: continue to expand file system when the target size doesn't reach

When expanding a file system from (16TiB-2MiB) to 18TiB, the operation
exits early which leads to result inconsistency between resize2fs and
Ext4 kernel driver.

=== before ===
○ → resize2fs /dev/mapper/thin
resize2fs 1.45.5 (07-Jan-2020)
Filesystem at /dev/mapper/thin is mounted on /mnt/test; on-line resizing required
old_desc_blocks = 2048, new_desc_blocks = 2304
The filesystem on /dev/mapper/thin is now 4831837696 (4k) blocks long.

[  865.186308] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[  912.091502] dm-4: detected capacity change from 34359738368 to 38654705664
[  970.030550] dm-5: detected capacity change from 34359734272 to 38654701568
[ 1000.012751] EXT4-fs (dm-5): resizing filesystem from 4294966784 to 4831837696 blocks
[ 1000.012878] EXT4-fs (dm-5): resized filesystem to 4294967296

=== after ===
[  129.104898] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[  143.773630] dm-4: detected capacity change from 34359738368 to 38654705664
[  198.203246] dm-5: detected capacity change from 34359734272 to 38654701568
[  207.918603] EXT4-fs (dm-5): resizing filesystem from 4294966784 to 4831837696 blocks
[  207.918754] EXT4-fs (dm-5): resizing filesystem from 4294967296 to 4831837696 blocks
[  207.918758] EXT4-fs (dm-5): Converting file system to meta_bg
[  207.918790] EXT4-fs (dm-5): resizing filesystem from 4294967296 to 4831837696 blocks
[  221.454050] EXT4-fs (dm-5): resized to 4658298880 blocks
[  227.634613] EXT4-fs (dm-5): resized filesystem to 4831837696

Signed-off-by: Jerry Lee <jerrylee@qnap.com>
Link: https://lore.kernel.org/r/PU1PR04MB22635E739BD21150DC182AC6A18C9@PU1PR04MB2263.apcprd04.prod.outlook.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: fixup possible uninitialized variable access in ext4_mb_choose_next_group_cr1()
Jan Kara [Thu, 22 Sep 2022 09:09:29 +0000 (11:09 +0200)]
ext4: fixup possible uninitialized variable access in ext4_mb_choose_next_group_cr1()

Variable 'grp' may be left uninitialized if there's no group with
suitable average fragment size (or larger). Fix the problem by
initializing it earlier.

Link: https://lore.kernel.org/r/20220922091542.pkhedytey7wzp5fi@quack3
Fixes: 83e80a6e3543 ("ext4: use buckets for cr 1 block scan instead of rbtree")
Cc: stable@kernel.org
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: limit the number of retries after discarding preallocations blocks
Theodore Ts'o [Thu, 1 Sep 2022 22:03:14 +0000 (18:03 -0400)]
ext4: limit the number of retries after discarding preallocations blocks

This patch avoids threads live-locking for hours when a large number
threads are competing over the last few free extents as they blocks
getting added and removed from preallocation pools.  From our bug
reporter:

   A reliable way for triggering this has multiple writers
   continuously write() to files when the filesystem is full, while
   small amounts of space are freed (e.g. by truncating a large file
   -1MiB at a time). In the local filesystem, this can be done by
   simply not checking the return code of write (0) and/or the error
   (ENOSPACE) that is set. Over NFS with an async mount, even clients
   with proper error checking will behave this way since the linux NFS
   client implementation will not propagate the server errors [the
   write syscalls immediately return success] until the file handle is
   closed. This leads to a situation where NFS clients send a
   continuous stream of WRITE rpcs which result in ERRNOSPACE -- but
   since the client isn't seeing this, the stream of writes continues
   at maximum network speed.

   When some space does appear, multiple writers will all attempt to
   claim it for their current write. For NFS, we may see dozens to
   hundreds of threads that do this.

   The real-world scenario of this is database backup tooling (in
   particular, github.com/mdkent/percona-xtrabackup) which may write
   large files (>1TiB) to NFS for safe keeping. Some temporary files
   are written, rewound, and read back -- all before closing the file
   handle (the temp file is actually unlinked, to trigger automatic
   deletion on close/crash.) An application like this operating on an
   async NFS mount will not see an error code until TiB have been
   written/read.

   The lockup was observed when running this database backup on large
   filesystems (64 TiB in this case) with a high number of block
   groups and no free space. Fragmentation is generally not a factor
   in this filesystem (~thousands of large files, mostly contiguous
   except for the parts written while the filesystem is at capacity.)

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2 years agoext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0
Luís Henriques [Mon, 22 Aug 2022 09:42:35 +0000 (10:42 +0100)]
ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0

When walking through an inode extents, the ext4_ext_binsearch_idx() function
assumes that the extent header has been previously validated.  However, there
are no checks that verify that the number of entries (eh->eh_entries) is
non-zero when depth is > 0.  And this will lead to problems because the
EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this:

[  135.245946] ------------[ cut here ]------------
[  135.247579] kernel BUG at fs/ext4/extents.c:2258!
[  135.249045] invalid opcode: 0000 [#1] PREEMPT SMP
[  135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4
[  135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
[  135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0
[  135.256475] Code:
[  135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246
[  135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023
[  135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c
[  135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c
[  135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024
[  135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000
[  135.272394] FS:  00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[  135.274510] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0
[  135.277952] Call Trace:
[  135.278635]  <TASK>
[  135.279247]  ? preempt_count_add+0x6d/0xa0
[  135.280358]  ? percpu_counter_add_batch+0x55/0xb0
[  135.281612]  ? _raw_read_unlock+0x18/0x30
[  135.282704]  ext4_map_blocks+0x294/0x5a0
[  135.283745]  ? xa_load+0x6f/0xa0
[  135.284562]  ext4_mpage_readpages+0x3d6/0x770
[  135.285646]  read_pages+0x67/0x1d0
[  135.286492]  ? folio_add_lru+0x51/0x80
[  135.287441]  page_cache_ra_unbounded+0x124/0x170
[  135.288510]  filemap_get_pages+0x23d/0x5a0
[  135.289457]  ? path_openat+0xa72/0xdd0
[  135.290332]  filemap_read+0xbf/0x300
[  135.291158]  ? _raw_spin_lock_irqsave+0x17/0x40
[  135.292192]  new_sync_read+0x103/0x170
[  135.293014]  vfs_read+0x15d/0x180
[  135.293745]  ksys_read+0xa1/0xe0
[  135.294461]  do_syscall_64+0x3c/0x80
[  135.295284]  entry_SYSCALL_64_after_hwframe+0x46/0xb0

This patch simply adds an extra check in __ext4_ext_check(), verifying that
eh_entries is not 0 when eh_depth is > 0.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283
Cc: Baokun Li <libaokun1@huawei.com>
Cc: stable@kernel.org
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20220822094235.2690-1-lhenriques@suse.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: use buckets for cr 1 block scan instead of rbtree
Jan Kara [Thu, 8 Sep 2022 09:21:28 +0000 (11:21 +0200)]
ext4: use buckets for cr 1 block scan instead of rbtree

Using rbtree for sorting groups by average fragment size is relatively
expensive (needs rbtree update on every block freeing or allocation) and
leads to wide spreading of allocations because selection of block group
is very sentitive both to changes in free space and amount of blocks
allocated. Furthermore selecting group with the best matching average
fragment size is not necessary anyway, even more so because the
variability of fragment sizes within a group is likely large so average
is not telling much. We just need a group with large enough average
fragment size so that we have high probability of finding large enough
free extent and we don't want average fragment size to be too big so
that we are likely to find free extent only somewhat larger than what we
need.

So instead of maintaing rbtree of groups sorted by fragment size keep
bins (lists) or groups where average fragment size is in the interval
[2^i, 2^(i+1)). This structure requires less updates on block allocation
/ freeing, generally avoids chaotic spreading of allocations into block
groups, and still is able to quickly (even faster that the rbtree)
provide a block group which is likely to have a suitably sized free
space extent.

This patch reduces number of block groups used when untarring archive
with medium sized files (size somewhat above 64k which is default
mballoc limit for avoiding locality group preallocation) to about half
and thus improves write speeds for eMMC flash significantly.

Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Link: https://lore.kernel.org/r/20220908092136.11770-5-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: use locality group preallocation for small closed files
Jan Kara [Thu, 8 Sep 2022 09:21:27 +0000 (11:21 +0200)]
ext4: use locality group preallocation for small closed files

Curently we don't use any preallocation when a file is already closed
when allocating blocks (from writeback code when converting delayed
allocation). However for small files, using locality group preallocation
is actually desirable as that is not specific to a particular file.
Rather it is a method to pack small files together to reduce
fragmentation and for that the fact the file is closed is actually even
stronger hint the file would benefit from packing. So change the logic
to allow locality group preallocation in this case.

Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Link: https://lore.kernel.org/r/20220908092136.11770-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: make directory inode spreading reflect flexbg size
Jan Kara [Thu, 8 Sep 2022 09:21:26 +0000 (11:21 +0200)]
ext4: make directory inode spreading reflect flexbg size

Currently the Orlov inode allocator searches for free inodes for a
directory only in flex block groups with at most inodes_per_group/16
more directory inodes than average per flex block group. However with
growing size of flex block group this becomes unnecessarily strict.
Scale allowed difference from average directory count per flex block
group with flex block group size as we do with other metrics.

Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220908092136.11770-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: avoid unnecessary spreading of allocations among groups
Jan Kara [Thu, 8 Sep 2022 09:21:25 +0000 (11:21 +0200)]
ext4: avoid unnecessary spreading of allocations among groups

mb_set_largest_free_order() updates lists containing groups with largest
chunk of free space of given order. The way it updates it leads to
always moving the group to the tail of the list. Thus allocations
looking for free space of given order effectively end up cycling through
all groups (and due to initialization in last to first order). This
spreads allocations among block groups which reduces performance for
rotating disks or low-end flash media. Change
mb_set_largest_free_order() to only update lists if the order of the
largest free chunk in the group changed.

Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Link: https://lore.kernel.org/r/20220908092136.11770-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoext4: make mballoc try target group first even with mb_optimize_scan
Jan Kara [Thu, 8 Sep 2022 09:21:24 +0000 (11:21 +0200)]
ext4: make mballoc try target group first even with mb_optimize_scan

One of the side-effects of mb_optimize_scan was that the optimized
functions to select next group to try were called even before we tried
the goal group. As a result we no longer allocate files close to
corresponding inodes as well as we don't try to expand currently
allocated extent in the same group. This results in reaim regression
with workfile.disk workload of upto 8% with many clients on my test
machine:

                     baseline               mb_optimize_scan
Hmean     disk-1       2114.16 (   0.00%)     2099.37 (  -0.70%)
Hmean     disk-41     87794.43 (   0.00%)    83787.47 *  -4.56%*
Hmean     disk-81    148170.73 (   0.00%)   135527.05 *  -8.53%*
Hmean     disk-121   177506.11 (   0.00%)   166284.93 *  -6.32%*
Hmean     disk-161   220951.51 (   0.00%)   207563.39 *  -6.06%*
Hmean     disk-201   208722.74 (   0.00%)   203235.59 (  -2.63%)
Hmean     disk-241   222051.60 (   0.00%)   217705.51 (  -1.96%)
Hmean     disk-281   252244.17 (   0.00%)   241132.72 *  -4.41%*
Hmean     disk-321   255844.84 (   0.00%)   245412.84 *  -4.08%*

Also this is causing huge regression (time increased by a factor of 5 or
so) when untarring archive with lots of small files on some eMMC storage
cards.

Fix the problem by making sure we try goal group first.

Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/all/20220727105123.ckwrhbilzrxqpt24@quack3/
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220908092136.11770-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2 years agoLinux 6.0-rc4
Linus Torvalds [Sun, 4 Sep 2022 20:10:01 +0000 (13:10 -0700)]
Linux 6.0-rc4

2 years agoMerge tag 'powerpc-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 4 Sep 2022 18:33:22 +0000 (11:33 -0700)]
Merge tag 'powerpc-6.0-4' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix handling of PCI domains in /proc on 32-bit systems using the
   recently added support for numbering buses from zero for each domain.

 - A fix and a revert for some changes to use READ/WRITE_ONCE() which
   caused problems with KASAN enabled due to sanitisation calls being
   introduced in low-level paths that can't cope with it.

 - Fix build errors on 32-bit caused by the syscall table being
   misaligned sometimes.

 - Two fixes to get IBM Cell native machines booting again, which had
   bit-rotted while my QS22 was temporarily out of action.

 - Fix the papr_scm driver to not assume the order of events returned by
   the hypervisor is stable, and a related compile fix.

Thanks to Aneesh Kumar K.V, Christophe Leroy, Jordan Niethe, Kajol Jain,
Masahiro Yamada, Nathan Chancellor, Pali Rohár, Vaibhav Jain, and Zhouyi
Zhou.

* tag 'powerpc-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/papr_scm: Ensure rc is always initialized in papr_scm_pmu_register()
  Revert "powerpc/irq: Don't open code irq_soft_mask helpers"
  powerpc: Fix hard_irq_disable() with sanitizer
  powerpc/rtas: Fix RTAS MSR[HV] handling for Cell
  Revert "powerpc: Remove unused FW_FEATURE_NATIVE references"
  powerpc: align syscall table for ppc32
  powerpc/pci: Enable PCI domains in /proc when PCI bus numbers are not unique
  powerpc/papr_scm: Fix nvdimm event mappings

2 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 4 Sep 2022 18:27:14 +0000 (11:27 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "s390:

   - PCI interpretation compile fixes

  RISC-V:

   - fix unused variable warnings in vcpu_timer.c

   - move extern sbi_ext declarations to a header

  x86:

   - check validity of argument to KVM_SET_MP_STATE

   - use guest's global_ctrl to completely disable guest PEBS

   - fix a memory leak on memory allocation failure

   - mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES

   - fix build failure with Clang integrated assembler

   - fix MSR interception

   - always flush TLBs when enabling dirty logging"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: check validity of argument to KVM_SET_MP_STATE
  perf/x86/core: Completely disable guest PEBS via guest's global_ctrl
  KVM: x86: fix memoryleak in kvm_arch_vcpu_create()
  KVM: x86: Mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES
  KVM: s390: pci: Hook to access KVM lowlevel from VFIO
  riscv: kvm: move extern sbi_ext declarations to a header
  riscv: kvm: vcpu_timer: fix unused variable warnings
  KVM: selftests: Fix ambiguous mov in KVM_ASM_SAFE()
  KVM: selftests: Fix KVM_EXCEPTION_MAGIC build with Clang
  KVM: VMX: Heed the 'msr' argument in msr_write_intercepted()
  kvm: x86: mmu: Always flush TLBs when enabling dirty logging
  kvm: x86: mmu: Drop the need_remote_flush() function

2 years agoMakefile.extrawarn: re-enable -Wformat for clang; take 2
Nick Desaulniers [Thu, 1 Sep 2022 17:59:13 +0000 (10:59 -0700)]
Makefile.extrawarn: re-enable -Wformat for clang; take 2

-Wformat was recently re-enabled for builds with clang, then quickly
re-disabled, due to concerns stemming from the frequency of default
argument promotion related warning instances.

commit 258fafcd0683 ("Makefile.extrawarn: re-enable -Wformat for clang")
commit 21f9c8a13bb2 ("Revert "Makefile.extrawarn: re-enable -Wformat for clang"")

ISO WG14 has ratified N2562 to address default argument promotion
explicitly for printf, as part of the upcoming ISO C2X standard.

The behavior of clang was changed in clang-16 to not warn for the cited
cases in all language modes.

Add a version check, so that users of clang-16 now get the full effect
of -Wformat. For older clang versions, re-enable flags under the
-Wformat group that way users still get some useful checks related to
format strings, without noisy default argument promotion warnings. I
intentionally omitted -Wformat-y2k and -Wformat-security from being
re-enabled, which are also part of -Wformat in clang-16.

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Link: https://github.com/llvm/llvm-project/issues/57102
Link: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2562.pdf
Suggested-by: Justin Stitt <jstitt007@gmail.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Youngmin Nam <youngmin.nam@samsung.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'gpio-fixes-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Sep 2022 04:27:27 +0000 (21:27 -0700)]
Merge tag 'gpio-fixes-for-v6.0-rc4' of git://git./linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:
 "A a set of fixes from the GPIO subsystem.

  Most are small driver fixes except the realtek-otto driver patch which
  is pretty big but addresses a significant flaw that can cause the CPU
  to stay infinitely busy on uncleared ISR on some platforms.

  Summary:
   - MAINTAINERS update
   - fix resource leaks in gpio-mockup and gpio-pxa
   - add missing locking in gpio-pca953x
   - use 32-bit I/O in gpio-realtek-otto
   - make irq_chip structures immutable in four more drivers"

* tag 'gpio-fixes-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: ws16c48: Make irq_chip immutable
  gpio: 104-idio-16: Make irq_chip immutable
  gpio: 104-idi-48: Make irq_chip immutable
  gpio: 104-dio-48e: Make irq_chip immutable
  gpio: realtek-otto: switch to 32-bit I/O
  gpio: pca953x: Add mutex_lock for regcache sync in PM
  gpio: mockup: remove gpio debugfs when remove device
  gpio: pxa: use devres for the clock struct
  MAINTAINERS: rectify entry for XILINX GPIO DRIVER

2 years agogpio: ws16c48: Make irq_chip immutable
William Breathitt Gray [Fri, 2 Sep 2022 17:45:26 +0000 (13:45 -0400)]
gpio: ws16c48: Make irq_chip immutable

Kernel warns about mutable irq_chips:

    "not an immutable chip, please consider fixing!"

Make the struct irq_chip const, flag it as IRQCHIP_IMMUTABLE, add the
new helper functions, and call the appropriate gpiolib functions.

Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2 years agogpio: 104-idio-16: Make irq_chip immutable
William Breathitt Gray [Fri, 2 Sep 2022 17:45:25 +0000 (13:45 -0400)]
gpio: 104-idio-16: Make irq_chip immutable

Kernel warns about mutable irq_chips:

    "not an immutable chip, please consider fixing!"

Make the struct irq_chip const, flag it as IRQCHIP_IMMUTABLE, add the
new helper functions, and call the appropriate gpiolib functions.

Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2 years agogpio: 104-idi-48: Make irq_chip immutable
William Breathitt Gray [Fri, 2 Sep 2022 17:45:24 +0000 (13:45 -0400)]
gpio: 104-idi-48: Make irq_chip immutable

Kernel warns about mutable irq_chips:

    "not an immutable chip, please consider fixing!"

Make the struct irq_chip const, flag it as IRQCHIP_IMMUTABLE, add the
new helper functions, and call the appropriate gpiolib functions.

Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2 years agogpio: 104-dio-48e: Make irq_chip immutable
William Breathitt Gray [Fri, 2 Sep 2022 17:45:23 +0000 (13:45 -0400)]
gpio: 104-dio-48e: Make irq_chip immutable

Kernel warns about mutable irq_chips:

    "not an immutable chip, please consider fixing!"

Make the struct irq_chip const, flag it as IRQCHIP_IMMUTABLE, add the
new helper functions, and call the appropriate gpiolib functions.

Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2 years agoMerge tag 'for-linus-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 3 Sep 2022 20:23:11 +0000 (13:23 -0700)]
Merge tag 'for-linus-6.0-rc4-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - a minor fix for the Xen grant driver

 - a small series fixing a recently introduced problem in the Xen
   blkfront/blkback drivers with negotiation of feature usage

* tag 'for-linus-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/grants: prevent integer overflow in gnttab_dma_alloc_pages()
  xen-blkfront: Cache feature_persistent value before advertisement
  xen-blkfront: Advertise feature-persistent as user requested
  xen-blkback: Advertise feature-persistent as user requested

2 years agoMerge tag 'loongarch-fixes-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 3 Sep 2022 20:21:01 +0000 (13:21 -0700)]
Merge tag 'loongarch-fixes-6.0-2' of git://git./linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Fix several build errors or warnings, cleanup some code, and adjust
  arch_do_signal_or_restart() to adapt generic entry"

* tag 'loongarch-fixes-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: mm: Remove the unneeded result variable
  LoongArch: Fix arch_remove_memory() undefined build error
  LoongArch: Fix section mismatch due to acpi_os_ioremap()
  LoongArch: Improve dump_tlb() output messages
  LoongArch: Adjust arch_do_signal_or_restart() to adapt generic entry
  LoongArch: Avoid orphan input sections

2 years agoMerge tag 's390-6.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Sat, 3 Sep 2022 20:17:33 +0000 (13:17 -0700)]
Merge tag 's390-6.0-3' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - Update defconfigs

 - Fix linker script to align nospec tables correctly to avoid
   potentially unbootable kernel with some config options

 - Fix alignment check in prepare_hugepage_range() for 2GB hugepages to
   avoid BUG in __unmap_hugepage_range() for unaligned mappings later

 - Remove useless hugepage address alignment in hugetlb fault handling

* tag 's390-6.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/hugetlb: fix prepare_hugepage_range() check for 2 GB hugepages
  s390: update defconfigs
  s390: fix nospec table alignments
  s390/mm: remove useless hugepage address alignment

2 years agoMerge tag 'input-for-v6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor...
Linus Torvalds [Sat, 3 Sep 2022 20:09:46 +0000 (13:09 -0700)]
Merge tag 'input-for-v6.0-rc3' of git://git./linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

 - GT1158 ID added to Goodix touchscreen driver

 - Boeder Force Feedback Wheel USB added to iforce joystick driver

 - fixup for iforce driver to avoid hangups

 - fix autoloading of rk805-pwrkey driver.

* tag 'input-for-v6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: iforce - add support for Boeder Force Feedback Wheel
  Input: iforce - wake up after clearing IFORCE_XMIT_RUNNING flag
  Input: goodix - add compatible string for GT1158
  MAINTAINERS: add include/dt-bindings/input to INPUT DRIVERS
  Input: rk805-pwrkey - fix module autoloading
  Input: goodix - add support for GT1158
  dt-bindings: input: touchscreen: add compatible string for Goodix GT1158

2 years agoMerge tag 'tty-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sat, 3 Sep 2022 17:34:02 +0000 (10:34 -0700)]
Merge tag 'tty-6.0-rc4' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are some small tty/serial/vt driver fixes for 6.0-rc4 that
  resolve a number of reported issues:

   - n_gsm fixups for previous changes that caused problems

   - much-reported serdev crash fix that showed up in 6.0-rc1

   - vt font selection bugfix

   - kerneldoc build warning fixes

   - other tiny serial core fixes

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'tty-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: n_gsm: avoid call of sleeping functions from atomic context
  tty: n_gsm: replace kicktimer with delayed_work
  tty: n_gsm: initialize more members at gsm_alloc_mux()
  tty: n_gsm: add sanity check for gsm->receive in gsm_receive_buf()
  tty: serial: atmel: Preserve previous USART mode if RS485 disabled
  tty: serial: lpuart: disable flow control while waiting for the transmit engine to complete
  tty: Fix lookahead_buf crash with serdev
  serial: fsl_lpuart: RS485 RTS polariy is inverse
  vt: Clear selection before changing the font
  serial: document start_rx member at struct uart_ops

2 years agoMerge tag 'staging-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sat, 3 Sep 2022 17:32:17 +0000 (10:32 -0700)]
Merge tag 'staging-6.0-rc4' of git://git./linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
 "Here are three small staging driver fixes for 6.0-rc4 that resolve
  some reported problems and add some a device id:

   - new device id for r8188eu driver

   - use-after-free bugfixes for the rtl8712 driver

   - fix up firmware dependency problem for the r8188eu driver

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'staging-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: rtl8712: fix use after free bugs
  staging: r8188eu: Add Rosewill USB-N150 Nano to device tables
  staging: r8188eu: add firmware dependency

2 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 3 Sep 2022 17:27:25 +0000 (10:27 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "Here's a collection of primarily clk driver fixes, with a couple fixes
  to the core framework.

  We had to revert out a commit that affected boot on some devices that
  have the CLK_OPS_PARENT_ENABLE flag set. It isn't critical to have
  that fix so we'll try again next time.

  Driver side fixes include:

   - Plug an OF-node refcount bug in the TI clk driver

   - Fix the error handling in the raspberry pi firmware get_rate so
     that errors don't look like valid frequencies

   - Avoid going out of bounds in the raspberry pi driver too if the
     video firmware returns something we're not expecting"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  Revert "clk: core: Honor CLK_OPS_PARENT_ENABLE for clk gate ops"
  clk: bcm: rpi: Show clock id limit in error case
  clk: bcm: rpi: Add missing newline
  clk: bcm: rpi: Prevent out-of-bounds access
  clk: bcm: rpi: Fix error handling of raspberrypi_fw_get_rate
  clk: core: Fix runtime PM sequence in clk_core_unprepare()
  clk: core: Honor CLK_OPS_PARENT_ENABLE for clk gate ops
  clk: ti: Fix missing of_node_get() ti_find_clock_provider()

2 years agoMerge tag 'hwmon-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groec...
Linus Torvalds [Sat, 3 Sep 2022 17:24:30 +0000 (10:24 -0700)]
Merge tag 'hwmon-for-v6.0-rc4' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - Fix out of bounds access in gpio-fan driver

 - Fix VOUT margin caching in PMBus core

 - Avoid error message after -EPROBE_DEFER from devm_regulator_register()

* tag 'hwmon-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (gpio-fan) Fix array out of bounds access
  hwmon: (pmbus) Fix vout margin caching
  hwmon: (pmbus) Use dev_err_probe() to filter -EPROBE_DEFER error messages

2 years agomm: pagewalk: Fix race between unmap and page walker
Steven Price [Fri, 2 Sep 2022 11:26:12 +0000 (12:26 +0100)]
mm: pagewalk: Fix race between unmap and page walker

The mmap lock protects the page walker from changes to the page tables
during the walk.  However a read lock is insufficient to protect those
areas which don't have a VMA as munmap() detaches the VMAs before
downgrading to a read lock and actually tearing down PTEs/page tables.

For users of walk_page_range() the solution is to simply call pte_hole()
immediately without checking the actual page tables when a VMA is not
present. We now never call __walk_page_range() without a valid vma.

For walk_page_range_novma() the locking requirements are tightened to
require the mmap write lock to be taken, and then walking the pgd
directly with 'no_vma' set.

This in turn means that all page walkers either have a valid vma, or
it's that special 'novma' case for page table debugging.  As a result,
all the odd '(!walk->vma && !walk->no_vma)' tests can be removed.

Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoLoongArch: mm: Remove the unneeded result variable
ye xingchen [Fri, 26 Aug 2022 07:29:03 +0000 (07:29 +0000)]
LoongArch: mm: Remove the unneeded result variable

Return the value pa_to_nid() directly instead of storing it in another
redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Fix arch_remove_memory() undefined build error
Yupeng Li [Wed, 31 Aug 2022 05:40:17 +0000 (13:40 +0800)]
LoongArch: Fix arch_remove_memory() undefined build error

The kernel build error when unslected CONFIG_MEMORY_HOTREMOVE because
arch_remove_memory() is needed by mm/memory_hotplug.c but undefined.

Some build error messages like:

 LD      vmlinux.o
 MODPOST vmlinux.symvers
 MODINFO modules.builtin.modinfo
 GEN     modules.builtin
 LD      .tmp_vmlinux.kallsyms1
loongarch64-linux-gnu-ld: mm/memory_hotplug.o: in function `.L242':
memory_hotplug.c:(.ref.text+0x930): undefined reference to `arch_remove_memory'
make: *** [Makefile:1169:vmlinux] 错误 1

Removed CONFIG_MEMORY_HOTREMOVE requirement and rearrange the file refer
to the definitions of other platform architectures.

Signed-off-by: Yupeng Li <liyupeng@zbhlos.com>
Signed-off-by: Caicai <caizp2008@163.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Fix section mismatch due to acpi_os_ioremap()
Huacai Chen [Fri, 2 Sep 2022 14:33:42 +0000 (22:33 +0800)]
LoongArch: Fix section mismatch due to acpi_os_ioremap()

Now acpi_os_ioremap() is marked with __init because it calls memblock_
is_memory() which is also marked with __init in the !ARCH_KEEP_MEMBLOCK
case. However, acpi_os_ioremap() is called by ordinary functions such
as acpi_os_{read, write}_memory() and causes section mismatch warnings:

WARNING: modpost: vmlinux.o: section mismatch in reference: acpi_os_read_memory (section: .text) -> acpi_os_ioremap (section: .init.text)
WARNING: modpost: vmlinux.o: section mismatch in reference: acpi_os_write_memory (section: .text) -> acpi_os_ioremap (section: .init.text)

Fix these warnings by selecting ARCH_KEEP_MEMBLOCK unconditionally and
removing the __init modifier of acpi_os_ioremap(). This can also give a
chance to track "memory" and "reserved" memblocks after early boot.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Improve dump_tlb() output messages
Huacai Chen [Wed, 31 Aug 2022 06:22:43 +0000 (14:22 +0800)]
LoongArch: Improve dump_tlb() output messages

1, Use nr/nx to replace ri/xi;
2, Add 0x prefix for hexadecimal data.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Adjust arch_do_signal_or_restart() to adapt generic entry
Huacai Chen [Wed, 31 Aug 2022 03:19:27 +0000 (11:19 +0800)]
LoongArch: Adjust arch_do_signal_or_restart() to adapt generic entry

Commit 8ba62d37949e248c69 ("task_work: Call tracehook_notify_signal from
get_signal on all architectures") adjust arch_do_signal_or_restart() for
all architectures. LoongArch hasn't been upstream yet at that time and
can be still built successfully without adjustment because this function
has a weak version with the correct prototype. It is obviously that we
should convert LoongArch to use new API, otherwise some signal handlings
will be lost.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Avoid orphan input sections
Ard Biesheuvel [Wed, 24 Aug 2022 15:31:10 +0000 (17:31 +0200)]
LoongArch: Avoid orphan input sections

Ensure that all input sections are listed explicitly in the linker
script, and issue a warning otherwise. This ensures that the binary
image matches the PE/COFF and other image metadata exactly, which is
important for things like code signing.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoMerge tag 'block-6.0-2022-09-02' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 2 Sep 2022 23:44:30 +0000 (16:44 -0700)]
Merge tag 'block-6.0-2022-09-02' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - NVMe pull request via Christoph:
     - error handling fix for the new auth code (Hannes Reinecke)
     - fix unhandled tcp states in nvmet_tcp_state_change (Maurizio
       Lombardi)
     - add NVME_QUIRK_BOGUS_NID for Lexar NM610 (Shyamin Ayesh)

 - Add documentation for the ublk driver merged in this merge window
   (Ming)

* tag 'block-6.0-2022-09-02' of git://git.kernel.dk/linux-block:
  Documentation: document ublk
  nvmet-tcp: fix unhandled tcp states in nvmet_tcp_state_change()
  nvmet-auth: add missing goto in nvmet_setup_auth()
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM610

2 years agoMerge tag 'io_uring-6.0-2022-09-02' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 2 Sep 2022 23:37:01 +0000 (16:37 -0700)]
Merge tag 'io_uring-6.0-2022-09-02' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

 - A single fix for over-eager retries for networking (Pavel)

 - Revert the notification slot support for zerocopy sends.

   It turns out that even after more than a year or development and
   testing, there's not full agreement on whether just using plain
   ordered notifications is Good Enough to avoid the complexity of using
   the notifications slots. Because of that, we decided that it's best
   left to a future final decision.

   We can always bring back this feature, but we can't really change it
   or remove it once we've released 6.0 with it enabled. The reverts
   leave the usual CQE notifications as the primary interface for
   knowing when data was sent, and when it was acked. (Pavel)

* tag 'io_uring-6.0-2022-09-02' of git://git.kernel.dk/linux-block:
  selftests/net: return back io_uring zc send tests
  io_uring/net: simplify zerocopy send user API
  io_uring/notif: remove notif registration
  Revert "io_uring: rename IORING_OP_FILES_UPDATE"
  Revert "io_uring: add zc notification flush requests"
  selftests/net: temporarily disable io_uring zc test
  io_uring/net: fix overexcessive retries

2 years agoMerge tag '6.0-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Fri, 2 Sep 2022 23:20:24 +0000 (16:20 -0700)]
Merge tag '6.0-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Five fixes, all also marked for stable:

   - fixes for collapse range and insert range (also fixes xfstest
     generic/031)

   - memory leak fix"

* tag '6.0-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix small mempool leak in SMB2_negotiate()
  smb3: use filemap_write_and_wait_range instead of filemap_write_and_wait
  smb3: fix temporary data corruption in insert range
  smb3: fix temporary data corruption in collapse range
  smb3: Move the flush out of smb2_copychunk_range() into its callers

2 years agoMerge tag 'landlock-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic...
Linus Torvalds [Fri, 2 Sep 2022 22:24:08 +0000 (15:24 -0700)]
Merge tag 'landlock-6.0-rc4' of git://git./linux/kernel/git/mic/linux

Pull landlock fix from Mickaël Salaün:
 "This fixes a mis-handling of the LANDLOCK_ACCESS_FS_REFER right when
  multiple rulesets/domains are stacked.

  The expected behaviour was that an additional ruleset can only
  restrict the set of permitted operations, but in this particular case,
  it was potentially possible to re-gain the LANDLOCK_ACCESS_FS_REFER
  right"

* tag 'landlock-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Fix file reparenting without explicit LANDLOCK_ACCESS_FS_REFER

2 years agoMerge tag 'mmc-v6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Fri, 2 Sep 2022 22:03:12 +0000 (15:03 -0700)]
Merge tag 'mmc-v6.0-rc2' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:

 - Fix workaround for SD UHS-I voltage switch

* tag 'mmc-v6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: core: Fix inconsistent sd3_bus_mode at UHS-I SD voltage switch failure
  mmc: core: Fix UHS-I SD 1.8V workaround branch

2 years agoMerge tag 'drm-fixes-2022-09-02' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 2 Sep 2022 21:56:09 +0000 (14:56 -0700)]
Merge tag 'drm-fixes-2022-09-02' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Regular fixes pull. One core dma-buf fix, then two weeks of i915
  fixes, a lot of amdgpu fixes mostly for new IP, and a bunch of msm
  fixes, mostly modesetting ones.

  Nothing seems too bad at this point.

  dma-buf/dma-resv:
   - Fence-handling fix

  i915:
   - GVT fixes including fix for a CommetLake regression in mmio table
     and misc doc and typo fixes
   - Fix CCS handling
   - Fix for guc requests after reset
   - Display DSI related fixes
   - Display backlight related fixes
   - Fix for a null pointer dereference
   - HDMI related quirk for ECS Liva Q2 with GLK graphics
   - Skip wm/ddb readout for disabled pipes

  amdgpu:
   - FRU error message fix
   - MES 11 updates
   - DCN 3.2.x fixes
   - DCN 3.1.4 fixes
   - Fix possible use after free in CS IOCTL
   - SMU 13.0.x fixes
   - Fix iolink reporting on devices with direct connections to CPU
   - GFX10 tap delay firmware fixes

  msm:
   - Fix for inconsistent indenting in msm_dsi_dphy_timing_calc_v3().
   - Fix to make eDP the first connector in the connected list.
   - Fix to populate intf_cfg correctly before calling reset_intf_cfg().
   - Specify the correct number of DSI regulators for SDM660.
   - Specify the correct number of DSI regulators for MSM8996.
   - Fix for removing DP_RECOVERED_CLOCK_OUT_EN bit for tps4 link training
   - Fix probe-deferral crash in gpu devfreq
   - Fix gpu debugfs deadlock"

* tag 'drm-fixes-2022-09-02' of git://anongit.freedesktop.org/drm/drm: (51 commits)
  drm/amd/amdgpu: skip ucode loading if ucode_size == 0
  drm/amdgpu: only init tap_delay ucode when it's included in ucode binary
  drm/amd/display: Fix black flash when switching from ODM2to1 to ODMBypass
  drm/amd/display: Fix check for stream and plane
  drm/amd/display: Re-initialize viewport after pipe merge
  drm/amd/display: Use correct plane for CAB cursor size allocation
  drm/amdgpu: ensure no PCIe peer access for CPU XGMI iolinks
  drm/amd/pm: bump SMU 13.0.0 driver_if header version
  drm/amd/pm: use vbios carried pptable for all SMU13.0.7 SKUs
  drm/amd/pm: use vbios carried pptable for those supported SKUs
  drm/amd/display: fix wrong register access
  drm/amd/display: use actual cursor size instead of max for CAB allocation
  drm/amd/display: disable display fresh from MALL on an edge case for DCN321
  drm/amd/display: Fix CAB cursor size allocation for DCN32/321
  drm/amd/display: Missing HPO instance added
  drm/amd/display: set dig fifo read start level to 7 before dig fifo reset
  drm/amdgpu: Fix use-after-free in amdgpu_cs_ioctl
  drm/amd/display: Fix OTG H timing reset for dcn314
  drm/amd/display: Fix DCN32 DPSTREAMCLK_CNTL programming
  drm/amdgpu: Update mes_v11_api_def.h
  ...

2 years agoMerge tag 'driver-core-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 2 Sep 2022 17:55:23 +0000 (10:55 -0700)]
Merge tag 'driver-core-6.0-rc4' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are some small driver core fixes for some oft-reported problems
  in 6.0-rc1.  They include:

   - a bunch of reverts to handle driver_deferred_probe_check_state()
     problems that were part of the 6.0-rc1 merge.

   - firmware_loader bugfixes now that the code is being properly tested
     and used by others

   - arch_topology fix

   - deferred driver probe bugfix to solve a long-suffering amba bus
     problem that many people have reported.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'driver-core-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  firmware_loader: Fix memory leak in firmware upload
  firmware_loader: Fix use-after-free during unregister
  arch_topology: Silence early cacheinfo errors when non-existent
  driver core: Don't probe devices after bus_type.match() probe deferral
  Revert "iommu/of: Delete usage of driver_deferred_probe_check_state()"
  Revert "PM: domains: Delete usage of driver_deferred_probe_check_state()"
  Revert "net: mdio: Delete usage of driver_deferred_probe_check_state()"
  Revert "driver core: Delete driver_deferred_probe_check_state()"

2 years agoMerge tag 'char-misc-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 2 Sep 2022 17:50:08 +0000 (10:50 -0700)]
Merge tag 'char-misc-6.0-rc4' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small char/misc and other driver fixes for 6.0-rc4.

  Included in here are:

   - binder fixes for previous fixes, and a few more fixes uncovered by
     them.

   - iio driver fixes

   - soundwire driver fixes

   - fastrpc driver fixes for memory corruption on some hardware

   - peci driver fix

   - mhi driver fix

  All of these have been in linux-next with no reported problems"

* tag 'char-misc-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  binder: fix alloc->vma_vm_mm null-ptr dereference
  misc: fastrpc: increase maximum session count
  misc: fastrpc: fix memory corruption on open
  misc: fastrpc: fix memory corruption on probe
  soundwire: qcom: fix device status array range
  bus: mhi: host: Fix up null pointer access in mhi_irq_handler
  soundwire: qcom: remove duplicate reset control get
  iio: light: cm32181: make cm32181_pm_ops static
  iio: ad7292: Prevent regulator double disable
  dt-bindings: iio: gyroscope: bosch,bmg160: correct number of pins
  iio: adc: mcp3911: use correct formula for AD conversion
  iio: adc: mcp3911: correct "microchip,device-addr" property
  Revert "binder_alloc: Add missing mmap_lock calls when using the VMA"
  binder_alloc: Add missing mmap_lock calls when using the VMA
  binder: fix UAF of ref->proc caused by race condition
  iio: light: cm3605: Fix an error handling path in cm3605_probe()
  iio: adc: mcp3911: make use of the sign bit
  peci: cpu: Fix use-after-free in adev_release()
  peci: aspeed: fix error check return value of platform_get_irq()

2 years agoMerge tag 'usb-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Fri, 2 Sep 2022 17:43:46 +0000 (10:43 -0700)]
Merge tag 'usb-6.0-rc4' of git://git./linux/kernel/git/gregkh/usb

Pull USB/Thunderbolt driver fixes from Greg KH:
 "Here are a lot of small USB and Thunderbolt driver fixes for 6.0-rc4
  for reported problems. Included in here are:

   - new usb-serial driver ids

   - dwc3 driver bugfixes for reported problems with 6.0-rc1

   - new device quirks, and reverts of some quirks that were incorrect

   - gadget driver bugfixes for reported problems

   - USB host controller bugfixes (xhci and others)

   - other small USB fixes, details in the shortlog

   - small thunderbolt driver fixes

  All of these have been in linux-next with no reported issues"

* tag 'usb-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (51 commits)
  Revert "usb: gadget: udc-xilinx: replace memcpy with memcpy_toio"
  usb: storage: Add ASUS <0x0b05:0x1932> to IGNORE_UAS
  USB: serial: ch341: fix disabled rx timer on older devices
  USB: serial: ch341: fix lost character on LCR updates
  USB: serial: cp210x: add Decagon UCA device id
  Revert "usb: add quirks for Lenovo OneLink+ Dock"
  usb: cdns3: fix issue with rearming ISO OUT endpoint
  usb: cdns3: fix incorrect handling TRB_SMM flag for ISOC transfer
  usb: gadget: mass_storage: Fix cdrom data transfers on MAC-OS
  media: mceusb: Use new usb_control_msg_*() routines
  USB: core: Prevent nested device-reset calls
  USB: gadget: Fix obscure lockdep violation for udc_mutex
  usb: dwc2: fix wrong order of phy_power_on and phy_init
  usb: gadget: udc-xilinx: replace memcpy with memcpy_toio
  usb: typec: Remove retimers properly
  usb: dwc3: disable USB core PHY management
  usb: add quirks for Lenovo OneLink+ Dock
  USB: serial: option: add support for Cinterion MV32-WA/WB RmNet mode
  USB: serial: ftdi_sio: add Omron CS1W-CIF31 device id
  USB: serial: option: add Quectel EM060K modem
  ...

2 years agoMerge tag 'platform-drivers-x86-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 2 Sep 2022 17:35:51 +0000 (10:35 -0700)]
Merge tag 'platform-drivers-x86-v6.0-2' of git://git./linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:
 "Various small fixes and hardware-id additions"

* tag 'platform-drivers-x86-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: p2sb: Fix UAF when caller uses resource name
  platform/x86: asus-wmi: Increase FAN_CURVE_BUF_LEN to 32
  platform/mellanox: Remove redundant 'NULL' check
  platform/mellanox: Remove unnecessary code
  platform/mellanox: mlxreg-lc: Fix locking issue
  platform/mellanox: mlxreg-lc: Fix coverity warning
  platform/x86: acer-wmi: Acer Aspire One AOD270/Packard Bell Dot keymap fixes
  platform/x86: thinkpad_acpi: Explicitly set to balanced mode on startup
  platform/x86: asus-wmi: Fix the name of the mic-mute LED classdev
  platform/surface: aggregator_registry: Add HID devices for sensors and UCSI client to SP8
  platform/surface: aggregator_registry: Rename HID device nodes based on new findings
  platform/surface: aggregator_registry: Rename HID device nodes based on their function
  platform/surface: aggregator_registry: Add support for Surface Laptop Go 2
  platform/x86: x86-android-tablets: Fix broken touchscreen on Chuwi Hi8 with Windows BIOS
  platform/x86: pmc_atom: Fix SLP_TYPx bitfield mask

2 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 2 Sep 2022 17:32:30 +0000 (10:32 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "It's a lot smaller than last week, with the star of the show being a
  couple of fixes to head.S addressing a boot regression introduced by
  the recent overhaul of that code in non-default configurations (i.e.
  KASLR disabled).

  The first of those two resolves the issue reported (and bisected) by
  Mikulus in the wait_on_bit() thread.

  Summary:

   - Fix two boot issues caused by the recent head.S rework when !KASLR

   - Fix calculation of crashkernel memory reservation

   - Fix bogus error check in PMU IRQ probing code"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: Reserve enough pages for the initial ID map
  perf/arm_pmu_platform: fix tests for platform_get_irq() failure
  arm64: head: Ignore bogus KASLR displacement on non-relocatable kernels
  arm64/kexec: Fix missing extra range for crashkres_low.

2 years agoDocumentation: document ublk
Ming Lei [Fri, 2 Sep 2022 15:23:02 +0000 (23:23 +0800)]
Documentation: document ublk

Add documentation for ublk subsystem. It was supposed to be documented when
merging the driver, but missing at that time.

Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
[axboe: correct MAINTAINERS addition]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agolandlock: Fix file reparenting without explicit LANDLOCK_ACCESS_FS_REFER
Mickaël Salaün [Wed, 31 Aug 2022 20:38:40 +0000 (22:38 +0200)]
landlock: Fix file reparenting without explicit LANDLOCK_ACCESS_FS_REFER

This change fixes a mis-handling of the LANDLOCK_ACCESS_FS_REFER right
when multiple rulesets/domains are stacked. The expected behaviour was
that an additional ruleset can only restrict the set of permitted
operations, but in this particular case, it was potentially possible to
re-gain the LANDLOCK_ACCESS_FS_REFER right.

With the introduction of LANDLOCK_ACCESS_FS_REFER, we added the first
globally denied-by-default access right.  Indeed, this lifted an initial
Landlock limitation to rename and link files, which was initially always
denied when the source or the destination were different directories.

This led to an inconsistent backward compatibility behavior which was
only taken into account if no domain layer were using the new
LANDLOCK_ACCESS_FS_REFER right. However, when restricting a thread with
a new ruleset handling LANDLOCK_ACCESS_FS_REFER, all inherited parent
rulesets/layers not explicitly handling LANDLOCK_ACCESS_FS_REFER would
behave as if they were handling this access right and with all their
rules allowing it. This means that renaming and linking files could
became allowed by these parent layers, but all the other required
accesses must also be granted: all layers must allow file removal or
creation, and renaming and linking operations cannot lead to privilege
escalation according to the Landlock policy.  See detailed explanation
in commit b91c3e4ea756 ("landlock: Add support for file reparenting with
LANDLOCK_ACCESS_FS_REFER").

To say it another way, this bug may lift the renaming and linking
limitations of the initial Landlock version, and a same ruleset can
enforce different restrictions depending on previous or next enforced
ruleset (i.e. inconsistent behavior). The LANDLOCK_ACCESS_FS_REFER right
cannot give access to data not already allowed, but this doesn't follow
the contract of the first Landlock ABI. This fix puts back the
limitation for sandboxes that didn't opt-in for this additional right.

For instance, if a first ruleset allows LANDLOCK_ACCESS_FS_MAKE_REG on
/dst and LANDLOCK_ACCESS_FS_REMOVE_FILE on /src, renaming /src/file to
/dst/file is denied. However, without this fix, stacking a new ruleset
which allows LANDLOCK_ACCESS_FS_REFER on / would now permit the
sandboxed thread to rename /src/file to /dst/file .

This change fixes the (absolute) rule access rights, which now always
forbid LANDLOCK_ACCESS_FS_REFER except when it is explicitly allowed
when creating a rule.

Making all domain handle LANDLOCK_ACCESS_FS_REFER was an initial
approach but there is two downsides:
* it makes the code more complex because we still want to check that a
  rule allowing LANDLOCK_ACCESS_FS_REFER is legitimate according to the
  ruleset's handled access rights (i.e. ABI v1 != ABI v2);
* it would not allow to identify if the user created a ruleset
  explicitly handling LANDLOCK_ACCESS_FS_REFER or not, which will be an
  issue to audit Landlock.

Instead, this change adds an ACCESS_INITIALLY_DENIED list of
denied-by-default rights, which (only) contains
LANDLOCK_ACCESS_FS_REFER.  All domains are treated as if they are also
handling this list, but without modifying their fs_access_masks field.

A side effect is that the errno code returned by rename(2) or link(2)
*may* be changed from EXDEV to EACCES according to the enforced
restrictions.  Indeed, we now have the mechanic to identify if an access
is denied because of a required right (e.g. LANDLOCK_ACCESS_FS_MAKE_REG,
LANDLOCK_ACCESS_FS_REMOVE_FILE) or if it is denied because of missing
LANDLOCK_ACCESS_FS_REFER rights.  This may result in different errno
codes than for the initial Landlock version, but this approach is more
consistent and better for rename/link compatibility reasons, and it
wasn't possible before (hence no backport to ABI v1).  The
layout1.rename_file test reflects this change.

Add 4 layout1.refer_denied_by_default* test suites to check that the
behavior of a ruleset not handling LANDLOCK_ACCESS_FS_REFER (ABI v1) is
unchanged even if another layer handles LANDLOCK_ACCESS_FS_REFER (i.e.
ABI v1 precedence).  Make sure rule's absolute access rights are correct
by testing with and without a matching path.  Add test_rename() and
test_exchange() helpers.

Extend layout1.inval tests to check that a denied-by-default access
right is not necessarily part of a domain's handled access rights.

Test coverage for security/landlock is 95.3% of 599 lines according to
gcc/gcov-11.

Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER")
Reviewed-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20220831203840.1370732-1-mic@digikod.net
Cc: stable@vger.kernel.org
[mic: Constify and slightly simplify test helpers]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2 years agoxen/grants: prevent integer overflow in gnttab_dma_alloc_pages()
Dan Carpenter [Thu, 1 Sep 2022 15:35:20 +0000 (18:35 +0300)]
xen/grants: prevent integer overflow in gnttab_dma_alloc_pages()

The change from kcalloc() to kvmalloc() means that arg->nr_pages
might now be large enough that the "args->nr_pages << PAGE_SHIFT" can
result in an integer overflow.

Fixes: b3f7931f5c61 ("xen/gntdev: switch from kcalloc() to kvcalloc()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/YxDROJqu/RPvR0bi@kili
Signed-off-by: Juergen Gross <jgross@suse.com>
2 years agoxen-blkfront: Cache feature_persistent value before advertisement
SeongJae Park [Wed, 31 Aug 2022 16:58:24 +0000 (16:58 +0000)]
xen-blkfront: Cache feature_persistent value before advertisement

Xen blkfront advertises its support of the persistent grants feature
when it first setting up and when resuming in 'talk_to_blkback()'.
Then, blkback reads the advertised value when it connects with blkfront
and decides if it will use the persistent grants feature or not, and
advertises its decision to blkfront.  Blkfront reads the blkback's
decision and it also makes the decision for the use of the feature.

Commit 402c43ea6b34 ("xen-blkfront: Apply 'feature_persistent' parameter
when connect"), however, made the blkfront's read of the parameter for
disabling the advertisement, namely 'feature_persistent', to be done
when it negotiate, not when advertise.  Therefore blkfront advertises
without reading the parameter.  As the field for caching the parameter
value is zero-initialized, it always advertises as the feature is
disabled, so that the persistent grants feature becomes always disabled.

This commit fixes the issue by making the blkfront does parmeter caching
just before the advertisement.

Fixes: 402c43ea6b34 ("xen-blkfront: Apply 'feature_persistent' parameter when connect")
Cc: <stable@vger.kernel.org> # 5.10.x
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220831165824.94815-4-sj@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2 years agoxen-blkfront: Advertise feature-persistent as user requested
SeongJae Park [Wed, 31 Aug 2022 16:58:23 +0000 (16:58 +0000)]
xen-blkfront: Advertise feature-persistent as user requested

The advertisement of the persistent grants feature (writing
'feature-persistent' to xenbus) should mean not the decision for using
the feature but only the availability of the feature.  However, commit
74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent
grants") made a field of blkfront, which was a place for saving only the
negotiation result, to be used for yet another purpose: caching of the
'feature_persistent' parameter value.  As a result, the advertisement,
which should follow only the parameter value, becomes inconsistent.

This commit fixes the misuse of the semantic by making blkfront saves
the parameter value in a separate place and advertises the support based
on only the saved value.

Fixes: 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants")
Cc: <stable@vger.kernel.org> # 5.10.x
Suggested-by: Juergen Gross <jgross@suse.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220831165824.94815-3-sj@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2 years agoxen-blkback: Advertise feature-persistent as user requested
SeongJae Park [Wed, 31 Aug 2022 16:58:22 +0000 (16:58 +0000)]
xen-blkback: Advertise feature-persistent as user requested

The advertisement of the persistent grants feature (writing
'feature-persistent' to xenbus) should mean not the decision for using
the feature but only the availability of the feature.  However, commit
aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent
grants") made a field of blkback, which was a place for saving only the
negotiation result, to be used for yet another purpose: caching of the
'feature_persistent' parameter value.  As a result, the advertisement,
which should follow only the parameter value, becomes inconsistent.

This commit fixes the misuse of the semantic by making blkback saves the
parameter value in a separate place and advertises the support based on
only the saved value.

Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants")
Cc: <stable@vger.kernel.org> # 5.10.x
Suggested-by: Juergen Gross <jgross@suse.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220831165824.94815-2-sj@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2 years agopowerpc/papr_scm: Ensure rc is always initialized in papr_scm_pmu_register()
Nathan Chancellor [Tue, 30 Aug 2022 15:12:56 +0000 (08:12 -0700)]
powerpc/papr_scm: Ensure rc is always initialized in papr_scm_pmu_register()

Clang warns:

  arch/powerpc/platforms/pseries/papr_scm.c:492:6: warning: variable 'rc' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
          if (!p->stat_buffer_len)
              ^~~~~~~~~~~~~~~~~~~
  arch/powerpc/platforms/pseries/papr_scm.c:523:64: note: uninitialized use occurs here
          dev_info(&p->pdev->dev, "nvdimm pmu didn't register rc=%d\n", rc);
                                                                        ^~
  include/linux/dev_printk.h:150:67: note: expanded from macro 'dev_info'
          dev_printk_index_wrap(_dev_info, KERN_INFO, dev, dev_fmt(fmt), ##__VA_ARGS__)
                                                                          ^~~~~~~~~~~
  include/linux/dev_printk.h:110:23: note: expanded from macro 'dev_printk_index_wrap'
                  _p_func(dev, fmt, ##__VA_ARGS__);                       \
                                      ^~~~~~~~~~~
  arch/powerpc/platforms/pseries/papr_scm.c:492:2: note: remove the 'if' if its condition is always false
          if (!p->stat_buffer_len)
          ^~~~~~~~~~~~~~~~~~~~~~~~
  arch/powerpc/platforms/pseries/papr_scm.c:484:8: note: initialize the variable 'rc' to silence this warning
          int rc, nodeid;
                ^
                = 0
  1 warning generated.

The call to papr_scm_pmu_check_events() was eliminated but a return code
was not added to the if statement. Add the same return code from
papr_scm_pmu_check_events() for this condition so there is no more
warning.

Fixes: 9b1ac04698a4 ("powerpc/papr_scm: Fix nvdimm event mappings")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/1701
Link: https://lore.kernel.org/r/20220830151256.1473169-1-nathan@kernel.org
2 years agoRevert "powerpc/irq: Don't open code irq_soft_mask helpers"
Michael Ellerman [Wed, 31 Aug 2022 13:10:52 +0000 (23:10 +1000)]
Revert "powerpc/irq: Don't open code irq_soft_mask helpers"

This reverts commit ef5b570d3700fbb8628a58da0487486ceeb713cd.

Zhouyi reported that commit is causing crashes when running rcutorture
with KASAN enabled:

  BUG: using smp_processor_id() in preemptible [00000000] code: rcu_torture_rea/100
  caller is rcu_preempt_deferred_qs_irqrestore+0x74/0xed0
  CPU: 4 PID: 100 Comm: rcu_torture_rea Tainted: G        W          5.19.0-rc5-next-20220708-dirty #253
  Call Trace:
    dump_stack_lvl+0xbc/0x108 (unreliable)
    check_preemption_disabled+0x154/0x160
    rcu_preempt_deferred_qs_irqrestore+0x74/0xed0
    __rcu_read_unlock+0x290/0x3b0
    rcu_torture_read_unlock+0x30/0xb0
    rcutorture_one_extend+0x198/0x810
    rcu_torture_one_read+0x58c/0xc90
    rcu_torture_reader+0x12c/0x360
    kthread+0x1e8/0x220
    ret_from_kernel_thread+0x5c/0x64

KASAN will generate instrumentation instructions around the
WRITE_ONCE(local_paca->irq_soft_mask, mask):

   0xc000000000295cb0 <+0>: addis   r2,r12,774
   0xc000000000295cb4 <+4>: addi    r2,r2,16464
   0xc000000000295cb8 <+8>: mflr    r0
   0xc000000000295cbc <+12>: bl      0xc00000000008bb4c <mcount>
   0xc000000000295cc0 <+16>: mflr    r0
   0xc000000000295cc4 <+20>: std     r31,-8(r1)
   0xc000000000295cc8 <+24>: addi    r3,r13,2354
   0xc000000000295ccc <+28>: mr      r31,r13
   0xc000000000295cd0 <+32>: std     r0,16(r1)
   0xc000000000295cd4 <+36>: stdu    r1,-48(r1)
   0xc000000000295cd8 <+40>: bl      0xc000000000609b98 <__asan_store1+8>
   0xc000000000295cdc <+44>: nop
   0xc000000000295ce0 <+48>: li      r9,1
   0xc000000000295ce4 <+52>: stb     r9,2354(r31)
   0xc000000000295ce8 <+56>: addi    r1,r1,48
   0xc000000000295cec <+60>: ld      r0,16(r1)
   0xc000000000295cf0 <+64>: ld      r31,-8(r1)
   0xc000000000295cf4 <+68>: mtlr    r0

If there is a context switch before "stb     r9,2354(r31)", r31 may
not equal to r13, in such case, irq soft mask will not work.

The usual solution of marking the code ineligible for instrumentation
forces the code out-of-line, which we would prefer to avoid. Christophe
proposed a partial revert, but Nick raised some concerns with that. So
for now do a full revert.

Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
[mpe: Construct change log based on Zhouyi's original report]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220831131052.42250-1-mpe@ellerman.id.au
2 years agoRevert "usb: gadget: udc-xilinx: replace memcpy with memcpy_toio"
Greg Kroah-Hartman [Fri, 2 Sep 2022 07:10:08 +0000 (09:10 +0200)]
Revert "usb: gadget: udc-xilinx: replace memcpy with memcpy_toio"

This reverts commit 8cb339f1c1f04baede9d54c1e40ac96247a6393b as it
throws up a bunch of sparse warnings as reported by the kernel test
robot.

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/202209020044.CX2PfZzM-lkp@intel.com
Fixes: 8cb339f1c1f0 ("usb: gadget: udc-xilinx: replace memcpy with memcpy_toio")
Cc: stable@vger.kernel.org
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Piyush Mehta <piyush.mehta@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agoMerge tag 'soundwire-6.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Greg Kroah-Hartman [Fri, 2 Sep 2022 06:59:45 +0000 (08:59 +0200)]
Merge tag 'soundwire-6.0-fixes' of git://git./linux/kernel/git/vkoul/soundwire into char-misc-linus

Vinod writes:
  "soundwire fixes for v6.0

   This contains two fixes to qcom sdw driver which resolve duplicate reset
   control get and second one fixes device array indices."

* tag 'soundwire-6.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: qcom: fix device status array range
  soundwire: qcom: remove duplicate reset control get

2 years agoMerge tag 'drm-intel-fixes-2022-09-01' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 2 Sep 2022 01:26:29 +0000 (11:26 +1000)]
Merge tag 'drm-intel-fixes-2022-09-01' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix for a null pointer dereference (Lukasz)
- HDMI related quirk for ECS Liva Q2 with GLK graphics (Diego)
- Skip wm/ddb readout for disabled pipes (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YxC3GmSOpDiZTdIJ@intel.com
2 years agoMerge tag 'kvm-s390-master-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Paolo Bonzini [Thu, 1 Sep 2022 23:21:27 +0000 (19:21 -0400)]
Merge tag 'kvm-s390-master-6.0-1' of git://git./linux/kernel/git/kvms390/linux into HEAD

PCI interpretation compile fixes

2 years agoMerge tag 'kvm-riscv-fixes-6.0-1' of https://github.com/kvm-riscv/linux into HEAD
Paolo Bonzini [Thu, 1 Sep 2022 23:21:09 +0000 (19:21 -0400)]
Merge tag 'kvm-riscv-fixes-6.0-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv fixes for 6.0, take #1

- Fix unused variable warnings in vcpu_timer.c
- Move extern sbi_ext declarations to a header

2 years agoKVM: x86: check validity of argument to KVM_SET_MP_STATE
Paolo Bonzini [Thu, 11 Aug 2022 16:41:25 +0000 (12:41 -0400)]
KVM: x86: check validity of argument to KVM_SET_MP_STATE

An invalid argument to KVM_SET_MP_STATE has no effect other than making the
vCPU fail to run at the next KVM_RUN.  Since it is extremely unlikely that
any userspace is relying on it, fail with -EINVAL just like for other
architectures.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoperf/x86/core: Completely disable guest PEBS via guest's global_ctrl
Like Xu [Wed, 31 Aug 2022 03:35:24 +0000 (11:35 +0800)]
perf/x86/core: Completely disable guest PEBS via guest's global_ctrl

When a guest PEBS counter is cross-mapped by a host counter, software
will remove the corresponding bit in the arr[global_ctrl].guest and
expect hardware to perform a change of state "from enable to disable"
via the msr_slot[] switch during the vmx transaction.

The real world is that if user adjust the counter overflow value small
enough, it still opens a tiny race window for the previously PEBS-enabled
counter to write cross-mapped PEBS records into the guest's PEBS buffer,
when arr[global_ctrl].guest has been prioritised (switch_msr_special stuff)
to switch into the enabled state, while the arr[pebs_enable].guest has not.

Close this window by clearing invalid bits in the arr[global_ctrl].guest.

Cc: linux-perf-users@vger.kernel.org
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Fixes: 854250329c02 ("KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations")
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220831033524.58561-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoKVM: x86: fix memoryleak in kvm_arch_vcpu_create()
Miaohe Lin [Thu, 1 Sep 2022 12:23:00 +0000 (20:23 +0800)]
KVM: x86: fix memoryleak in kvm_arch_vcpu_create()

When allocating memory for mci_ctl2_banks fails, KVM doesn't release
mce_banks leading to memoryleak. Fix this issue by calling kfree()
for it when kcalloc() fails.

Fixes: 281b52780b57 ("KVM: x86: Add emulation for MSR_IA32_MCx_CTL2 MSRs.")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Message-Id: <20220901122300.22298-1-linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoKVM: x86: Mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES
Jim Mattson [Tue, 30 Aug 2022 17:49:47 +0000 (10:49 -0700)]
KVM: x86: Mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES

KVM should not claim to virtualize unknown IA32_ARCH_CAPABILITIES
bits. When kvm_get_arch_capabilities() was originally written, there
were only a few bits defined in this MSR, and KVM could virtualize all
of them. However, over the years, several bits have been defined that
KVM cannot just blindly pass through to the guest without additional
work (such as virtualizing an MSR promised by the
IA32_ARCH_CAPABILITES feature bit).

Define a mask of supported IA32_ARCH_CAPABILITIES bits, and mask off
any other bits that are set in the hardware MSR.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Fixes: 5b76a3cff011 ("KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20220830174947.2182144-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoMerge tag 'drm-msm-fixes-2022-08-27' of https://gitlab.freedesktop.org/drm/msm into...
Dave Airlie [Thu, 1 Sep 2022 19:58:52 +0000 (05:58 +1000)]
Merge tag 'drm-msm-fixes-2022-08-27' of https://gitlab.freedesktop.org/drm/msm into drm-fixes

Fixes for v6.0

- Fix for inconsistent indenting in function msm_dsi_dphy_timing_calc_v3.
  This fixes a smatch warning reported by kbot
- Fix to make eDP the first connector in the connected list. This was
  mainly done to address a screen corruption issue we were seeing on
  sc7280 boards which have eDP as the primary display. The corruption
  itself is from usermode but we decided to fix it this way because
  things work correct with the primary display as the first one for
  usermode
- Fix to populate intf_cfg correctly before calling reset_intf_cfg().
  Without this, the display pipeline is not torn down correctly for
  writeback
- Specify the correct number of DSI regulators for SDM660. It should
  have been 1 but 2 was mentioned
- Specify the correct number of DSI regulators for MSM8996. It should
  have been 3 but 2 was mentioned
- Fix for removing DP_RECOVERED_CLOCK_OUT_EN bit for tps4 link training
  for DP. This was causing link training failures and hence no display
  for a specific DP to HDMI cable on chromebooks
- Fix probe-deferral crash in gpu devfreq
- Fix gpu debugfs deadlock

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGtuY=jd44itwTkLXVqhnoKgY0BswPTrxDTxCiPG3WbmLA@mail.gmail.com
2 years agoMerge tag 'amd-drm-fixes-6.0-2022-08-31' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Thu, 1 Sep 2022 19:56:25 +0000 (05:56 +1000)]
Merge tag 'amd-drm-fixes-6.0-2022-08-31' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.0-2022-08-31:

amdgpu:
- FRU error message fix
- MES 11 updates
- DCN 3.2.x fixes
- DCN 3.1.4 fixes
- Fix possible use after free in CS IOCTL
- SMU 13.0.x fixes
- Fix iolink reporting on devices with direct connections to CPU
- GFX10 tap delay firmware fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220831212312.5921-1-alexander.deucher@amd.com
2 years agoMerge tag 'drm-misc-fixes-2022-08-31' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Thu, 1 Sep 2022 19:34:33 +0000 (05:34 +1000)]
Merge tag 'drm-misc-fixes-2022-08-31' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Short summary of fixes pull:

 * dma-buf/dma-resv: Fence-handling fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Yw+pZnEbPxkJ1nHa@linux-uq9g.fritz.box
2 years agoMerge tag 'net-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 1 Sep 2022 16:20:42 +0000 (09:20 -0700)]
Merge tag 'net-6.0-rc4' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth, bpf and wireless.

  Current release - regressions:

   - bpf:
      - fix wrong last sg check in sk_msg_recvmsg()
      - fix kernel BUG in purge_effective_progs()

   - mac80211:
      - fix possible leak in ieee80211_tx_control_port()
      - potential NULL dereference in ieee80211_tx_control_port()

  Current release - new code bugs:

   - nfp: fix the access to management firmware hanging

  Previous releases - regressions:

   - ip: fix triggering of 'icmp redirect'

   - sched: tbf: don't call qdisc_put() while holding tree lock

   - bpf: fix corrupted packets for XDP_SHARED_UMEM

   - bluetooth: hci_sync: fix suspend performance regression

   - micrel: fix probe failure

  Previous releases - always broken:

   - tcp: make global challenge ack rate limitation per net-ns and
     default disabled

   - tg3: fix potential hang-up on system reboot

   - mac802154: fix reception for no-daddr packets

  Misc:

   - r8152: add PID for the lenovo onelink+ dock"

* tag 'net-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (56 commits)
  net/smc: Remove redundant refcount increase
  Revert "sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb"
  tcp: make global challenge ack rate limitation per net-ns and default disabled
  tcp: annotate data-race around challenge_timestamp
  net: dsa: hellcreek: Print warning only once
  ip: fix triggering of 'icmp redirect'
  sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb
  selftests: net: sort .gitignore file
  Documentation: networking: correct possessive "its"
  kcm: fix strp_init() order and cleanup
  mlxbf_gige: compute MDIO period based on i1clk
  ethernet: rocker: fix sleep in atomic context bug in neigh_timer_handler
  net: lan966x: improve error handle in lan966x_fdma_rx_get_frame()
  nfp: fix the access to management firmware hanging
  net: phy: micrel: Make the GPIO to be non-exclusive
  net: virtio_net: fix notification coalescing comments
  net/sched: fix netdevice reference leaks in attach_default_qdiscs()
  net: sched: tbf: don't call qdisc_put() while holding tree lock
  net: Use u64_stats_fetch_begin_irq() for stats fetch.
  net: dsa: xrs700x: Use irqsave variant for u64 stats update
  ...

2 years agoMerge tag 'slab-for-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka...
Linus Torvalds [Thu, 1 Sep 2022 16:14:56 +0000 (09:14 -0700)]
Merge tag 'slab-for-6.0-rc4' of git://git./linux/kernel/git/vbabka/slab

Pull slab fix from Vlastimil Babka:

 - A fix from Waiman Long to avoid a theoretical deadlock reported by
   lockdep.

* tag 'slab-for-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex/cpu_hotplug_lock

2 years agoMerge tag 'sound-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 1 Sep 2022 16:05:25 +0000 (09:05 -0700)]
Merge tag 'sound-6.0-rc4' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Just handful changes at this time. The only major change is the
  regression fix about the x86 WC-page buffer allocation.

  The rest are trivial data-race fixes for ALSA sequencer core, the
  possible out-of-bounds access fixes in the new ALSA control hash code,
  and a few device-specific workarounds and fixes"

* tag 'sound-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: Add quirk for LH Labs Geek Out HD Audio 1V5
  ALSA: hda/realtek: Add speaker AMP init for Samsung laptops with ALC298
  ALSA: control: Re-order bounds checking in get_ctl_id_hash()
  ALSA: control: Fix an out-of-bounds bug in get_ctl_id_hash()
  ALSA: hda: intel-nhlt: Correct the handling of fmt_config flexible array
  ALSA: seq: Fix data-race at module auto-loading
  ALSA: seq: oss: Fix data-race for max_midi_devs access
  ALSA: memalloc: Revive x86-specific WC page allocations again

2 years agoplatform/x86: p2sb: Fix UAF when caller uses resource name
Andy Shevchenko [Thu, 1 Sep 2022 11:34:06 +0000 (14:34 +0300)]
platform/x86: p2sb: Fix UAF when caller uses resource name

We have to copy only selected fields from the original resource.
Because a PCI device will be removed immediately after getting
its resources, we may not use any allocated data, hence we may
not copy any pointers.

Consider the following scenario:

  1/ a caller of p2sb_bar() gets the resource;

  2/ the resource has been copied by platform_device_add_data()
     in order to create a platform device;

  3/ the platform device creation will call for the device driver's
     ->probe() as soon as a match found;

  4/ the ->probe() takes given resources (see 2/) and tries to
     access one of its field, i.e. 'name', in the
     __devm_ioremap_resource() to create a pretty looking output;

  5/ but the 'name' is a dangling pointer because p2sb_bar()
     removed a PCI device, which 'name' had been copied to
     the caller's memory.

  6/ UAF (Use-After-Free) as a result.

Kudos to Mika for the initial analisys of the issue.

Fixes: 9745fb07474f ("platform/x86/intel: Add Primary to Sideband (P2SB) bridge support")
Reported-by: kernel test robot <oliver.sang@intel.com>
Suggested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Link: https://lore.kernel.org/linux-i2c/YvPCbnKqDiL2XEKp@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/linux-i2c/YtjAswDKfiuDfWYs@xsang-OptiPlex-9020/
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220901113406.65876-1-andriy.shevchenko@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2 years agoplatform/x86: asus-wmi: Increase FAN_CURVE_BUF_LEN to 32
Luke D. Jones [Sun, 28 Aug 2022 07:46:38 +0000 (19:46 +1200)]
platform/x86: asus-wmi: Increase FAN_CURVE_BUF_LEN to 32

Fix for TUF laptops returning with an -ENOSPC on calling
asus_wmi_evaluate_method_buf() when fetching default curves. The TUF method
requires at least 32 bytes space.

This also moves and changes the pr_debug() in fan_curve_check_present() to
pr_warn() in fan_curve_get_factory_default() so that there is at least some
indication in logs of why it fails.

Signed-off-by: Luke D. Jones <luke@ljones.dev>
Link: https://lore.kernel.org/r/20220828074638.5473-1-luke@ljones.dev
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2 years agofirmware_loader: Fix memory leak in firmware upload
Russ Weight [Wed, 31 Aug 2022 00:25:18 +0000 (17:25 -0700)]
firmware_loader: Fix memory leak in firmware upload

In the case of firmware-upload, an instance of struct fw_upload is
allocated in firmware_upload_register(). This data needs to be freed
in fw_dev_release(). Create a new fw_upload_free() function in
sysfs_upload.c to handle the firmware-upload specific memory frees
and incorporate the missing kfree call for the fw_upload structure.

Fixes: 97730bbb242c ("firmware_loader: Add firmware-upload support")
Cc: stable <stable@kernel.org>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220831002518.465274-1-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agofirmware_loader: Fix use-after-free during unregister
Russ Weight [Mon, 29 Aug 2022 17:45:57 +0000 (10:45 -0700)]
firmware_loader: Fix use-after-free during unregister

In the following code within firmware_upload_unregister(), the call to
device_unregister() could result in the dev_release function freeing the
fw_upload_priv structure before it is dereferenced for the call to
module_put(). This bug was found by the kernel test robot using
CONFIG_KASAN while running the firmware selftests.

  device_unregister(&fw_sysfs->dev);
  module_put(fw_upload_priv->module);

The problem is fixed by copying fw_upload_priv->module to a local variable
for use when calling device_unregister().

Fixes: 97730bbb242c ("firmware_loader: Add firmware-upload support")
Cc: stable <stable@kernel.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220829174557.437047-1-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agoselftests/net: return back io_uring zc send tests
Pavel Begunkov [Thu, 1 Sep 2022 10:54:05 +0000 (11:54 +0100)]
selftests/net: return back io_uring zc send tests

Enable io_uring zerocopy send tests back and fix them up to follow the
new inteface.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c8e5018c516093bdad0b6e19f2f9847dea17e4d2.1662027856.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoio_uring/net: simplify zerocopy send user API
Pavel Begunkov [Thu, 1 Sep 2022 10:54:04 +0000 (11:54 +0100)]
io_uring/net: simplify zerocopy send user API

Following user feedback, this patch simplifies zerocopy send API. One of
the main complaints is that the current API is difficult with the
userspace managing notification slots, and then send retries with error
handling make it even worse.

Instead of keeping notification slots change it to the per-request
notifications model, which posts both completion and notification CQEs
for each request when any data has been sent, and only one CQE if it
fails. All notification CQEs will have IORING_CQE_F_NOTIF set and
IORING_CQE_F_MORE in completion CQEs indicates whether to wait a
notification or not.

IOSQE_CQE_SKIP_SUCCESS is disallowed with zerocopy sends for now.

This is less flexible, but greatly simplifies the user API and also the
kernel implementation. We reuse notif helpers in this patch, but in the
future there won't be need for keeping two requests.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/95287640ab98fc9417370afb16e310677c63e6ce.1662027856.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoio_uring/notif: remove notif registration
Pavel Begunkov [Thu, 1 Sep 2022 10:54:03 +0000 (11:54 +0100)]
io_uring/notif: remove notif registration

We're going to remove the userspace exposed zerocopy notification API,
remove notification registration.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6ff00b97be99869c386958a990593c9c31cf105b.1662027856.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoRevert "io_uring: rename IORING_OP_FILES_UPDATE"
Pavel Begunkov [Thu, 1 Sep 2022 10:54:02 +0000 (11:54 +0100)]
Revert "io_uring: rename IORING_OP_FILES_UPDATE"

This reverts commit 4379d5f15b3fd4224c37841029178aa8082a242e.

We removed notification flushing, also cleanup uapi preparation changes
to not pollute it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/89edc3905350f91e1b6e26d9dbf42ee44fd451a2.1662027856.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoRevert "io_uring: add zc notification flush requests"
Pavel Begunkov [Thu, 1 Sep 2022 10:54:01 +0000 (11:54 +0100)]
Revert "io_uring: add zc notification flush requests"

This reverts commit 492dddb4f6e3a5839c27d41ff1fecdbe6c3ab851.

Soon we won't have the very notion of notification flushing, so remove
notification flushing requests.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8850334ca56e65b413cb34fd158db81d7b2865a3.1662027856.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoselftests/net: temporarily disable io_uring zc test
Pavel Begunkov [Thu, 1 Sep 2022 10:54:00 +0000 (11:54 +0100)]
selftests/net: temporarily disable io_uring zc test

We're going to change API, to avoid build problems with a couple of
following commits, disable io_uring testing.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/12b7507223df04fbd12aa05fc0cb544b51d7ed79.1662027856.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoarch_topology: Silence early cacheinfo errors when non-existent
Florian Fainelli [Fri, 5 Aug 2022 23:07:36 +0000 (16:07 -0700)]
arch_topology: Silence early cacheinfo errors when non-existent

Architectures which do not have cacheinfo such as ARM 32-bit would spit
out the following during boot:

 Early cacheinfo failed, ret = -2

Treat -ENOENT specifically to silence this error since it means that the
platform does not support reporting its cache information.

Fixes: 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection in the CPU hotplug path")
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Michael Walle <michael@walle.cc>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220805230736.1562801-1-f.fainelli@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agobinder: fix alloc->vma_vm_mm null-ptr dereference
Carlos Llamas [Mon, 29 Aug 2022 20:12:48 +0000 (20:12 +0000)]
binder: fix alloc->vma_vm_mm null-ptr dereference

Syzbot reported a couple issues introduced by commit 44e602b4e52f
("binder_alloc: add missing mmap_lock calls when using the VMA"), in
which we attempt to acquire the mmap_lock when alloc->vma_vm_mm has not
been initialized yet.

This can happen if a binder_proc receives a transaction without having
previously called mmap() to setup the binder_proc->alloc space in [1].
Also, a similar issue occurs via binder_alloc_print_pages() when we try
to dump the debugfs binder stats file in [2].

Sample of syzbot's crash report:
  ==================================================================
  KASAN: null-ptr-deref in range [0x0000000000000128-0x000000000000012f]
  CPU: 0 PID: 3755 Comm: syz-executor229 Not tainted 6.0.0-rc1-next-20220819-syzkaller #0
  syz-executor229[3755] cmdline: ./syz-executor2294415195
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022
  RIP: 0010:__lock_acquire+0xd83/0x56d0 kernel/locking/lockdep.c:4923
  [...]
  Call Trace:
   <TASK>
   lock_acquire kernel/locking/lockdep.c:5666 [inline]
   lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
   down_read+0x98/0x450 kernel/locking/rwsem.c:1499
   mmap_read_lock include/linux/mmap_lock.h:117 [inline]
   binder_alloc_new_buf_locked drivers/android/binder_alloc.c:405 [inline]
   binder_alloc_new_buf+0xa5/0x19e0 drivers/android/binder_alloc.c:593
   binder_transaction+0x242e/0x9a80 drivers/android/binder.c:3199
   binder_thread_write+0x664/0x3220 drivers/android/binder.c:3986
   binder_ioctl_write_read drivers/android/binder.c:5036 [inline]
   binder_ioctl+0x3470/0x6d00 drivers/android/binder.c:5323
   vfs_ioctl fs/ioctl.c:51 [inline]
   __do_sys_ioctl fs/ioctl.c:870 [inline]
   __se_sys_ioctl fs/ioctl.c:856 [inline]
   __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:856
   do_syscall_x64 arch/x86/entry/common.c:50 [inline]
   do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
   [...]
  ==================================================================

Fix these issues by setting up alloc->vma_vm_mm pointer during open()
and caching directly from current->mm. This guarantees we have a valid
reference to take the mmap_lock during scenarios described above.

[1] https://syzkaller.appspot.com/bug?extid=f7dc54e5be28950ac459
[2] https://syzkaller.appspot.com/bug?extid=a75ebe0452711c9e56d9

Fixes: 44e602b4e52f ("binder_alloc: add missing mmap_lock calls when using the VMA")
Cc: <stable@vger.kernel.org> # v5.15+
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: syzbot+f7dc54e5be28950ac459@syzkaller.appspotmail.com
Reported-by: syzbot+a75ebe0452711c9e56d9@syzkaller.appspotmail.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220829201254.1814484-2-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agomisc: fastrpc: increase maximum session count
Johan Hovold [Mon, 29 Aug 2022 08:05:31 +0000 (10:05 +0200)]
misc: fastrpc: increase maximum session count

The SC8280XP platform uses 14 sessions for the compute DSP so increment
the maximum session count.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20220829080531.29681-4-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agomisc: fastrpc: fix memory corruption on open
Johan Hovold [Mon, 29 Aug 2022 08:05:30 +0000 (10:05 +0200)]
misc: fastrpc: fix memory corruption on open

The probe session-duplication overflow check incremented the session
count also when there were no more available sessions so that memory
beyond the fixed-size slab-allocated session array could be corrupted in
fastrpc_session_alloc() on open().

Fixes: f6f9279f2bf0 ("misc: fastrpc: Add Qualcomm fastrpc basic driver model")
Cc: stable@vger.kernel.org # 5.1
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20220829080531.29681-3-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agomisc: fastrpc: fix memory corruption on probe
Johan Hovold [Mon, 29 Aug 2022 08:05:29 +0000 (10:05 +0200)]
misc: fastrpc: fix memory corruption on probe

Add the missing sanity check on the probed-session count to avoid
corrupting memory beyond the fixed-size slab-allocated session array
when there are more than FASTRPC_MAX_SESSIONS sessions defined in the
devicetree.

Fixes: f6f9279f2bf0 ("misc: fastrpc: Add Qualcomm fastrpc basic driver model")
Cc: stable@vger.kernel.org # 5.1
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20220829080531.29681-2-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agoMerge tag 'nvme-6.0-2022-09-01' of git://git.infradead.org/nvme into block-6.0
Jens Axboe [Thu, 1 Sep 2022 14:11:11 +0000 (08:11 -0600)]
Merge tag 'nvme-6.0-2022-09-01' of git://git.infradead.org/nvme into block-6.0

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 6.0

 - error handling fix for the new auth code (Hannes Reinecke)
 - fix unhandled tcp states in nvmet_tcp_state_change (Maurizio Lombardi)
 - add NVME_QUIRK_BOGUS_NID for Lexar NM610 (Shyamin Ayesh)"

* tag 'nvme-6.0-2022-09-01' of git://git.infradead.org/nvme:
  nvmet-tcp: fix unhandled tcp states in nvmet_tcp_state_change()
  nvmet-auth: add missing goto in nvmet_setup_auth()
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM610

2 years agousb: storage: Add ASUS <0x0b05:0x1932> to IGNORE_UAS
Hu Xiaoying [Thu, 1 Sep 2022 04:57:37 +0000 (12:57 +0800)]
usb: storage: Add ASUS <0x0b05:0x1932> to IGNORE_UAS

USB external storage device(0x0b05:1932), use gnome-disk-utility tools
to test usb write  < 30MB/s.
if does not to load module of uas for this device, can increase the
write speed from 20MB/s to >40MB/s.

Suggested-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Hu Xiaoying <huxiaoying@kylinos.cn>
Link: https://lore.kernel.org/r/20220901045737.3438046-1-huxiaoying@kylinos.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agodriver core: Don't probe devices after bus_type.match() probe deferral
Isaac J. Manjarres [Wed, 17 Aug 2022 18:40:26 +0000 (11:40 -0700)]
driver core: Don't probe devices after bus_type.match() probe deferral

Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.

If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.

If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.

Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan <saravanak@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agoplatform/mellanox: Remove redundant 'NULL' check
Vadim Pasternak [Tue, 23 Aug 2022 20:19:37 +0000 (23:19 +0300)]
platform/mellanox: Remove redundant 'NULL' check

Remove 'NULL' check for 'data->hpdev.client' in error flow of
mlxreg_lc_probe(). It cannot be 'NULL' at this point.

Fixes: b4b830a34d80 ("platform/mellanox: mlxreg-lc: Fix error flow and extend verbosity")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20220823201937.46855-5-vadimp@nvidia.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2 years agoplatform/mellanox: Remove unnecessary code
Vadim Pasternak [Tue, 23 Aug 2022 20:19:36 +0000 (23:19 +0300)]
platform/mellanox: Remove unnecessary code

Remove redundant 'NULL' check for of if 'data->notifier'.

Replace 'return err' by 'return 0' in mlxreg_lc_probe().

Fixes: 62f9529b8d5c87b ("platform/mellanox: mlxreg-lc: Add initial support for Nvidia line card devices")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20220823201937.46855-4-vadimp@nvidia.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2 years agoplatform/mellanox: mlxreg-lc: Fix locking issue
Vadim Pasternak [Tue, 23 Aug 2022 20:19:35 +0000 (23:19 +0300)]
platform/mellanox: mlxreg-lc: Fix locking issue

Fix locking issues:
- mlxreg_lc_state_update() takes a lock when set or clear
  "MLXREG_LC_POWERED".
- All the devices can be deleted before MLXREG_LC_POWERED flag is cleared.

To fix it:
- Add lock() / unlock() at the beginning / end of
  mlxreg_lc_event_handler() and remove locking from
  mlxreg_lc_power_on_off() and mlxreg_lc_enable_disable()
- Add locked version of mlxreg_lc_state_update() -
  mlxreg_lc_state_update_locked() for using outside
  mlxreg_lc_event_handler().

(2) Remove redundant NULL check for of if 'data->notifier'.

Fixes: 62f9529b8d5c87b ("platform/mellanox: mlxreg-lc: Add initial support for Nvidia line card devices")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20220823201937.46855-3-vadimp@nvidia.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2 years agoplatform/mellanox: mlxreg-lc: Fix coverity warning
Vadim Pasternak [Tue, 23 Aug 2022 20:19:34 +0000 (23:19 +0300)]
platform/mellanox: mlxreg-lc: Fix coverity warning

Fix smatch warning:
drivers/platform/mellanox/mlxreg-lc.c:866 mlxreg_lc_probe() warn: passing zero to 'PTR_ERR'
by removing 'err = PTR_ERR(regmap)'.

Fixes: b4b830a34d80 ("platform/mellanox: mlxreg-lc: Fix error flow and extend verbosity")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20220823201937.46855-2-vadimp@nvidia.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2 years agoplatform/x86: acer-wmi: Acer Aspire One AOD270/Packard Bell Dot keymap fixes
Hans de Goede [Mon, 29 Aug 2022 16:35:44 +0000 (18:35 +0200)]
platform/x86: acer-wmi: Acer Aspire One AOD270/Packard Bell Dot keymap fixes

2 keymap fixes for the Acer Aspire One AOD270 and the same hardware
rebranded as Packard Bell Dot SC:

1. The F2 key is marked with a big '?' symbol on the Packard Bell Dot SC,
this sends WMID_HOTKEY_EVENTs with a scancode of 0x27 add a mapping
for this.

2. Scancode 0x61 is KEY_SWITCHVIDEOMODE. Usually this is a duplicate
input event with the "Video Bus" input device events. But on these devices
the "Video Bus" does not send events for this key. Map 0x61 to KEY_UNKNOWN
instead of using KE_IGNORE so that udev/hwdb can override it on these devs.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220829163544.5288-1-hdegoede@redhat.com
2 years agoarm64: mm: Reserve enough pages for the initial ID map
Ard Biesheuvel [Fri, 26 Aug 2022 16:48:00 +0000 (18:48 +0200)]
arm64: mm: Reserve enough pages for the initial ID map

The logic that conditionally allocates one additional page at each
swapper page table level if KASLR is enabled is also applied to the
initial ID map, now that we have started using the same set of macros
to allocate the space for it.

However, the placement of the kernel in physical memory might result in
additional pages being needed at any level, even if KASLR is disabled in
the build. So account for this in the computation.

Fixes: c3cee924bd85 ("arm64: head: cover entire kernel image in initial ID map")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220826164800.2059148-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2 years agoperf/arm_pmu_platform: fix tests for platform_get_irq() failure
Yu Zhe [Thu, 25 Aug 2022 01:18:44 +0000 (09:18 +0800)]
perf/arm_pmu_platform: fix tests for platform_get_irq() failure

The platform_get_irq() returns negative error codes.  It can't actually
return zero.

Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Link: https://lore.kernel.org/r/20220825011844.8536-1-yuzhe@nfschina.com
Signed-off-by: Will Deacon <will@kernel.org>
2 years agoarm64: head: Ignore bogus KASLR displacement on non-relocatable kernels
Ard Biesheuvel [Sat, 27 Aug 2022 07:09:04 +0000 (09:09 +0200)]
arm64: head: Ignore bogus KASLR displacement on non-relocatable kernels

Even non-KASLR kernels can be built as relocatable, to work around
broken bootloaders that violate the rules regarding physical placement
of the kernel image - in this case, the physical offset modulo 2 MiB is
used as the KASLR offset, and all absolute symbol references are fixed
up in the usual way. This workaround is enabled by default.

CONFIG_RELOCATABLE can also be disabled entirely, in which case the
relocation code and the code that captures the offset are omitted from
the build. However, since commit aacd149b6238 ("arm64: head: avoid
relocating the kernel twice for KASLR"), this code got out of sync, and
we still add the offset to the kernel virtual address before populating
the page tables even though we never capture it. This means we add a
bogus value instead, breaking the boot entirely.

Fixes: aacd149b6238 ("arm64: head: avoid relocating the kernel twice for KASLR")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Link: https://lore.kernel.org/r/20220827070904.2216989-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2 years agoarm64/kexec: Fix missing extra range for crashkres_low.
Levi Yun [Wed, 31 Aug 2022 10:39:13 +0000 (19:39 +0900)]
arm64/kexec: Fix missing extra range for crashkres_low.

Like crashk_res, Calling crash_exclude_mem_range function with
crashk_low_res area would need extra crash_mem range too.

Add one more extra cmem slot in case of crashk_low_res is used.

Signed-off-by: Levi Yun <ppbuk5246@gmail.com>
Fixes: 944a45abfabc ("arm64: kdump: Reimplement crashkernel=X")
Cc: <stable@vger.kernel.org> # 5.19.x
Acked-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220831103913.12661-1-ppbuk5246@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2 years agomm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex...
Waiman Long [Fri, 12 Aug 2022 18:30:33 +0000 (14:30 -0400)]
mm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex/cpu_hotplug_lock

A circular locking problem is reported by lockdep due to the following
circular locking dependency.

  +--> cpu_hotplug_lock --> slab_mutex --> kn->active --+
  |                                                     |
  +-----------------------------------------------------+

The forward cpu_hotplug_lock ==> slab_mutex ==> kn->active dependency
happens in

  kmem_cache_destroy(): cpus_read_lock(); mutex_lock(&slab_mutex);
  ==> sysfs_slab_unlink()
      ==> kobject_del()
          ==> kernfs_remove()
      ==> __kernfs_remove()
          ==> kernfs_drain(): rwsem_acquire(&kn->dep_map, ...);

The backward kn->active ==> cpu_hotplug_lock dependency happens in

  kernfs_fop_write_iter(): kernfs_get_active();
  ==> slab_attr_store()
      ==> cpu_partial_store()
          ==> flush_all(): cpus_read_lock()

One way to break this circular locking chain is to avoid holding
cpu_hotplug_lock and slab_mutex while deleting the kobject in
sysfs_slab_unlink() which should be equivalent to doing a write_lock
and write_unlock pair of the kn->active virtual lock.

Since the kobject structures are not protected by slab_mutex or the
cpu_hotplug_lock, we can certainly release those locks before doing
the delete operation.

Move sysfs_slab_unlink() and sysfs_slab_release() to the newly
created kmem_cache_release() and call it outside the slab_mutex &
cpu_hotplug_lock critical sections. There will be a slight delay
in the deletion of sysfs files if kmem_cache_release() is called
indirectly from a work function.

Fixes: 5a836bf6b09f ("mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context")
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: David Rientjes <rientjes@google.com>
Link: https://lore.kernel.org/all/YwOImVd+nRUsSAga@hyeyoo/
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>