Nikolay Borisov [Wed, 3 Jun 2020 05:55:22 +0000 (08:55 +0300)]
btrfs: make cow_file_range_async take btrfs_inode
It only uses vfs inode for assigning it to the async_chunk function.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:21 +0000 (08:55 +0300)]
btrfs: make run_delalloc_nocow take btrfs_inode
It only really uses btrfs_inode.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:20 +0000 (08:55 +0300)]
btrfs: make fallback_to_cow take btrfs_inode
It really wants btrfs_inode and is prepration to converting
run_delalloc_nocow to taking btrfs_inode.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:19 +0000 (08:55 +0300)]
btrfs: make insert_reserved_file_extent take btrfs_inode
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>c
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:18 +0000 (08:55 +0300)]
btrfs: make btrfs_qgroup_release_data take btrfs_inode
It just forwards its argument to __btrfs_qgroup_release_data.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:17 +0000 (08:55 +0300)]
btrfs: make submit_compressed_extents take btrfs_inode
All but 3 uses require vfs_inode so convert the logic to have
btrfs_inode be the main inode struct.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:16 +0000 (08:55 +0300)]
btrfs: make btrfs_submit_compressed_write take btrfs_inode
Majority of its uses are for btrfs_inode so take it as an argument
directly.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:15 +0000 (08:55 +0300)]
btrfs: make btrfs_add_ordered_extent_compress take btrfs_inode
It simpy forwards its inode argument to __btrfs_add_ordered_extent which
already takes btrfs_inode.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:14 +0000 (08:55 +0300)]
btrfs: make cow_file_range take btrfs_inode
All its children functions take btrfs_inode so convert it to taking
btrfs_inode.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:13 +0000 (08:55 +0300)]
btrfs: make btrfs_add_ordered_extent take btrfs_inode
Preparation to converting its callers to taking btrfs_inode.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:12 +0000 (08:55 +0300)]
btrfs: make cow_file_range_inline take btrfs_inode
It has only 2 uses for the vfs_inode - insert_inline_extent and
i_size_read. On the flipside it will allow converting its callers to
btrfs_inode, so convert it to taking btrfs_inode.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:11 +0000 (08:55 +0300)]
btrfs: make btrfs_qgroup_free_data take btrfs_inode
It passes btrfs_inode to its callee so change the interface.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:10 +0000 (08:55 +0300)]
btrfs: make __btrfs_qgroup_release_data take btrfs_inode
It uses vfs_inode only for a tracepoint so convert its interface to take
btrfs_inode directly.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:09 +0000 (08:55 +0300)]
btrfs: make qgroup_free_reserved_data take btrfs_inode
It only uses btrfs_inode so can just as easily take it as an argument.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Fri, 19 Jun 2020 12:24:51 +0000 (15:24 +0300)]
btrfs: tracepoints: convert flush states to using EM macros
Only 6 out of all flush states were being printed correctly since
only they were exported via the TRACE_DEFINE_ENUM macro. This patch
converts all flush states to use the newly introduced EM macro so that
they can all be printed correctly.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Fri, 19 Jun 2020 12:24:50 +0000 (15:24 +0300)]
btrfs: tracepoints: switch extent_io_tree_owner to using EM macro
This fixes correct pint out of the extent io tree owner in
btrfs_set_extent_bit/btrfs_clear_extent_bit/btrfs_convert_extent_bit
tracepoints.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Fri, 19 Jun 2020 12:24:49 +0000 (15:24 +0300)]
btrfs: tracepoints: fix qgroup reservation type printing
Since qgroup's reservation types are define in a macro they must be
exported to user space in order for user space tools to convert raw
binary data to symbolic names. Currently trace-cmd report produces
the following output:
kworker/u8:2-459 [003] 1208.543587: qgroup_update_reserve:
2b742cae-e0e5-4def-9ef7-
28a9b34a951e: qgid=5 type=0x2 cur_reserved=
54870016 diff=-32768
With this fix the output is:
kworker/u8:2-459 [003] 1208.543587: qgroup_update_reserve:
2b742cae-e0e5-4def-9ef7-
28a9b34a951e: qgid=5 type=BTRFS_QGROUP_RSV_META_PREALLOC cur_reserved=
54870016 diff=-32768
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Fri, 19 Jun 2020 12:24:48 +0000 (15:24 +0300)]
btrfs: tracepoints: move FLUSH_ACTIONS define
Since all enums used in btrfs' tracepoints are going to be redefined
to allow proper parsing of their values by userspace tools let's
rearrange when they are defined. This will allow to use only a single
set of #define EM/#undef EM sequence. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Fri, 19 Jun 2020 12:24:47 +0000 (15:24 +0300)]
btrfs: tracepoints: fix extent type symbolic name print
extent's type is an enum and this requires that the enum values be
exported to user space so that user space tools can correctly map raw
binary data to the symbolic name. Currently tracepoints using
btrfs__file_extent_item_regular or btrfs__file_extent_item_inline result
in the following output:
fio-443 [002] 586.609450: btrfs_get_extent_show_fi_regular:
f0c3bf8e-0174-4bcc-92aa-
6c2d62430420:i
root=5(FS_TREE) inode=258 size=
2136457216 disk_isize=0
file extent range=[
2126946304 2136457216] (num_bytes=9510912
ram_bytes=9510912 disk_bytenr=0 disk_num_bytes=0 extent_offset=0
type=0x1 compression=0
E.g type is 0x1 . With this patch applie the output is:
<ommitted for brevity> disk_bytenr=
141348864 disk_num_bytes=4096 extent_offset=0 type=REG compression=0
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Fri, 19 Jun 2020 12:24:46 +0000 (15:24 +0300)]
btrfs: tracepoints: fix btrfs_trigger_flush symbolic string for flags
When tracepoints use __print_symbolic to print textual representation of
a value that comes from an ENUM each enum value needs to be exported
to user space so that user space tools can convert the binary value
data to the trings as user space does not know what those enums are
about.
Doing a trace-cmd record && trace-cmd report currently results in:
kworker/u8:1-61 [000] 66.299527:
btrfs_flush_space:
5302ee13-c65e-45bb-98ef-
8fe3835bd943:
state=3(0x3) flags=4(METADATA) num_bytes=2621440 ret=0
I.e state is not translated to its symbolic counterpart. With this patch
applied the output is:
fio-370 [002] 56.762402: btrfs_trigger_flush:
d04cd7ac-38e2-452f-a7f5-
8157529fd5f0:
preempt: flush=3(BTRFS_RESERVE_FLUSH_ALL) flags=4(METADATA) bytes=655360
See also
190f0b76ca49 ("mm: tracing: Export enums in tracepoints to user
space").
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Thu, 25 Jun 2020 10:35:28 +0000 (12:35 +0200)]
btrfs: allow use of global block reserve for balance item deletion
On a filesystem with exhausted metadata, but still enough to start
balance, it's possible to hit this error:
[324402.053842] BTRFS info (device loop0): 1 enospc errors during balance
[324402.060769] BTRFS info (device loop0): balance: ended with status: -28
[324402.172295] BTRFS: error (device loop0) in reset_balance_state:3321: errno=-28 No space left
It fails inside reset_balance_state and turns the filesystem to
read-only, which is unnecessary and should be fixed too, but the problem
is caused by lack for space when the balance item is deleted. This is a
one-time operation and from the same rank as unlink that is allowed to
use the global block reserve. So do the same for the balance item.
Status of the filesystem (100GiB) just after the balance fails:
$ btrfs fi df mnt
Data, single: total=80.01GiB, used=38.58GiB
System, single: total=4.00MiB, used=16.00KiB
Metadata, single: total=19.99GiB, used=19.48GiB
GlobalReserve, single: total=512.00MiB, used=50.11MiB
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo [Tue, 23 Jun 2020 23:23:52 +0000 (07:23 +0800)]
btrfs: refactor btrfs_check_can_nocow() into two variants
The function btrfs_check_can_nocow() now has two completely different
call patterns.
For nowait variant, callers don't need to do any cleanup. While for
wait variant, callers need to release the lock if they can do nocow
write.
This is somehow confusing, and is already a problem for the exported
btrfs_check_can_nocow().
So this patch will separate the different patterns into different
functions.
For nowait variant, the function will be called check_nocow_nolock().
For wait variant, the function pair will be btrfs_check_nocow_lock()
btrfs_check_nocow_unlock().
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo [Tue, 23 Jun 2020 23:23:51 +0000 (07:23 +0800)]
btrfs: add comments for btrfs_check_can_nocow() and can_nocow_extent()
These two functions have extra conditions that their callers need to
meet, and some not-that-common parameters used for return value.
So adding some comments may save reviewers some time.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo [Tue, 23 Jun 2020 23:23:50 +0000 (07:23 +0800)]
btrfs: allow btrfs_truncate_block() to fallback to nocow for data space reservation
[BUG]
When the data space is exhausted, even if the inode has NOCOW attribute,
we will still refuse to truncate unaligned range due to ENOSPC.
The following script can reproduce it pretty easily:
#!/bin/bash
dev=/dev/test/test
mnt=/mnt/btrfs
umount $dev &> /dev/null
umount $mnt &> /dev/null
mkfs.btrfs -f $dev -b 1G
mount -o nospace_cache $dev $mnt
touch $mnt/foobar
chattr +C $mnt/foobar
xfs_io -f -c "pwrite -b 4k 0 4k" $mnt/foobar > /dev/null
xfs_io -f -c "pwrite -b 4k 0 1G" $mnt/padding &> /dev/null
sync
xfs_io -c "fpunch 0 2k" $mnt/foobar
umount $mnt
Currently this will fail at the fpunch part.
[CAUSE]
Because btrfs_truncate_block() always reserves space without checking
the NOCOW attribute.
Since the writeback path follows NOCOW bit, we only need to bother the
space reservation code in btrfs_truncate_block().
[FIX]
Make btrfs_truncate_block() follow btrfs_buffered_write() to try to
reserve data space first, and fall back to NOCOW check only when we
don't have enough space.
Such always-try-reserve is an optimization introduced in
btrfs_buffered_write(), to avoid expensive btrfs_check_can_nocow() call.
This patch will export check_can_nocow() as btrfs_check_can_nocow(), and
use it in btrfs_truncate_block() to fix the problem.
Reported-by: Martin Doucha <martin.doucha@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Thu, 18 Jun 2020 12:54:56 +0000 (14:54 +0200)]
btrfs: start deprecation of mount option inode_cache
Estimated time of removal of the functionality is 5.11, the option will
be still parsed but will have no effect.
Reasons for deprecation and removal:
- very poor naming choice of the mount option, it's supposed to cache
and reuse the inode _numbers_, but it sounds a some generic cache for
inodes
- the only known usecase where this option would make sense is on a
32bit architecture where inode numbers in one subvolume would be
exhausted due to 32bit inode::i_ino
- the cache is stored on disk, consumes space, needs to be loaded and
written back
- new inode number allocation is slower due to lookups into the cache
(compared to a simple increment which is the default)
- uses the free-space-cache code that is going to be deprecated as well
in the future
Known problems:
- since 2011, returning EEXIST when there's not enough space in a page
to store all checksums, see commit
4b9465cb9e38 ("Btrfs: add mount -o
inode_cache")
Remaining issues:
- if the option was enabled, new inodes created, the option disabled
again, the cache is still stored on the devices and there's currently
no way to remove it
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Tue, 23 Jun 2020 19:23:54 +0000 (21:23 +0200)]
btrfs: remove unused btrfs_root::defrag_trans_start
Last touched in 2013 by commit
de78b51a2852 ("btrfs: remove cache only
arguments from defrag path") that was the only code that used the value.
Now it's only set but never used for anything, so we can remove it.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Tue, 23 Jun 2020 18:56:12 +0000 (20:56 +0200)]
btrfs: don't use UAPI types for fiemap callback
The fiemap callback is not part of UAPI interface and the prototypes
don't have the __u64 types either.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Denis Efremov [Mon, 22 Jun 2020 20:18:41 +0000 (23:18 +0300)]
btrfs: tests: remove if duplicate in __check_free_space_extents()
num_extents is already checked in the next if condition and can
be safely removed.
Signed-off-by: Denis Efremov <efremov@linux.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Johannes Thumshirn [Tue, 23 Jun 2020 08:40:07 +0000 (17:40 +0900)]
btrfs: use free_root_extent_buffer to free root
In btrfs_put_root() we're freeing a btrfs_root's 'node' and 'commit_root'
extent buffers manually via kfree(), while we're using
free_root_extent_buffers() in the free_root_pointers() function above.
free_root_extent_buffers() also NULLs the pointers after freeing, which
mitigates potential double frees.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 17 Jun 2020 09:10:44 +0000 (12:10 +0300)]
btrfs: use for loop in prealloc_file_extent_cluster
This function iterates all extents in the extent cluster, make this
intention obvious by using a for loop. No functional chanes.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 17 Jun 2020 09:10:43 +0000 (12:10 +0300)]
btrfs: perform data management operations outside of inode lock
btrfs_alloc_data_chunk_ondemand and btrfs_free_reserved_data_space_noquota
don't really use the guts of the inodes being passed to them. This
implies it's not required to call them under extent lock. Move code
around in prealloc_file_extent_cluster to do the heavy, data alloc/free
operations outside of the lock. This also makes the 'out' label
unnecessary, so remove it.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 17 Jun 2020 09:10:42 +0000 (12:10 +0300)]
btrfs: remove hole check in prealloc_file_extent_cluster
Extents in the extent cluster are guaranteed to be contiguous as such
the hole check inside the loop can never trigger. In fact this check was
never functional since it was added in
18513091af94 ("btrfs: update
btrfs_space_info's bytes_may_use timely") which came after the commit
introducing clustered/contiguous extents
0257bb82d21b ("Btrfs: relocate
file extents in clusters").
Let's just remove it as it adds noise to the source.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:08 +0000 (08:55 +0300)]
btrfs: make __btrfs_drop_extents take btrfs_inode
It has only 4 uses of a vfs_inode for inode_sub_bytes but unifies the
interface with the non __ prefixed version. Will also makes converting
its callers to btrfs_inode easier.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:07 +0000 (08:55 +0300)]
btrfs: make btrfs_csum_one_bio takae btrfs_inode
Will enable converting btrfs_submit_compressed_write to btrfs_inode more
easily.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:06 +0000 (08:55 +0300)]
btrfs: make extent_clear_unlock_delalloc take btrfs_inode
It has one VFS and 1 btrfs inode usages but converting it to btrfs_inode
interface will allow seamless conversion of its callers.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:05 +0000 (08:55 +0300)]
btrfs: make create_io_em take btrfs_inode
It really wants a btrfs_inode and will allow submit_compressed_extents
to be completely converted to btrfs_inode in follow up patches.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:04 +0000 (08:55 +0300)]
btrfs: make btrfs_reloc_clone_csums take btrfs_inode
It really wants btrfs_inode and not a vfs inode.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:03 +0000 (08:55 +0300)]
btrfs: make btrfs_lookup_ordered_extent take btrfs_inode
It doesn't use the generic vfs inode for anything use btrfs_inode
directly.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:02 +0000 (08:55 +0300)]
btrfs: make get_extent_allocation_hint take btrfs_inode
It doesn't use the vfs inode for anything, can just as easily take
btrfs_inode. Follow up patches will convert callers as well.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Wed, 3 Jun 2020 05:55:01 +0000 (08:55 +0300)]
btrfs: make __btrfs_add_ordered_extent take struct btrfs_inode
This is internal btrfs function what really needs the vfs_inode only for
igrab and a tracepoint.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Mon, 15 Jun 2020 09:36:58 +0000 (10:36 +0100)]
btrfs: remove no longer used trans_list member of struct btrfs_ordered_extent
The 'trans_list' member of an ordered extent was used to keep track of the
ordered extents for which a transaction commit had to wait. These were
ordered extents that were started and logged by an fsync. However we don't
do that anymore and before we stopped doing it we changed the approach to
wait for the ordered extents in commit
161c3549b45aee ("Btrfs: change how
we wait for pending ordered extents"), which stopped using that list and
therefore the 'trans_list' member is not used anymore since that commit.
So just remove it since it's doing nothing and making each ordered extent
structure waste memory (2 pointers).
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Mon, 15 Jun 2020 09:36:48 +0000 (10:36 +0100)]
btrfs: remove no longer used log_list member of struct btrfs_ordered_extent
The 'log_list' member of an ordered extent was used keep track of which
ordered extents we needed to wait after logging metadata, but is not used
anymore since commit
5636cf7d6dc86f ("btrfs: remove the logged extents
infrastructure"), as we now always wait on ordered extent completion
before logging metadata. So just remove it since it's doing nothing and
making each ordered extent structure waste more memory (2 pointers).
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Mon, 8 Jun 2020 14:06:07 +0000 (16:06 +0200)]
btrfs: add little-endian optimized key helpers
The CPU and on-disk keys are mapped to two different structures because
of the endianness. There's an intermediate buffer used to do the
conversion, but this is not necessary when CPU and on-disk endianness
match.
Add optimized versions of helpers that take disk_key and use the buffer
directly for CPU keys or drop the intermediate buffer and conversion.
This saves a lot of stack space accross many functions and removes about
6K of generated binary code:
text data bss dec hex filename
1090439 17468 14912 1122819 112203 pre/btrfs.ko
1084613 17456 14912 1116981 110b35 post/btrfs.ko
Delta: -5826
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo [Wed, 10 Jun 2020 01:04:44 +0000 (09:04 +0800)]
btrfs: qgroup: catch reserved space leaks at unmount time
Before this patch, qgroup completely relies on per-inode extent io tree
to detect reserved data space leak.
However previous bug has already shown how release page before
btrfs_finish_ordered_io() could lead to leak, and since it's
QGROUP_RESERVED bit cleared without triggering qgroup rsv, it can't be
detected by per-inode extent io tree.
So this patch adds another (and hopefully the final) safety net to catch
qgroup data reserved space leak. At least the new safety net catches
all the leaks during development, so it should be pretty useful in the
real world.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo [Wed, 10 Jun 2020 01:04:43 +0000 (09:04 +0800)]
btrfs: change timing for qgroup reserved space for ordered extents to fix reserved space leak
[BUG]
The following simple workload from fsstress can lead to qgroup reserved
data space leak:
0/0: creat f0 x:0 0 0
0/0: creat add id=0,parent=-1
0/1: write f0[259 1 0 0 0 0] [600030,27288] 0
0/4: dwrite - xfsctl(XFS_IOC_DIOINFO) f0[259 1 0 0 64 627318] return 25, fallback to stat()
0/4: dwrite f0[259 1 0 0 64 627318] [610304,106496] 0
This would cause btrfs qgroup to leak 20480 bytes for data reserved
space. If btrfs qgroup limit is enabled, such leak can lead to
unexpected early EDQUOT and unusable space.
[CAUSE]
When doing direct IO, kernel will try to writeback existing buffered
page cache, then invalidate them:
generic_file_direct_write()
|- filemap_write_and_wait_range();
|- invalidate_inode_pages2_range();
However for btrfs, the bi_end_io hook doesn't finish all its heavy work
right after bio ends. In fact, it delays its work further:
submit_extent_page(end_io_func=end_bio_extent_writepage);
end_bio_extent_writepage()
|- btrfs_writepage_endio_finish_ordered()
|- btrfs_init_work(finish_ordered_fn);
<<< Work queue execution >>>
finish_ordered_fn()
|- btrfs_finish_ordered_io();
|- Clear qgroup bits
This means, when filemap_write_and_wait_range() returns,
btrfs_finish_ordered_io() is not guaranteed to be executed, thus the
qgroup bits for related range are not cleared.
Now into how the leak happens, this will only focus on the overlapping
part of buffered and direct IO part.
1. After buffered write
The inode had the following range with QGROUP_RESERVED bit:
596 616K
|///////////////|
Qgroup reserved data space: 20K
2. Writeback part for range [596K, 616K)
Write back finished, but btrfs_finish_ordered_io() not get called
yet.
So we still have:
596K 616K
|///////////////|
Qgroup reserved data space: 20K
3. Pages for range [596K, 616K) get released
This will clear all qgroup bits, but don't update the reserved data
space.
So we have:
596K 616K
| |
Qgroup reserved data space: 20K
That number doesn't match the qgroup bit range anymore.
4. Dio prepare space for range [596K, 700K)
Qgroup reserved data space for that range, we got:
596K 616K 700K
|///////////////|///////////////////////|
Qgroup reserved data space: 20K + 104K = 124K
5. btrfs_finish_ordered_range() gets executed for range [596K, 616K)
Qgroup free reserved space for that range, we got:
596K 616K 700K
| |///////////////////////|
We need to free that range of reserved space.
Qgroup reserved data space: 124K - 20K = 104K
6. btrfs_finish_ordered_range() gets executed for range [596K, 700K)
However qgroup bit for range [596K, 616K) is already cleared in
previous step, so we only free 84K for qgroup reserved space.
596K 616K 700K
| | |
We need to free that range of reserved space.
Qgroup reserved data space: 104K - 84K = 20K
Now there is no way to release that 20K unless disabling qgroup or
unmounting the fs.
[FIX]
This patch will change the timing of btrfs_qgroup_release/free_data()
call. Here it uses buffered COW write as an example.
The new timing | The old timing
----------------------------------------+---------------------------------------
btrfs_buffered_write() | btrfs_buffered_write()
|- btrfs_qgroup_reserve_data() | |- btrfs_qgroup_reserve_data()
|
btrfs_run_delalloc_range() | btrfs_run_delalloc_range()
|- btrfs_add_ordered_extent() |
|- btrfs_qgroup_release_data() |
The reserved is passed into |
btrfs_ordered_extent structure |
|
btrfs_finish_ordered_io() | btrfs_finish_ordered_io()
|- The reserved space is passed to | |- btrfs_qgroup_release_data()
btrfs_qgroup_record | The resereved space is passed
| to btrfs_qgroup_recrod
|
btrfs_qgroup_account_extents() | btrfs_qgroup_account_extents()
|- btrfs_qgroup_free_refroot() | |- btrfs_qgroup_free_refroot()
The point of such change is to ensure, when ordered extents are
submitted, the qgroup reserved space is already released, to keep the
timing aligned with file_write_and_wait_range().
So that qgroup data reserved space is all bound to btrfs_ordered_extent
and solve the timing mismatch.
Fixes:
f695fdcef83a ("btrfs: qgroup: Introduce functions to release/free qgroup reserve data space")
Suggested-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo [Wed, 10 Jun 2020 01:04:42 +0000 (09:04 +0800)]
btrfs: file: reserve qgroup space after the hole punch range is locked
The incoming qgroup reserved space timing will move the data reservation
to ordered extent completely.
However in btrfs_punch_hole_lock_range() will call
btrfs_invalidate_page(), which will clear QGROUP_RESERVED bit for the
range.
In current stage it's OK, but if we're making ordered extents handle the
reserved space, then btrfs_punch_hole_lock_range() can clear the
QGROUP_RESERVED bit before we submit ordered extent, leading to qgroup
reserved space leakage.
So here change the timing to make reserve data space after
btrfs_punch_hole_lock_range().
The new timing is fine for either current code or the new code.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo [Wed, 10 Jun 2020 01:04:41 +0000 (09:04 +0800)]
btrfs: inode: move qgroup reserved space release to the callers of insert_reserved_file_extent()
This is to prepare for the incoming timing change of qgroup reserved
data space and ordered extent.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo [Wed, 10 Jun 2020 01:04:40 +0000 (09:04 +0800)]
btrfs: inode: refactor the parameters of insert_reserved_file_extent()
Function insert_reserved_file_extent() takes a long list of parameters,
which are all for btrfs_file_extent_item, even including two reserved
members, encryption and other_encoding.
This makes the parameter list unnecessary long for a function which only
gets called twice.
This patch will refactor the parameter list, by using
btrfs_file_extent_item as parameter directly to hugely reduce the number
of parameters.
Also, since there are only two callers, one in btrfs_finish_ordered_io()
which inserts file extent for ordered extent, and one
__btrfs_prealloc_file_range().
These two call sites have completely different context, where ordered
extent can be compressed, but will always be regular extent, while the
preallocated one is never going to be compressed and always has PREALLOC
type.
So use two small wrapper for these two different call sites to improve
readability.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Fri, 29 May 2020 14:20:35 +0000 (16:20 +0200)]
btrfs: scrub: clean up temporary page variables in scrub_checksum_tree_block
Add proper variable for the scrub page and use it instead of repeatedly
dereferencing the other structures.
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Fri, 29 May 2020 13:54:41 +0000 (15:54 +0200)]
btrfs: scrub: simplify tree block checksum calculation
Use a simpler iteration over tree block pages, same what csum_tree_block
does: first page always exists, loop over the rest.
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Fri, 29 May 2020 14:20:35 +0000 (16:20 +0200)]
btrfs: scrub: clean up temporary page variables in scrub_checksum_data
Add proper variable for the scrub page and use it instead of repeatedly
dereferencing the other structures.
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Fri, 29 May 2020 13:54:41 +0000 (15:54 +0200)]
btrfs: scrub: simplify data block checksum calculation
We have sectorsize same as PAGE_SIZE, the checksum can be calculated in
one go.
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Fri, 29 May 2020 13:47:05 +0000 (15:47 +0200)]
btrfs: scrub: clean up temporary page variables in scrub_checksum_super
Add proper variable for the scrub page and use it instead of repeatedly
dereferencing the other structures.
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Fri, 29 May 2020 13:43:14 +0000 (15:43 +0200)]
btrfs: scrub: remove temporary csum array in scrub_checksum_super
The page contents with the checksum is available during the entire
function so we don't need to make a copy.
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Fri, 29 May 2020 13:40:36 +0000 (15:40 +0200)]
btrfs: scrub: simplify superblock checksum calculation
BTRFS_SUPER_INFO_SIZE is 4096, and fits to a page on all supported
architectures, so we can calculate the checksum in one go.
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Fri, 29 May 2020 13:32:51 +0000 (15:32 +0200)]
btrfs: scrub: unify naming of page address variables
As the page mapping has been removed, rename the variables to 'kaddr'
that we use everywhere else. The type is changed to 'char *' so pointer
arithmetic works without casts.
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Fri, 29 May 2020 13:26:07 +0000 (15:26 +0200)]
btrfs: scrub: remove kmap/kunmap of pages
All pages that scrub uses in the scrub_block::pagev array are allocated
with GFP_KERNEL and never part of any mapping, so kmap is not necessary,
we only need to know the page address.
In scrub_write_page_to_dev_replace we don't even need to call
flush_dcache_page because of the same reason as above.
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo [Thu, 4 Jun 2020 07:18:06 +0000 (15:18 +0800)]
btrfs: introduce "rescue=" mount option
This patch introduces a new "rescue=" mount option group for all mount
options for data recovery.
Different rescue sub options are seperated by ':'. E.g
"ro,rescue=nologreplay:usebackuproot".
The original plan was to use ';', but ';' needs to be escaped/quoted,
or it will be interpreted by bash, similar to '|'.
And obviously, user can specify rescue options one by one like:
"ro,rescue=nologreplay,rescue=usebackuproot".
The following mount options are converted to "rescue=", old mount
options are deprecated but still available for compatibility purpose:
- usebackuproot
Now it's "rescue=usebackuproot"
- nologreplay
Now it's "rescue=nologreplay"
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Tue, 9 Jun 2020 10:19:42 +0000 (11:19 +0100)]
btrfs: use btrfs_alloc_data_chunk_ondemand() when allocating space for relocation
We currently use btrfs_check_data_free_space() when allocating space for
relocating data extents, but that is not necessary because that function
combines btrfs_alloc_data_chunk_ondemand(), which does the actual space
reservation, and btrfs_qgroup_reserve_data().
We can use btrfs_alloc_data_chunk_ondemand() directly because we know we
do not need to reserve qgroup space since we are dealing with a relocation
tree, which can never have qgroups (btrfs_qgroup_reserve_data() does
nothing as is_fstree() returns false for a relocation tree).
Conversely we can use btrfs_free_reserved_data_space_noquota() directly
instead of btrfs_free_reserved_data_space(), since we had no qgroup
reservation when allocating space.
This change is preparatory work for another patch in this series that
makes relocation reserve the exact amount of space it needs to relocate
a data block group. The function btrfs_check_data_free_space() has
the incovenient of requiring a start offset argument and we will want to
be able to allocate space for multiple ranges, which are not consecutive,
at once.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Tue, 9 Jun 2020 10:19:33 +0000 (11:19 +0100)]
btrfs: remove the start argument from btrfs_free_reserved_data_space_noquota()
The start argument for btrfs_free_reserved_data_space_noquota() is only
used to make sure the amount of bytes we decrement from the bytes_may_use
counter of the data space_info object is aligned to the filesystem's
sector size. It serves no other purpose.
All its current callers always pass a length argument that is already
aligned to the sector size, so we can make the start argument go away.
In fact its presence makes it impossible to use it in a context where we
just want to free a number of bytes for a range for which either we do
not know its start offset or for freeing multiple ranges at once (which
are not contiguous).
This change is preparatory work for a patch (third patch in this series)
that makes relocation of data block groups that are not full reserve less
data space.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Liao Pingfang [Thu, 11 Jun 2020 00:40:36 +0000 (08:40 +0800)]
btrfs: check-integrity: remove unnecessary failure messages during memory allocation
As there is a dump_stack() done on memory allocation failures, these
messages might as well be deleted instead.
Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn>
Reviewed-by: David Sterba <dsterba@suse.com>
[ minor tweaks ]
Signed-off-by: David Sterba <dsterba@suse.com>
Anand Jain [Wed, 3 Jun 2020 10:10:20 +0000 (18:10 +0800)]
btrfs: use helper btrfs_get_block_group
Use the helper function where it is open coded to increment the
block_group reference count As btrfs_get_block_group() is a one-liner we
could have open-coded it, but its partner function
btrfs_put_block_group() isn't one-liner which does the free part in it.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Anand Jain [Wed, 3 Jun 2020 10:10:18 +0000 (18:10 +0800)]
btrfs: let btrfs_return_cluster_to_free_space() return void
__btrfs_return_cluster_to_free_space() returns only 0. And all its
parent functions don't need the return value either so make this a void
function.
Further, as none of the callers of btrfs_return_cluster_to_free_space()
is actually using the return from this function, make this function also
return void.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Mon, 1 Jun 2020 18:12:27 +0000 (19:12 +0100)]
btrfs: remove no longer necessary chunk mutex locking cases
Initially when the 'removed' flag was added to a block group to avoid
races between block group removal and fitrim, by commit
04216820fe83d5
("Btrfs: fix race between fs trimming and block group remove/allocation"),
we had to lock the chunks mutex because we could be moving the block
group from its current list, the pending chunks list, into the pinned
chunks list, or we could just be adding it to the pinned chunks if it was
not in the pending chunks list. Both lists were protected by the chunk
mutex.
However we no longer have those lists since commit
1c11b63eff2a67
("btrfs: replace pending/pinned chunks lists with io tree"), and locking
the chunk mutex is no longer necessary because of that. The same happens
at btrfs_unfreeze_block_group(), we lock the chunk mutex because the block
group's extent map could be part of the pinned chunks list and the call
to remove_extent_mapping() could be deleting it from that list, which
used to be protected by that mutex.
So just remove those lock and unlock calls as they are not needed anymore.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Johannes Thumshirn [Tue, 2 Jun 2020 10:05:57 +0000 (19:05 +0900)]
btrfs: factor out reading of bg from find_frist_block_group
When find_first_block_group() finds a block group item in the extent-tree,
it does a lookup of the object in the extent mapping tree and does further
checks on the item.
Factor out this step from find_first_block_group() so we can further
simplify the code.
While we're at it, we can also just return early in
find_first_block_group(), if the tree slot isn't found.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Johannes Thumshirn [Tue, 2 Jun 2020 10:05:56 +0000 (19:05 +0900)]
btrfs: get mapping tree directly from fsinfo in find_first_block_group
We already have an fs_info in our function parameters, there's no need
to do the maths again and get fs_info from the extent_root just to get
the mapping_tree.
Instead directly grab the mapping_tree from fs_info.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Fri, 3 Apr 2020 13:40:35 +0000 (16:40 +0300)]
btrfs: simplify checks when adding excluded ranges
Adresses held in 'logical' array are always guaranteed to fall within
the boundaries of the block group. That is, 'start' can never be
smaller than cache->start. This invariant follows from the way the
address are calculated in btrfs_rmap_block:
stripe_nr = physical - map->stripes[i].physical;
stripe_nr = div64_u64(stripe_nr, map->stripe_len);
bytenr = chunk_start + stripe_nr * io_stripe_size;
I.e it's always some IO stripe within the given chunk.
Exploit this invariant to simplify the body of the loop by removing the
unnecessary 'if' since its 'else' part is the one always executed.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Fri, 3 Apr 2020 13:40:34 +0000 (16:40 +0300)]
btrfs: read stripe len directly in btrfs_rmap_block
extent_map::orig_block_len contains the size of a physical stripe when
it's used to describe block groups (calculated in read_one_chunk via
calc_stripe_length or calculated in decide_stripe_size and then assigned
to extent_map::orig_block_len in create_chunk). Exploit this fact to get
the size directly rather than opencoding the calculations. No functional
changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Thu, 28 May 2020 08:05:13 +0000 (11:05 +0300)]
btrfs: don't balance btree inode pages from buffered write path
The call to btrfs_btree_balance_dirty has been there since the early
days of BTRFS, when the btree was directly modified from the write path,
hence dirtied btree inode pages. With the implementation of
b888db2bd7b6
("Btrfs: Add delayed allocation to the extent based page tree code")
13 years ago the btree is no longer modified from the write path, hence
there is no point in calling this function. Just remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Linus Torvalds [Sun, 26 Jul 2020 21:14:06 +0000 (14:14 -0700)]
Linux 5.8-rc7
Linus Torvalds [Sun, 26 Jul 2020 20:46:57 +0000 (13:46 -0700)]
Merge tag 'kbuild-fixes-v5.8-3' of git://git./linux/kernel/git/masahiroy/linux-kbuild into master
Pull Kbuild fixes from Masahiro Yamada:
- do not use non-portable strsep() in a host program
- fix single target builds for external modules
- change Clang's --prefix option to make it work for the latest Clang
* tag 'kbuild-fixes-v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation
kbuild: fix single target builds for external modules
modpost: remove use of non-standard strsep() in HOSTCC code
Linus Torvalds [Sun, 26 Jul 2020 19:14:46 +0000 (12:14 -0700)]
Merge branch 'parisc-5.8-2' of git://git./linux/kernel/git/deller/parisc-linux into master
Pull parisc fixes from Helge Deller:
"Two fixes:
- Add the cmpxchg() function for pointers to u8 values. This fixes a
kernel linking error when building the tusb1210 driver (from Liam
Beguin).
- Add a define for atomic64_set_release() to fix CPU soft lockups
which happen because of missing unlocks while processing bit
operations (from John David Anglin)"
* 'parisc-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Add atomic64_set_release() define to avoid CPU soft lockups
parisc: add support for cmpxchg on u8 pointers
Linus Torvalds [Sun, 26 Jul 2020 16:33:25 +0000 (09:33 -0700)]
Merge tag 'char-misc-5.8-rc7' of git://git./linux/kernel/git/gregkh/char-misc into master
Pull char/misc driver fixes from Greg KH:
"Here are a few small driver fixes for 5.8-rc7
They include:
- habanalabs fixes
- tiny fpga driver fixes
- /dev/mem fixup from previous changes
- interconnect driver fixes
- binder fix
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
interconnect: msm8916: Fix buswidth of pcnoc_s nodes
interconnect: Do not skip aggregation for disabled paths
/dev/mem: Add missing memory barriers for devmem_inode
binder: Don't use mmput() from shrinker function.
habanalabs: prevent possible out-of-bounds array access
fpga: dfl: fix bug in port reset handshake
fpga: dfl: pci: reduce the scope of variable 'ret'
habanalabs: set 4s timeout for message to device CPU
habanalabs: set clock gating per engine
habanalabs: block WREG_BULK packet on PDMA
Linus Torvalds [Sun, 26 Jul 2020 16:29:22 +0000 (09:29 -0700)]
Merge tag 'driver-core-5.8-rc7' of git://git./linux/kernel/git/gregkh/driver-core into master
Pull driver core fix from Greg KH:
"A single driver core fix for 5.8-rc7. It resolves a problem found in
the previous fix for this code made in 5.8-rc6. Hopefully this is all
now cleared up, as this seems to be the last of the reported issues in
this area, and was tested on the problem hardware.
This patch has been in linux-next with no reported problems"
* tag 'driver-core-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
device property: Avoid NULL pointer dereference in device_get_next_child_node()
Linus Torvalds [Sun, 26 Jul 2020 16:14:59 +0000 (09:14 -0700)]
Merge tag 'staging-5.8-rc7' of git://git./linux/kernel/git/gregkh/staging into master
Pull staging driver fixes from Greg KH:
"Five small staging driver fixes for 5.8-rc7 to resolve some reported
problems:
- four comedi driver fixes for problems found with them
- a syzbot-found fix for the wlang-ng driver that resolves a much
reported problem.
All of these have been in linux-next with no reported issues"
* tag 'staging-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: wlan-ng: properly check endpoint types
staging: comedi: addi_apci_1564: check INSN_CONFIG_DIGITAL_TRIG shift
staging: comedi: addi_apci_1500: check INSN_CONFIG_DIGITAL_TRIG shift
staging: comedi: addi_apci_1032: check INSN_CONFIG_DIGITAL_TRIG shift
staging: comedi: ni_6527: fix INSN_CONFIG_DIGITAL_TRIG support
Linus Torvalds [Sun, 26 Jul 2020 16:09:43 +0000 (09:09 -0700)]
Merge tag 'tty-5.8-rc7' of git://git./linux/kernel/git/gregkh/tty into master
Pull tty/serial/fbcon fixes from Greg KH:
"Here are some small tty and serial and fbcon fixes for 5.8-rc7 to
resolve some reported issues.
The fbcon fix is in here as it was simpler to take it this way (and it
was acked by the maintainer) as it was related to the vt console fix
as well, both of which resolve syzbot-found issues in the console
handling code.
The other serial driver fixes are for small issues reported in the -rc
releases.
All of these have been in linux-next with no reported issues"
* tag 'tty-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: exar: Fix GPIO configuration for Sealevel cards based on XR17V35X
fbdev: Detect integer underflow at "struct fbcon_ops"->clear_margins.
serial: 8250_mtk: Fix high-speed baud rates clamping
serial: 8250: fix null-ptr-deref in serial8250_start_tx()
serial: tegra: drop bogus NULL tty-port checks
serial: tegra: fix CREAD handling for PIO
tty: xilinx_uartps: Really fix id assignment
vt: Reject zero-sized screen buffer size.
Linus Torvalds [Sun, 26 Jul 2020 16:02:29 +0000 (09:02 -0700)]
Merge tag 'usb-5.8-rc7' of git://git./linux/kernel/git/gregkh/usb into master
Pull USB fixes from Greg KH:
"Three small USB XHCI driver fixes for 5.8-rc7.
They all resolve some minor issues that have been reported on some
different platforms.
All of these have been in linux-next with no reported issues"
* tag 'usb-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: tegra: Fix allocation for the FPCI context
usb: xhci: Fix ASM2142/ASM3142 DMA addressing
usb: xhci-mtk: fix the failure of bandwidth allocation
Linus Torvalds [Sun, 26 Jul 2020 15:59:15 +0000 (08:59 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi into master
Pull SCSI fix from James Bottomley:
"Small core patch to fix a corner case bug: we forgot to run the queues
to handle starvation in the error exit from the scsi_queue_rq routine,
which can lead to hangs on error conditions"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: core: Run queue in case of I/O resource contention failure
Linus Torvalds [Sat, 25 Jul 2020 21:42:11 +0000 (14:42 -0700)]
Merge tag 'riscv-for-linus-5.8-rc7' of git://git./linux/kernel/git/riscv/linux into master
Pull RISC-V fixes from Palmer Dabbelt:
"A few more fixes this week:
- A fix to avoid using SBI calls during kasan initialization, as the
SBI calls themselves have not been probed yet.
- Three fixes related to systems with multiple memory regions"
* tag 'riscv-for-linus-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Parse all memory blocks to remove unusable memory
RISC-V: Do not rely on initrd_start/end computed during early dt parsing
RISC-V: Set maximum number of mapped pages correctly
riscv: kasan: use local_tlb_flush_all() to avoid uninitialized __sbi_rfence
Linus Torvalds [Sat, 25 Jul 2020 21:25:47 +0000 (14:25 -0700)]
Merge tag 'x86-urgent-2020-07-25' of git://git./linux/kernel/git/tip/tip into master
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- Fix a section end page alignment assumption that was causing
crashes
- Fix ORC unwinding on freshly forked tasks which haven't executed
yet and which have empty user task stacks
- Fix the debug.exception-trace=1 sysctl dumping of user stacks,
which was broken by recent maccess changes"
* tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/dumpstack: Dump user space code correctly again
x86/stacktrace: Fix reliable check for empty user task stacks
x86/unwind/orc: Fix ORC for newly forked tasks
x86, vmlinux.lds: Page-align end of ..page_aligned sections
Linus Torvalds [Sat, 25 Jul 2020 20:55:38 +0000 (13:55 -0700)]
Merge tag 'perf-urgent-2020-07-25' of git://git./linux/kernel/git/tip/tip into master
Pull uprobe fix from Ingo Molnar:
"Fix an interaction/regression between uprobes based shared library
tracing & GDB"
* tag 'perf-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix GDB regression
Linus Torvalds [Sat, 25 Jul 2020 20:27:12 +0000 (13:27 -0700)]
Merge tag 'timers-urgent-2020-07-25' of git://git./linux/kernel/git/tip/tip into master
Pull timer fix from Ingo Molnar:
"Fix a suspend/resume regression (crash) on TI AM3/AM4 SoC's"
* tag 'timers-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4
Linus Torvalds [Sat, 25 Jul 2020 20:24:40 +0000 (13:24 -0700)]
Merge tag 'sched-urgent-2020-07-25' of git://git./linux/kernel/git/tip/tip into master
Pull scheduler fixes from Ingo Molnar:
"Fix a race introduced by the recent loadavg race fix, plus add a debug
check for a hard to debug case of bogus wakeup function flags"
* tag 'sched-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Warn if garbage is passed to default_wake_function()
sched: Fix race against ptrace_freeze_trace()
Linus Torvalds [Sat, 25 Jul 2020 20:18:42 +0000 (13:18 -0700)]
Merge tag 'efi-urgent-2020-07-25' of git://git./linux/kernel/git/tip/tip into master
Pull EFI fixes from Ingo Molnar:
"Various EFI fixes:
- Fix the layering violation in the use of the EFI runtime services
availability mask in users of the 'efivars' abstraction
- Revert build fix for GCC v4.8 which is no longer supported
- Clean up some x86 EFI stub details, some of which are borderline
bugs that copy around garbage into padding fields - let's fix these
out of caution.
- Fix build issues while working on RISC-V support
- Avoid --whole-archive when linking the stub on arm64"
* tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: Revert "efi/x86: Fix build with gcc 4"
efi/efivars: Expose RT service availability via efivars abstraction
efi/libstub: Move the function prototypes to header file
efi/libstub: Fix gcc error around __umoddi3 for 32 bit builds
efi/libstub/arm64: link stub lib.a conditionally
efi/x86: Only copy upto the end of setup_header
efi/x86: Remove unused variables
Linus Torvalds [Sat, 25 Jul 2020 19:53:46 +0000 (12:53 -0700)]
Merge tag '5.8-rc6-cifs-fix' of git://git.samba.org/sfrench/cifs-2.6 into master
Pull cifs fix from Steve French:
"A fix for a recently discovered regression in rename to older servers
caused by a recent patch"
* tag '5.8-rc6-cifs-fix' of git://git.samba.org/sfrench/cifs-2.6:
Revert "cifs: Fix the target file was deleted when rename failed."
Linus Torvalds [Sat, 25 Jul 2020 18:50:59 +0000 (11:50 -0700)]
Merge git://git./linux/kernel/git/netdev/net into master
Pull networking fixes from David Miller:
1) Fix RCU locaking in iwlwifi, from Johannes Berg.
2) mt76 can access uninitialized NAPI struct, from Felix Fietkau.
3) Fix race in updating pause settings in bnxt_en, from Vasundhara
Volam.
4) Propagate error return properly during unbind failures in ax88172a,
from George Kennedy.
5) Fix memleak in adf7242_probe, from Liu Jian.
6) smc_drv_probe() can leak, from Wang Hai.
7) Don't muck with the carrier state if register_netdevice() fails in
the bonding driver, from Taehee Yoo.
8) Fix memleak in dpaa_eth_probe, from Liu Jian.
9) Need to check skb_put_padto() return value in hsr_fill_tag(), from
Murali Karicheri.
10) Don't lose ionic RSS hash settings across FW update, from Shannon
Nelson.
11) Fix clobbered SKB control block in act_ct, from Wen Xu.
12) Missing newlink in "tx_timeout" sysfs output, from Xiongfeng Wang.
13) IS_UDPLITE cleanup a long time ago, incorrectly handled
transformations involving UDPLITE_RECV_CC. From Miaohe Lin.
14) Unbalanced locking in netdevsim, from Taehee Yoo.
15) Suppress false-positive error messages in qed driver, from Alexander
Lobakin.
16) Out of bounds read in ax25_connect and ax25_sendmsg, from Peilin Ye.
17) Missing SKB release in cxgb4's uld_send(), from Navid Emamdoost.
18) Uninitialized value in geneve_changelink(), from Cong Wang.
19) Fix deadlock in xen-netfront, from Andera Righi.
19) flush_backlog() frees skbs with IRQs disabled, so should use
dev_kfree_skb_irq() instead of kfree_skb(). From Subash Abhinov
Kasiviswanathan.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
drivers/net/wan: lapb: Corrected the usage of skb_cow
dev: Defer free of skbs in flush_backlog
qrtr: orphan socket in qrtr_release()
xen-netfront: fix potential deadlock in xennet_remove()
flow_offload: Move rhashtable inclusion to the source file
geneve: fix an uninitialized value in geneve_changelink()
bonding: check return value of register_netdevice() in bond_newlink()
tcp: allow at most one TLP probe per flight
AX.25: Prevent integer overflows in connect and sendmsg
cxgb4: add missing release on skb in uld_send()
net: atlantic: fix PTP on AQC10X
AX.25: Prevent out-of-bounds read in ax25_sendmsg()
sctp: shrink stream outq when fails to do addstream reconf
sctp: shrink stream outq only when new outcnt < old outcnt
AX.25: Fix out-of-bounds read in ax25_connect()
enetc: Remove the mdio bus on PF probe bailout
net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload
net: ethernet: ave: Fix error returns in ave_init
drivers/net/wan/x25_asy: Fix to make it work
ipvs: fix the connection sync failed in some cases
...
Atish Patra [Wed, 15 Jul 2020 23:30:09 +0000 (16:30 -0700)]
riscv: Parse all memory blocks to remove unusable memory
Currently, maximum physical memory allowed is equal to -PAGE_OFFSET.
That's why we remove any memory blocks spanning beyond that size. However,
it is done only for memblock containing linux kernel which will not work
if there are multiple memblocks.
Process all memory blocks to figure out how much memory needs to be removed
and remove at the end instead of updating the memblock list in place.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Atish Patra [Wed, 15 Jul 2020 23:30:08 +0000 (16:30 -0700)]
RISC-V: Do not rely on initrd_start/end computed during early dt parsing
Currently, initrd_start/end are computed during early_init_dt_scan
but used during arch_setup. We will get the following panic if initrd is used
and CONFIG_DEBUG_VIRTUAL is turned on.
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] kernel BUG at arch/riscv/mm/physaddr.c:33!
[ 0.000000] Kernel BUG [#1]
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.8.0-rc4-00015-ged0b226fed02 #886
[ 0.000000] epc:
ffffffe0002058d2 ra :
ffffffe0000053f0 sp :
ffffffe001001f40
[ 0.000000] gp :
ffffffe00106e250 tp :
ffffffe001009d40 t0 :
ffffffe00107ee28
[ 0.000000] t1 :
0000000000000000 t2 :
ffffffe000a2e880 s0 :
ffffffe001001f50
[ 0.000000] s1 :
ffffffe0001383e8 a0 :
ffffffe00c087e00 a1 :
0000000080200000
[ 0.000000] a2 :
00000000010bf000 a3 :
ffffffe00106f3c8 a4 :
ffffffe0010bf000
[ 0.000000] a5 :
ffffffe000000000 a6 :
0000000000000006 a7 :
0000000000000001
[ 0.000000] s2 :
ffffffe00106f068 s3 :
ffffffe00106f070 s4 :
0000000080200000
[ 0.000000] s5 :
0000000082200000 s6 :
0000000000000000 s7 :
0000000000000000
[ 0.000000] s8 :
0000000080011010 s9 :
0000000080012700 s10:
0000000000000000
[ 0.000000] s11:
0000000000000000 t3 :
000000000001fe30 t4 :
000000000001fe30
[ 0.000000] t5 :
0000000000000000 t6 :
ffffffe00107c471
[ 0.000000] status:
0000000000000100 badaddr:
0000000000000000 cause:
0000000000000003
[ 0.000000] random: get_random_bytes called from print_oops_end_marker+0x22/0x46 with crng_init=0
To avoid the error, initrd_start/end can be computed from phys_initrd_start/size
in setup itself. It also improves the initrd placement by aligning the start
and size with the page size.
Fixes:
76d2a0493a17 ("RISC-V: Init and Halt Code")
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Xie He [Fri, 24 Jul 2020 16:33:47 +0000 (09:33 -0700)]
drivers/net/wan: lapb: Corrected the usage of skb_cow
This patch fixed 2 issues with the usage of skb_cow in LAPB drivers
"lapbether" and "hdlc_x25":
1) After skb_cow fails, kfree_skb should be called to drop a reference
to the skb. But in both drivers, kfree_skb is not called.
2) skb_cow should be called before skb_push so that is can ensure the
safety of skb_push. But in "lapbether", it is incorrectly called after
skb_push.
More details about these 2 issues:
1) The behavior of calling kfree_skb on failure is also the behavior of
netif_rx, which is called by this function with "return netif_rx(skb);".
So this function should follow this behavior, too.
2) In "lapbether", skb_cow is called after skb_push. This results in 2
logical issues:
a) skb_push is not protected by skb_cow;
b) An extra headroom of 1 byte is ensured after skb_push. This extra
headroom has no use in this function. It also has no use in the
upper-layer function that this function passes the skb to
(x25_lapb_receive_frame in net/x25/x25_dev.c).
So logically skb_cow should instead be called before skb_push.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subash Abhinov Kasiviswanathan [Thu, 23 Jul 2020 17:31:48 +0000 (11:31 -0600)]
dev: Defer free of skbs in flush_backlog
IRQs are disabled when freeing skbs in input queue.
Use the IRQ safe variant to free skbs here.
Fixes:
145dd5f9c88f ("net: flush the softnet backlog in process context")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Atish Patra [Wed, 15 Jul 2020 23:30:07 +0000 (16:30 -0700)]
RISC-V: Set maximum number of mapped pages correctly
Currently, maximum number of mapper pages are set to the pfn calculated
from the memblock size of the memblock containing kernel. This will work
until that memblock spans the entire memory. However, it will be set to
a wrong value if there are multiple memblocks defined in kernel
(e.g. with efi runtime services).
Set the the maximum value to the pfn calculated from dram size.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Linus Torvalds [Sat, 25 Jul 2020 01:30:24 +0000 (18:30 -0700)]
Merge tag 'pci-v5.8-fixes-2' of git://git./linux/kernel/git/helgaas/pci into master
Pull PCI fixes from Bjorn Helgaas:
- Reject invalid IRQ 0 command line argument for virtio_mmio because
IRQ 0 now generates warnings (Bjorn Helgaas)
- Revert "PCI/PM: Assume ports without DLL Link Active train links in
100 ms", which broke nouveau (Bjorn Helgaas)
* tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
virtio-mmio: Reject invalid IRQ 0 command line argument
Cong Wang [Fri, 24 Jul 2020 16:45:51 +0000 (09:45 -0700)]
qrtr: orphan socket in qrtr_release()
We have to detach sock from socket in qrtr_release(),
otherwise skb->sk may still reference to this socket
when the skb is released in tun->queue, particularly
sk->sk_wq still points to &sock->wq, which leads to
a UAF.
Reported-and-tested-by: syzbot+6720d64f31c081c2f708@syzkaller.appspotmail.com
Fixes:
28fb4e59a47d ("net: qrtr: Expose tunneling endpoint to user space")
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 25 Jul 2020 00:26:09 +0000 (17:26 -0700)]
Merge tag 'wireless-drivers-2020-07-24' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for v5.8
Second set of fixes for v5.8, and hopefully also the last. Three
important regressions fixed.
ath9k
* fix a regression which broke support for all ath9k usb devices
ath10k
* fix a regression which broke support for all QCA4019 AHB devices
iwlwifi
* fix a regression which broke support for some Killer Wireless-AC 1550 cards
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrea Righi [Fri, 24 Jul 2020 08:59:10 +0000 (10:59 +0200)]
xen-netfront: fix potential deadlock in xennet_remove()
There's a potential race in xennet_remove(); this is what the driver is
doing upon unregistering a network device:
1. state = read bus state
2. if state is not "Closed":
3. request to set state to "Closing"
4. wait for state to be set to "Closing"
5. request to set state to "Closed"
6. wait for state to be set to "Closed"
If the state changes to "Closed" immediately after step 1 we are stuck
forever in step 4, because the state will never go back from "Closed" to
"Closing".
Make sure to check also for state == "Closed" in step 4 to prevent the
deadlock.
Also add a 5 sec timeout any time we wait for the bus state to change,
to avoid getting stuck forever in wait_event().
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 24 Jul 2020 23:27:54 +0000 (16:27 -0700)]
Merge tag 'nfsd-5.8-2' of git://linux-nfs.org/~bfields/linux into master
Pull nfsd fix from Bruce Fields:
"Just one fix for a NULL dereference if someone happens to read
/proc/fs/nfsd/client/../state at the wrong moment"
* tag 'nfsd-5.8-2' of git://linux-nfs.org/~bfields/linux:
nfsd4: fix NULL dereference in nfsd/clients display code
Herbert Xu [Fri, 24 Jul 2020 00:50:22 +0000 (10:50 +1000)]
flow_offload: Move rhashtable inclusion to the source file
I noticed that touching linux/rhashtable.h causes lib/vsprintf.c to
be rebuilt. This dependency came through a bogus inclusion in the
file net/flow_offload.h. This patch moves it to the right place.
This patch also removes a lingering rhashtable inclusion in cls_api
created by the same commit.
Fixes:
4e481908c51b ("flow_offload: move tc indirect block to...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 24 Jul 2020 21:24:35 +0000 (14:24 -0700)]
Merge branch 'akpm' into master (patches from Andrew)
Merge misc fixes from Andrew Morton:
"Subsystems affected by this patch series: mm/pagemap, mm/shmem,
mm/hotfixes, mm/memcg, mm/hugetlb, mailmap, squashfs, scripts,
io-mapping, MAINTAINERS, and gdb"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
scripts/gdb: fix lx-symbols 'gdb.error' while loading modules
MAINTAINERS: add KCOV section
io-mapping: indicate mapping failure
scripts/decode_stacktrace: strip basepath from all paths
squashfs: fix length field overlap check in metadata reading
mailmap: add entry for Mike Rapoport
khugepaged: fix null-pointer dereference due to race
mm/hugetlb: avoid hardcoding while checking if cma is enabled
mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
mm/memcg: fix refcount error while moving and swapping
mm/memcontrol: fix OOPS inside mem_cgroup_get_nr_swap_pages()
mm: initialize return of vm_insert_pages
vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
Linus Torvalds [Fri, 24 Jul 2020 21:19:00 +0000 (14:19 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs into master
Pull xtensa csum regression fix from Al Viro:
"Max Filippov caught a breakage introduced in xtensa this cycle
by the csum_and_copy_..._user() series.
Cut'n'paste from the wrong source - the check that belongs
in csum_and_copy_to_user() ended up both there and in
csum_and_copy_from_user()"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
xtensa: fix access check in csum_and_copy_from_user
Linus Torvalds [Fri, 24 Jul 2020 21:16:12 +0000 (14:16 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux into master
Pull arm64 fix from Will Deacon:
"Fix compat vDSO build flags for recent versions of clang to tell it
where to find the assembler"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: vdso32: Fix '--prefix=' value for newer versions of clang