btrfs: Fix race condition between delayed refs and blockgroup removal
authorNikolay Borisov <nborisov@suse.com>
Wed, 18 Apr 2018 06:41:54 +0000 (09:41 +0300)
committerDavid Sterba <dsterba@suse.com>
Fri, 20 Apr 2018 17:17:25 +0000 (19:17 +0200)
commit5e388e95815408c27f3612190d089afc0774b870
treead9882fb36439900d634df34ea76796957732c3d
parent92d32170847bfff2dd08af2c016085779f2fd2a1
btrfs: Fix race condition between delayed refs and blockgroup removal

When the delayed refs for a head are all run, eventually
cleanup_ref_head is called which (in case of deletion) obtains a
reference for the relevant btrfs_space_info struct by querying the bg
for the range. This is problematic because when the last extent of a
bg is deleted a race window emerges between removal of that bg and the
subsequent invocation of cleanup_ref_head. This can result in cache being null
and either a null pointer dereference or assertion failure.

task: ffff8d04d31ed080 task.stack: ffff9e5dc10cc000
RIP: 0010:assfail.constprop.78+0x18/0x1a [btrfs]
RSP: 0018:ffff9e5dc10cfbe8 EFLAGS: 00010292
RAX: 0000000000000044 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8d04ffc1f868 RSI: ffff8d04ffc178c8 RDI: ffff8d04ffc178c8
RBP: ffff8d04d29e5ea0 R08: 00000000000001f0 R09: 0000000000000001
R10: ffff9e5dc0507d58 R11: 0000000000000001 R12: ffff8d04d29e5ea0
R13: ffff8d04d29e5f08 R14: ffff8d04efe29b40 R15: ffff8d04efe203e0
FS:  00007fbf58ead500(0000) GS:ffff8d04ffc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe6c6975648 CR3: 0000000013b2a000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __btrfs_run_delayed_refs+0x10e7/0x12c0 [btrfs]
 btrfs_run_delayed_refs+0x68/0x250 [btrfs]
 btrfs_should_end_transaction+0x42/0x60 [btrfs]
 btrfs_truncate_inode_items+0xaac/0xfc0 [btrfs]
 btrfs_evict_inode+0x4c6/0x5c0 [btrfs]
 evict+0xc6/0x190
 do_unlinkat+0x19c/0x300
 do_syscall_64+0x74/0x140
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7fbf589c57a7

To fix this, introduce a new flag "is_system" to head_ref structs,
which is populated at insertion time. This allows to decouple the
querying for the spaceinfo from querying the possibly deleted bg.

Fixes: d7eae3403f46 ("Btrfs: rework delayed ref total_bytes_pinned accounting")
CC: stable@vger.kernel.org # 4.14+
Suggested-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
fs/btrfs/delayed-ref.c
fs/btrfs/delayed-ref.h
fs/btrfs/extent-tree.c