btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes
authorChris Mason <clm@fb.com>
Fri, 15 Dec 2017 19:58:27 +0000 (11:58 -0800)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 10 Jan 2018 08:31:17 +0000 (09:31 +0100)
commitb9fa21f3daa6533ab2303caad72c996e3924cf3b
tree9a3f62d5288cc66333db2167a54f0b2f6f06c72d
parent319122a71f306943a75217b3e78a9a0f8e9c1e95
btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes

commit ec35e48b286959991cdbb886f1bdeda4575c80b4 upstream.

refcounts have a generic implementation and an asm optimized one.  The
generic version has extra debugging to make sure that once a refcount
goes to zero, refcount_inc won't increase it.

The btrfs delayed inode code wasn't expecting this, and we're tripping
over the warnings when the generic refcounts are used.  We ended up with
this race:

Process A                                         Process B
                                                  btrfs_get_delayed_node()
  spin_lock(root->inode_lock)
  radix_tree_lookup()
__btrfs_release_delayed_node()
refcount_dec_and_test(&delayed_node->refs)
our refcount is now zero
  refcount_add(2) <---
  warning here, refcount
                                                  unchanged

spin_lock(root->inode_lock)
radix_tree_delete()

With the generic refcounts, we actually warn again when process B above
tries to release his refcount because refcount_add() turned into a
no-op.

We saw this in production on older kernels without the asm optimized
refcounts.

The fix used here is to use refcount_inc_not_zero() to detect when the
object is in the middle of being freed and return NULL.  This is almost
always the right answer anyway, since we usually end up pitching the
delayed_node if it didn't have fresh data in it.

This also changes __btrfs_release_delayed_node() to remove the extra
check for zero refcounts before radix tree deletion.
btrfs_get_delayed_node() was the only path that was allowing refcounts
to go from zero to one.

Fixes: 6de5f18e7b0da ("btrfs: fix refcount_t usage when deleting btrfs_delayed_node")
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fs/btrfs/delayed-inode.c