btrfs: fix deadlock while running replace and defrag concurrently
author Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Mon, 10 Nov 2014 07:36:08 +0000 (15:36 +0800)
committer Chris Mason <clm@fb.com>
Fri, 21 Nov 2014 01:20:08 +0000 (17:20 -0800)
commit 321592427c0146126aadfab8a9b663de1875c9f4
tree 7cf8d8427168e4410b369d67555a0c5c3efd96aa
parent 5f5bc6b1e2d5a6f827bc860ef2dc5b6f365d1339

This can be reproduced by fstests: btrfs/070

The scenario is like the following:

replace worker thread              defrag thread
---------------------              -------------
copy_nocow_pages_worker            btrfs_defrag_file
  copy_nocow_pages_for_inode         ...
                                     btrfs_writepages
  |A| lock_extent_bits                 extent_write_cache_pages
                                   |B|   lock_page
                                           __extent_writepage
    ...                                      writepage_delalloc
                                               find_lock_delalloc_range
                                   |B|           lock_extent_bits
  find_or_create_page
    pagecache_get_page
  |A| lock_page

This leads to an ABBA deadlock. To fix it:
o Change the locking to an AABB pattern, i.e. call @unlock_extent_bits()
  before @lock_page(). As a result, @extent_read_full_page_nolock() is no
  longer called in a locked context, so change it back to
  @extent_read_full_page() to regain protection.

o Since we now @unlock_extent_bits() earlier, by the time we reach
  @write_page_nocow() the extent may no longer point at the physical
  block we want, so we have to check it before writing.
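
The two points above can be illustrated with a small userspace analogy. This is only a sketch under stated assumptions, not the code in fs/btrfs/scrub.c: the pthread mutexes stand in for the extent lock and the page lock, and every name below (extent_matches, relocate_extent, copy_page_aabb, race_window_hook, move_extent_away) is hypothetical. It shows the extent lock being dropped before the page lock is taken, and the mapping being re-checked before the write.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these are btrfs APIs. */
pthread_mutex_t extent_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the extent lock */
pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;   /* plays the page lock */
uint64_t extent_physical = 4096; /* block the extent currently maps to */
unsigned pages_written;          /* successful writes, guarded by page_lock */
void (*race_window_hook)(void);  /* test hook: runs while no lock is held */

/* Take the extent lock, compare the mapping, and release the lock again. */
bool extent_matches(uint64_t expected)
{
	bool ok;

	pthread_mutex_lock(&extent_lock);
	ok = (extent_physical == expected);
	pthread_mutex_unlock(&extent_lock);
	return ok;
}

/* Defrag-like side: page lock first, then extent lock. */
void relocate_extent(uint64_t new_physical)
{
	pthread_mutex_lock(&page_lock);
	pthread_mutex_lock(&extent_lock);
	extent_physical = new_physical;
	pthread_mutex_unlock(&extent_lock);
	pthread_mutex_unlock(&page_lock);
}

/* Example hook that simulates defrag moving the extent in the race window. */
void move_extent_away(void)
{
	relocate_extent(8192);
}

/* Replace-like side: the extent lock is released before the page lock is taken. */
bool copy_page_aabb(uint64_t expected)
{
	bool ok;

	/* A..A: initial check; the extent lock is not held on return. */
	if (!extent_matches(expected))
		return false;

	if (race_window_hook)
		race_window_hook(); /* the extent may change here */

	/* B..B: page lock taken with no extent lock held. */
	pthread_mutex_lock(&page_lock);
	/*
	 * Re-check before writing. This nests page -> extent, the same
	 * order relocate_extent() uses, so the two sides cannot form ABBA.
	 */
	ok = extent_matches(expected);
	if (ok)
		pages_written++; /* stands in for the actual write */
	pthread_mutex_unlock(&page_lock);
	return ok;
}
```

In this sketch a caller that sees copy_page_aabb() return false would simply skip the write, since the data no longer lives at the expected block.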

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Tested-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
fs/btrfs/scrub.c