md/raid1: fix request counting bug in new 'barrier' code.
author NeilBrown <neilb@suse.de>
Tue, 14 Jan 2014 00:56:14 +0000 (11:56 +1100)
committer NeilBrown <neilb@suse.de>
Tue, 14 Jan 2014 05:44:07 +0000 (16:44 +1100)
The new iobarrier implementation in raid1 (which keeps normal writes
and resync activity separate) counts every request that is not before
the current resync point in either next_window_requests or
current_window_requests.
It flags that the request is counted by setting ->start_next_window.

allow_barrier follows this model exactly and decrements one of the
*_window_requests if and only if ->start_next_window is set.

However, wait_barrier(), which increments *_window_requests, uses a
slightly different test for setting ->start_next_window (which is set
from the return value of this function).
So there is a possibility of the counts getting out of sync, and this
leads to the resync hanging.

So change wait_barrier() to return a non-zero value in exactly the
same cases that it increments *_window_requests.
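
The mismatch can be illustrated with a small user-space model. This is
only a sketch, not the kernel code: struct model_conf, the model_*
functions, run_scenario(), MODEL_DISTANCE and the sector values are all
hypothetical, and locking, MaxSector handling and the wait itself are
left out. The old variant decides the return value with a test outside
the counting branch; the fixed variant sets it only inside that branch.

```c
/* Simplified model of the raid1 window counters (hypothetical names). */
#define MODEL_DISTANCE 100LL /* stand-in for NEXT_NORMALIO_DISTANCE */

struct model_conf {
	long long next_resync;
	long long start_next_window;
	int current_window_requests;
	int next_window_requests;
};

/* Returns non-zero when the caller should record ->start_next_window. */
static long long model_wait_barrier(struct model_conf *conf,
				    long long bi_sector, int fixed)
{
	long long sector = 0;

	if (conf->next_resync + MODEL_DISTANCE <= bi_sector) {
		if (conf->start_next_window + MODEL_DISTANCE <= bi_sector)
			conf->next_window_requests++;
		else
			conf->current_window_requests++;
		/* Fixed variant: report only when a counter was bumped. */
		if (fixed)
			sector = conf->start_next_window;
	}
	/* Old variant: separate test that can fire without counting. */
	if (!fixed && bi_sector >= conf->start_next_window)
		sector = conf->start_next_window;
	return sector;
}

/* Decrements a counter if and only if start_next_window was recorded. */
static void model_allow_barrier(struct model_conf *conf,
				long long start_next_window,
				long long bi_sector)
{
	if (!start_next_window)
		return;
	if (start_next_window == conf->start_next_window &&
	    conf->start_next_window + MODEL_DISTANCE <= bi_sector)
		conf->next_window_requests--;
	else
		conf->current_window_requests--;
}

/* One write at bi_sector; returns the sum of both counters afterwards. */
static int run_scenario(long long bi_sector, int fixed)
{
	/* Assumed state: resync has moved on since the window was set. */
	struct model_conf conf = { 300, 200, 0, 0 };
	long long snw = model_wait_barrier(&conf, bi_sector, fixed);

	model_allow_barrier(&conf, snw, bi_sector);
	return conf.current_window_requests + conf.next_window_requests;
}
```

With these assumed values, a write at sector 250 lies at or beyond
start_next_window (200) but below next_resync + MODEL_DISTANCE (400).
The old variant returns non-zero without incrementing anything, so the
later decrement leaves the counters unbalanced; the fixed variant
returns zero and the counters stay balanced.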

Bug was introduced in 3.13-rc1.

Reported-by: Bruno Wolff III <bruno@wolff.to>
URL: https://bugzilla.kernel.org/show_bug.cgi?id=68061
Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761
Cc: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
drivers/md/raid1.c

index 1e5a540..a49cfcc 100644
@@ -924,9 +924,8 @@ static sector_t wait_barrier(struct r1conf *conf, struct bio *bio)
                                conf->next_window_requests++;
                        else
                                conf->current_window_requests++;
-               }
-               if (bio->bi_sector >= conf->start_next_window)
                        sector = conf->start_next_window;
+               }
        }
 
        conf->nr_pending++;