md/raid1: fix test for 'was read error from last working device'.
authorNeilBrown <neilb@suse.com>
Thu, 23 Jul 2015 23:22:16 +0000 (09:22 +1000)
committerNeilBrown <neilb@suse.com>
Fri, 24 Jul 2015 03:37:21 +0000 (13:37 +1000)
When we get a read error from the last working device, we don't
try to repair it, and don't fail the device.  We simple report a
read error to the caller.

However the current test for 'is this the last working device' is
wrong.
When there is only one fully working device, it assumes that a
non-faulty device is that device.  However a spare which is rebuilding
would be non-faulty but so not the only working device.

So change the test from "!Faulty" to "In_sync".  If ->degraded says
there is only one fully working device and this device is in_sync,
this must be the one.

This bug has existed since we allowed read_balance to read from
a recovering spare in v3.0

Reported-and-tested-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Cc: stable@vger.kernel.org (v3.0+)
Signed-off-by: NeilBrown <neilb@suse.com>
drivers/md/raid1.c

index f80f1af61ce70bce15a72d242d87c1c34da49a79..50cf0c893b16851d83c0e9a3cbe58268c9b80ffc 100644 (file)
@@ -336,7 +336,7 @@ static void raid1_end_read_request(struct bio *bio, int error)
                spin_lock_irqsave(&conf->device_lock, flags);
                if (r1_bio->mddev->degraded == conf->raid_disks ||
                    (r1_bio->mddev->degraded == conf->raid_disks-1 &&
-                    !test_bit(Faulty, &conf->mirrors[mirror].rdev->flags)))
+                    test_bit(In_sync, &conf->mirrors[mirror].rdev->flags)))
                        uptodate = 1;
                spin_unlock_irqrestore(&conf->device_lock, flags);
        }