libceph: don't go through with the mapping if the PG is too wide
author Ilya Dryomov <idryomov@gmail.com>
Wed, 8 Feb 2017 17:57:48 +0000 (18:57 +0100)
committer Ilya Dryomov <idryomov@gmail.com>
Mon, 20 Feb 2017 11:16:11 +0000 (12:16 +0100)
With EC overwrites maturing, the kernel client will be getting exposed
to potentially very wide EC pools.  While "min(pi->size, X)" works fine
when the cluster is stable and happy, truncating OSD sets interferes
with resend logic (ceph_is_new_interval(), etc).  Abort the mapping if
the pool is too wide, assigning the request to the homeless session.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
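
The difference between the old and new behaviour can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: MAX_RAW_OSDS, struct osd_set, fake_crush(), map_truncating() and map_rejecting() are hypothetical names, the cap of 4 is arbitrary, and the int return value is an addition for demonstration (the kernel's pg_to_raw_osds() is void and simply returns early). The point it tries to show is that truncation silently hands back a shorter set than the pool asked for, whereas rejection leaves the set empty so the caller can tell the mapping was not performed.

```c
#include <stdio.h>

/* Hypothetical cap; stands in for ARRAY_SIZE(raw->osds). */
#define MAX_RAW_OSDS 4

/* Hypothetical, simplified stand-in for the raw OSD set. */
struct osd_set {
	int osds[MAX_RAW_OSDS];
	int size;
};

/* Hypothetical stand-in for do_crush(): fills 'want' slots with dummy ids. */
static int fake_crush(int *out, int want)
{
	for (int i = 0; i < want; i++)
		out[i] = 10 + i;
	return want;
}

/* Old behaviour: clamp the requested width to the array size. */
static int map_truncating(int pool_size, struct osd_set *raw)
{
	int want = pool_size < MAX_RAW_OSDS ? pool_size : MAX_RAW_OSDS;

	raw->size = fake_crush(raw->osds, want);
	return 0;
}

/* New behaviour: refuse to map a pool that does not fit. */
static int map_rejecting(int pool_size, struct osd_set *raw)
{
	raw->size = 0;	/* assumption: the set starts out empty */
	if (pool_size > MAX_RAW_OSDS) {
		fprintf(stderr, "pool too wide: size %d > %d\n",
			pool_size, MAX_RAW_OSDS);
		return -1;
	}

	raw->size = fake_crush(raw->osds, pool_size);
	return 0;
}
```

With a pool size of 6, map_truncating() reports success with a 4-entry set, while map_rejecting() logs the error and leaves the set empty. A pool size of 3 maps identically in both.
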
net/ceph/osdmap.c

index 2374956..6824c0e 100644
@@ -2013,8 +2013,14 @@ static void pg_to_raw_osds(struct ceph_osdmap *osdmap,
                return;
        }
 
-       len = do_crush(osdmap, ruleno, pps, raw->osds,
-                      min_t(int, pi->size, ARRAY_SIZE(raw->osds)),
+       if (pi->size > ARRAY_SIZE(raw->osds)) {
+               pr_err_ratelimited("pool %lld ruleset %d type %d too wide: size %d > %zu\n",
+                      pi->id, pi->crush_ruleset, pi->type, pi->size,
+                      ARRAY_SIZE(raw->osds));
+               return;
+       }
+
+       len = do_crush(osdmap, ruleno, pps, raw->osds, pi->size,
                       osdmap->osd_weight, osdmap->max_osd);
        if (len < 0) {
                pr_err("error %d from crush rule %d: pool %lld ruleset %d type %d size %d\n",