EDAC/mc: Fix grain_bits calculation
author Robert Richter <rrichter@marvell.com>
Mon, 24 Jun 2019 15:08:55 +0000 (15:08 +0000)
committer Borislav Petkov <bp@suse.de>
Sat, 3 Aug 2019 10:05:51 +0000 (12:05 +0200)
The grain in EDAC is defined as "minimum granularity for an error
report, in bytes". The following calculation of the grain_bits in
edac_mc is wrong:

grain_bits = fls_long(e->grain) + 1;

Where grain_bits is defined as:

grain = 1 << grain_bits

Example:

grain = 8 # 64 bit (8 bytes)
grain_bits = fls_long(8) + 1
grain_bits = 4 + 1 = 5

grain = 1 << grain_bits
grain = 1 << 5 = 32

Replace it with the correct calculation:

grain_bits = fls_long(e->grain - 1);

The example now gives:

grain_bits = fls_long(8 - 1)
grain_bits = fls_long(7)
grain_bits = 3

grain = 1 << 3 = 8
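
The arithmetic above can be reproduced outside the kernel with a small
sketch. The helper fls_long_sketch() below is a hypothetical userspace
stand-in that only mimics the documented behaviour of fls_long() (1-based
index of the most significant set bit, 0 if no bit is set); it is not the
kernel implementation, and grain_bits_old()/grain_bits_new() are
illustrative names, not functions from edac_mc.c.

```c
/*
 * Hypothetical userspace stand-in for the kernel's fls_long():
 * returns the 1-based position of the most significant set bit,
 * or 0 when x is 0.
 */
static unsigned int fls_long_sketch(unsigned long x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Calculation before the fix: overshoots by two bits for powers of two. */
static unsigned int grain_bits_old(unsigned long grain)
{
	return fls_long_sketch(grain) + 1;
}

/*
 * Calculation after the fix: smallest n with (1UL << n) >= grain.
 * Assumes grain != 0; a zero grain must be handled by the caller.
 */
static unsigned int grain_bits_new(unsigned long grain)
{
	return fls_long_sketch(grain - 1);
}
```

With these helpers, grain_bits_old(8) yields 5 (so 1 << 5 = 32), while
grain_bits_new(8) yields 3 (so 1 << 3 = 8). For a grain that is not a
power of two, such as 24, the new calculation rounds up to 5 bits.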

Also, check that the hardware reports a reasonable grain != 0 and, if
not, fall back to 1-byte granularity with a warning.
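
Why the zero check matters can be seen in a short sketch: with unsigned
arithmetic, grain - 1 wraps around when grain is 0, so the bit count
would come out as the full width of unsigned long. The code below is an
illustration under stated assumptions, not the kernel code: it uses a
hypothetical userspace stand-in for fls_long(), and a plain fprintf()
in place of WARN_ON_ONCE() (so it warns on every call, not just once).

```c
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the kernel's fls_long():
 * 1-based position of the most significant set bit, 0 when x is 0.
 */
static unsigned int fls_long_sketch(unsigned long x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/*
 * Illustrative equivalent of the sanity check in the patch: a grain
 * of 0 is replaced by 1 (with a warning) before computing the bits.
 * The caller's grain value is updated in place, as the patch does
 * with e->grain.
 */
static unsigned int grain_bits_checked(unsigned long *grain)
{
	if (*grain == 0) {
		fprintf(stderr, "grain is 0, falling back to 1 byte\n");
		*grain = 1;
	}
	return fls_long_sketch(*grain - 1);
}
```

Without the check, fls_long_sketch(0UL - 1) returns the number of bits
in unsigned long; with it, a zero grain is corrected to 1 and the result
is 0 bits, i.e. 1 << 0 = 1 byte.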

 [ bp: massage a bit. ]

Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190624150758.6695-2-rrichter@marvell.com
drivers/edac/edac_mc.c

index 64922c8..d899d86 100644
@@ -1235,9 +1235,13 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
        if (p > e->location)
                *(p - 1) = '\0';
 
-       /* Report the error via the trace interface */
-       grain_bits = fls_long(e->grain) + 1;
+       /* Sanity-check driver-supplied grain value. */
+       if (WARN_ON_ONCE(!e->grain))
+               e->grain = 1;
+
+       grain_bits = fls_long(e->grain - 1);
 
+       /* Report the error via the trace interface */
        if (IS_ENABLED(CONFIG_RAS))
                trace_mc_event(type, e->msg, e->label, e->error_count,
                               mci->mc_idx, e->top_layer, e->mid_layer,