I hadn't had enough coffee when I wrote this. Currently, the final
increment of buf depends on the value loaded from the table, which
causes gcc to emit a cmov immediately before the return. It is better
to make the increment depend on r, since it can then be computed in
parallel with the final load/store pair. It also shaves 16 bytes of
.text.
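
The equivalence of the old and new tails can be checked with a small
userspace sketch. Everything here is an assumption made for illustration:
store_pair() is a hypothetical stand-in for the decpair[] store (it writes
the two ASCII digits of r, least significant first, without the u16 cast),
and tail_old(), tail_new() and check_tails_agree() are not kernel functions.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for "*((u16 *)buf) = decpair[r]": store the two
 * ASCII digits of r (0 <= r < 100), least significant digit first.
 */
static void store_pair(char *buf, unsigned r)
{
	buf[0] = '0' + r % 10;
	buf[1] = '0' + r / 10;
}

/* Old tail: the increment depends on the byte that was just stored. */
static char *tail_old(char *buf, unsigned r)
{
	store_pair(buf, r);
	buf += 2;
	if (buf[-1] == '0')
		buf--;
	return buf;
}

/* New tail: the increment depends only on r, not on the stored data. */
static char *tail_new(char *buf, unsigned r)
{
	store_pair(buf, r);
	buf += r < 10 ? 1 : 2;
	return buf;
}

/* Compare both tails for every r in [0, 100); returns the count checked. */
static unsigned check_tails_agree(void)
{
	unsigned r;

	for (r = 0; r < 100; r++) {
		char a[2], b[2];

		assert(tail_old(a, r) - a == tail_new(b, r) - b);
		assert(memcmp(a, b, 2) == 0);
	}
	return r;
}
```

Both tails write the same two bytes and advance buf by the same amount,
including the r == 0 case that ip4_string relies on. The only difference
is what the increment has to wait for.
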
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
/*
* This will print a single '0' even if r == 0, since we would
- * immediately jump to out_r where two 0s would be written and one of
- * them then discarded. This is needed by ip4_string below. All other
- * callers pass a non-zero value of r.
+ * immediately jump to out_r where two 0s would be written but only
+ * one of them accounted for in buf. This is needed by ip4_string
+ * below. All other callers pass a non-zero value of r.
*/
static noinline_for_stack
char *put_dec_trunc8(char *buf, unsigned r)
out_r:
/* 1 <= r < 100 */
*((u16 *)buf) = decpair[r];
- buf += 2;
- if (buf[-1] == '0')
- buf--;
+ buf += r < 10 ? 1 : 2;
return buf;
}