powerpc/bitops: Force inlining of fls()
author Christophe Leroy <christophe.leroy@csgroup.eu>
Fri, 11 Feb 2022 08:51:32 +0000 (09:51 +0100)
committer Michael Ellerman <mpe@ellerman.id.au>
Tue, 8 Mar 2022 11:33:03 +0000 (22:33 +1100)
commit 0b0057cc4193c7cd9c0829a440e4901b29ce4ff8
tree 284a6e964c2b9961fbdc1556dc877785770e48ba
parent 6b3a3e12f8e6eea47428bb39aaf58832b50bb379

Building a kernel with CONFIG_CC_OPTIMIZE_FOR_SIZE leads to
the following functions being copied several times in vmlinux:

31 times __ilog2_u32()
34 times fls()

Disassembly follows:

c00f476c <fls>:
c00f476c: 7c 63 00 34  cntlzw  r3,r3
c00f4770: 20 63 00 20  subfic  r3,r3,32
c00f4774: 4e 80 00 20  blr

c00f4778 <__ilog2_u32>:
c00f4778: 94 21 ff f0  stwu    r1,-16(r1)
c00f477c: 7c 08 02 a6  mflr    r0
c00f4780: 90 01 00 14  stw     r0,20(r1)
c00f4784: 4b ff ff e9  bl      c00f476c <fls>
c00f4788: 80 01 00 14  lwz     r0,20(r1)
c00f478c: 38 63 ff ff  addi    r3,r3,-1
c00f4790: 7c 08 03 a6  mtlr    r0
c00f4794: 38 21 00 10  addi    r1,r1,16
c00f4798: 4e 80 00 20  blr

When forcing inlining of fls(), we get:

c0008b80 <__ilog2_u32>:
c0008b80: 7c 63 00 34  cntlzw  r3,r3
c0008b84: 20 63 00 1f  subfic  r3,r3,31
c0008b88: 4e 80 00 20  blr

vmlinux size gets reduced by 1 kbyte with that change.
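
The effect above can be illustrated with a minimal user-space sketch. It is
not the kernel's implementation: the names sketch_always_inline, sketch_fls()
and sketch_ilog2_u32() are hypothetical, and the compiler builtin
__builtin_clz() stands in for the cntlzw instruction. It assumes GCC or
Clang. Once the helper is forced inline, the compiler is free to fold
"32 - clz" followed by "- 1" into a single "31 - clz", which matches the
subfic r3,r3,31 seen in the second disassembly.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __always_inline macro. */
#define sketch_always_inline inline __attribute__((__always_inline__))

/*
 * Find last set bit, counting from 1; returns 0 when x is 0.
 * __builtin_clz(0) is undefined, so zero is handled explicitly here
 * (an assumption of this sketch, not taken from the kernel source).
 */
static sketch_always_inline int sketch_fls(uint32_t x)
{
	return x ? 32 - __builtin_clz(x) : 0;
}

/*
 * Integer log2 built on top of sketch_fls(). With the helper inlined,
 * no stack frame or link-register save is needed for a call.
 */
static sketch_always_inline int sketch_ilog2_u32(uint32_t n)
{
	return sketch_fls(n) - 1;
}
```

For example, sketch_fls(1) is 1, sketch_fls(0x80000000) is 32, and
sketch_ilog2_u32(1024) is 10.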

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/adc9c9d6378f6b5008246ca717993d7870188efb.1644569473.git.christophe.leroy@csgroup.eu
arch/powerpc/include/asm/bitops.h