author Craig Topper <craig.topper@sifive.com>
Mon, 16 May 2022 16:27:43 +0000 (09:27 -0700)
committer Craig Topper <craig.topper@sifive.com>
Mon, 16 May 2022 16:27:44 +0000 (09:27 -0700)
commit 1c4880a2d39fbd95edced0dd97c34a9f53bf62ff
tree 336664fb7135eca824b8ae4dcf1d393cec6c10d6
parent e6fc8454bee5dc89be27fe1db826fb0bb30d74aa
[TargetLowering] Expand the last stage of i16 popcnt using shift+add+and instead of mul+shift.

If we used a multiply, it would be by 0x0101, which is 1 more than a power
of 2. Some targets would expand such a multiply to shl+add. By avoiding the
multiply earlier, we can generate better code.
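
For illustration, here is a minimal scalar sketch in C++ of the two
alternatives for the final stage. It is not the SelectionDAG code from
this patch. The function names are hypothetical, and the 0x00FF mask in
the shift+add+and form is an assumption chosen so that only the low
byte (the sum of the two per-byte counts) survives.

```cpp
#include <cstdint>

// First three stages, shared by both variants. Afterwards each byte of
// the result holds the population count (0..8) of the matching byte of x.
// Hypothetical helper, written in 32-bit arithmetic to sidestep integer
// promotion surprises with uint16_t.
inline uint16_t bytewise_counts(uint16_t x) {
  uint32_t v = x;
  v = v - ((v >> 1) & 0x5555u);              // 2-bit partial sums
  v = (v & 0x3333u) + ((v >> 2) & 0x3333u);  // 4-bit partial sums
  v = (v + (v >> 4)) & 0x0F0Fu;              // per-byte sums
  return static_cast<uint16_t>(v);
}

// Last stage via mul+shift: multiplying by 0x0101 adds the low byte into
// the high byte (modulo 2^16), and the shift extracts that high byte.
inline uint16_t popcnt16_mul(uint16_t x) {
  uint32_t v = bytewise_counts(x);
  return static_cast<uint16_t>(((v * 0x0101u) & 0xFFFFu) >> 8);
}

// Last stage via shift+add+and: add the high byte into the low byte,
// then mask off everything above the low byte.
inline uint16_t popcnt16_add(uint16_t x) {
  uint32_t v = bytewise_counts(x);
  return static_cast<uint16_t>((v + (v >> 8)) & 0x00FFu);
}
```

Both functions return the same value for every 16-bit input. The second
form simply never creates the multiply that a target would otherwise
have to expand or select.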

Note, PowerPC doesn't do the shl+add expansion of multiply so one of
the tests increased in instruction count.

This is limited to scalars because the new expansion almost always
increased the number of instructions in vector tests.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D125638
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
llvm/test/CodeGen/PowerPC/popcnt-zext.ll
llvm/test/CodeGen/RISCV/ctlz-cttz-ctpop.ll
llvm/test/CodeGen/X86/parity-vec.ll
llvm/test/CodeGen/X86/popcnt.ll