[AArch64] Improve popcount expansion
author Wilco Dijkstra <wdijkstr@arm.com>
Wed, 12 Feb 2020 18:19:25 +0000 (18:19 +0000)
committer Wilco Dijkstra <wdijkstr@arm.com>
Wed, 12 Feb 2020 18:19:25 +0000 (18:19 +0000)
commit 9921bbf9b2e27568d952fe6ee5bc083c93bbf7fd
tree c475f27f25b512752a4c5904a3aaf3c584047a2f
parent e5cc04a73a3e212114ca9725911eaaa66d32303c

The popcount expansion uses umov to zero-extend the result and move it
back to the integer register file.  If we model ADDV as a zero-extending
operation, fmov can be used to move the result back to the integer side.
This results in a ~0.5% speedup on deepsjeng on Cortex-A57.

A typical __builtin_popcount expansion is now:

	fmov	s0, w0
	cnt	v0.8b, v0.8b
	addv	b0, v0.8b
	fmov	w0, s0
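
For illustration only (this is not part of the patch), the sequence above
can be modelled in scalar C.  The names popcount_model, byte_popcount and
popcount_builtin are hypothetical and chosen for this sketch.  It assumes
that the first fmov clears the upper bits of the vector register, that cnt
counts bits per byte lane, and that addv sums the eight lanes into a value
whose upper bits are zero, which is why no separate zero-extension is
needed before the final fmov.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: population count of one byte, i.e. what CNT
   computes for a single 8-bit lane.  */
static unsigned
byte_popcount (uint8_t b)
{
  unsigned n = 0;
  while (b)
    {
      b &= (uint8_t) (b - 1);	/* Clear the lowest set bit.  */
      n++;
    }
  return n;
}

/* Scalar model of the fmov/cnt/addv/fmov sequence for a 32-bit input.  */
unsigned
popcount_model (uint32_t x)
{
  /* fmov s0, w0: the value lands in the low 32 bits; the model assumes
     the remaining lanes of the 64-bit vector are zero.  */
  uint64_t v = x;
  unsigned sum = 0;

  /* cnt v0.8b, v0.8b followed by addv b0, v0.8b: count the bits in each
     of the eight byte lanes, then add the lane counts together.  */
  for (int lane = 0; lane < 8; lane++)
    sum += byte_popcount ((uint8_t) (v >> (8 * lane)));

  /* The sum is at most 32, so it fits in the 8-bit ADDV result and the
     value read back by fmov w0, s0 is already zero-extended.  */
  assert (sum <= 32);
  return sum;
}

/* The builtin that the commit message refers to, for comparison.  */
unsigned
popcount_builtin (uint32_t x)
{
  return (unsigned) __builtin_popcount (x);
}
```

Both functions should agree for every 32-bit input; the model only
describes the arithmetic, not the instruction selection done by the
compiler.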

gcc/
* config/aarch64/aarch64-simd.md
(aarch64_zero_extend<GPI:mode>_reduc_plus_<VDQV_E:mode>): New pattern.
* config/aarch64/aarch64.md (popcount<mode>2): Use it instead of
generating separate ADDV and zero_extend patterns.
* config/aarch64/iterators.md (VDQV_E): New iterator.

testsuite/
* gcc.target/aarch64/popcnt2.c: New test.
gcc/ChangeLog
gcc/config/aarch64/aarch64-simd.md
gcc/config/aarch64/aarch64.md
gcc/config/aarch64/iterators.md
gcc/testsuite/ChangeLog
gcc/testsuite/gcc.target/aarch64/popcnt2.c [new file with mode: 0644]