Optimize Neon reductions in sum_neon.h using ADDV instruction
author Jonathan Wright <jonathan.wright@arm.com>
Fri, 7 May 2021 12:25:51 +0000 (13:25 +0100)
committer Jonathan Wright <jonathan.wright@arm.com>
Sun, 9 May 2021 19:12:48 +0000 (20:12 +0100)
commit 0f563e5fadbccb10fabd6ac80c256a4321401e22
tree 8a888aa303277152fbe0fd8f5eb5c6c07722addd
parent f7364c05748b70a1e0fd57849665a9d9f0990803

Use the AArch64-only ADDV and ADDLV instructions to accelerate
reductions that add across a Neon vector in sum_neon.h. This commit
also refactors the inline functions to return a scalar instead of a
vector, which allows the surrounding code at each call site to be
optimized.

Bug: b/181236880
Change-Id: Ieed2a2dd3c74f8a52957bf404141ffc044bd5d79
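
The kind of reduction the commit message describes can be sketched as follows. This is a minimal illustration, not the code in sum_neon.h: the helper names sum4_u32 and sum16_u8, the pointer-based signatures, and the scalar branch for non-Arm hosts are assumptions made so that the example builds anywhere. The intrinsics used (vaddvq_u32, vaddlvq_u8, vpaddlq_*, vadd_u64) are standard arm_neon.h intrinsics; the first two are available on AArch64 only.

```c
#include <stdint.h>

#if defined(__aarch64__) || defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#else
#define SKETCH_HAVE_NEON 0
#endif

/* Hypothetical helper: adds the four 32-bit lanes loaded from p and
 * returns the result as a scalar (wrapping modulo 2^32). */
static inline uint32_t sum4_u32(const uint32_t *p) {
#if defined(__aarch64__)
  /* AArch64: ADDV adds across all lanes in a single instruction. */
  return vaddvq_u32(vld1q_u32(p));
#elif SKETCH_HAVE_NEON
  /* Armv7 Neon: pairwise widening add, then add the two 64-bit halves
   * and keep the low 32 bits. */
  const uint64x2_t b = vpaddlq_u32(vld1q_u32(p));
  const uint64x1_t c = vadd_u64(vget_low_u64(b), vget_high_u64(b));
  return vget_lane_u32(vreinterpret_u32_u64(c), 0);
#else
  /* Non-Arm host: scalar stand-in so the sketch still compiles. */
  return p[0] + p[1] + p[2] + p[3];
#endif
}

/* Hypothetical helper: adds sixteen 8-bit lanes loaded from p. The sum is
 * at most 16 * 255 = 4080, so it fits in 16 bits. */
static inline uint32_t sum16_u8(const uint8_t *p) {
#if defined(__aarch64__)
  /* AArch64: ADDLV widens each lane and adds across the vector. */
  return vaddlvq_u8(vld1q_u8(p));
#elif SKETCH_HAVE_NEON
  /* Armv7 Neon: a chain of pairwise widening adds, 8 -> 16 -> 32 -> 64
   * bits, followed by a final add of the two halves. */
  const uint16x8_t a = vpaddlq_u8(vld1q_u8(p));
  const uint32x4_t b = vpaddlq_u16(a);
  const uint64x2_t c = vpaddlq_u32(b);
  const uint64x1_t d = vadd_u64(vget_low_u64(c), vget_high_u64(c));
  return vget_lane_u32(vreinterpret_u32_u64(d), 0);
#else
  /* Non-Arm host: scalar stand-in so the sketch still compiles. */
  uint32_t s = 0;
  for (int i = 0; i < 16; ++i) s += p[i];
  return s;
#endif
}
```

Because each helper returns a plain integer rather than a vector, a caller can feed the result straight into scalar arithmetic (a shift or a rounding divide, for example) without first extracting a lane, which is presumably the call-site benefit the message refers to.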
vpx_dsp/arm/avg_neon.c
vpx_dsp/arm/fdct_partial_neon.c
vpx_dsp/arm/sad_neon.c
vpx_dsp/arm/sum_neon.h
vpx_dsp/arm/variance_neon.c