Implement horizontal convolution using Neon SDOT instruction
author Jonathan Wright <jonathan.wright@arm.com>
Sun, 11 Apr 2021 14:20:36 +0000 (15:20 +0100)
committer Jonathan Wright <jonathan.wright@arm.com>
Wed, 5 May 2021 15:07:18 +0000 (16:07 +0100)
commit c1f77a3689a6cf5e95e1c1ae35d76f4f171f5ef3
tree 579a76c354947b65868ff1b23a93d28fc0ac2e4e
parent 2eb934d9c1fb4a460e3f03c8578b7b4f4f195784
Implement horizontal convolution using Neon SDOT instruction

Add an alternative AArch64 implementation of vpx_convolve8_horiz_neon
for targets that implement the SDOT (signed dot product) instruction,
which is optional from Armv8.2-A and mandatory from Armv8.4-A.
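
As a rough illustration of why SDOT suits an 8-tap filter, the sketch
below models the arithmetic in portable scalar C. It is not the code
from this change. The helper names (sdot_lane, convolve8_px_sdot_model,
convolve8_px_ref) are hypothetical, and it assumes every filter tap
fits in a signed 8-bit value and that the result is rounded by a 7-bit
shift. SDOT multiplies signed 8-bit values, so the unsigned samples are
first shifted down by 128 and the shift is undone by pre-loading the
accumulator with 128 times the sum of the taps.

```c
#include <stdint.h>

/* Scalar model of one 32-bit lane of SDOT: four signed 8-bit products
 * are summed and added to a 32-bit accumulator. */
static int32_t sdot_lane(int32_t acc, const int8_t a[4], const int8_t b[4]) {
  for (int i = 0; i < 4; ++i) acc += (int32_t)a[i] * (int32_t)b[i];
  return acc;
}

/* Round by 7 bits and clamp to [0, 255]. The early return keeps the
 * shift away from negative values. */
static uint8_t round_shift7_clamp(int32_t sum) {
  int32_t r = sum + 64;
  if (r < 0) return 0;
  r >>= 7;
  return (uint8_t)(r > 255 ? 255 : r);
}

/* Reference: plain 8-tap filter on unsigned samples. */
static uint8_t convolve8_px_ref(const uint8_t *src, const int16_t *filter) {
  int32_t sum = 0;
  for (int k = 0; k < 8; ++k) sum += (int32_t)src[k] * filter[k];
  return round_shift7_clamp(sum);
}

/* SDOT-style model: two 4-element signed dot products per output pixel.
 * Assumption: every tap fits in int8_t. */
static uint8_t convolve8_px_sdot_model(const uint8_t *src,
                                       const int16_t *filter) {
  int8_t s[8], f[8];
  int32_t correction = 0;
  for (int k = 0; k < 8; ++k) {
    s[k] = (int8_t)(src[k] - 128); /* map 0..255 to -128..127 */
    f[k] = (int8_t)filter[k];
    correction += 128 * filter[k]; /* undoes the -128 sample offset */
  }
  int32_t sum = sdot_lane(correction, s, f);
  sum = sdot_lane(sum, s + 4, f + 4);
  return round_shift7_clamp(sum);
}
```

With illustrative taps such as {-1, 3, -10, 35, 114, -13, 5, -5}, which
sum to 128, both functions return the same pixel for any 8 input
samples, because the correction term cancels the offset exactly.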

The existing MLA-based implementation of vpx_convolve8_horiz_neon is
retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.
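
The compile-time selection described above can be sketched as follows.
This is a minimal illustration, not the code from this change: dot16 and
DOT16_USES_SDOT are hypothetical names, and the fallback branch is a
plain scalar loop standing in for the MLA-based Neon path.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>

/* SDOT path: vdotq_s32 computes four 4-element signed dot products,
 * one per 32-bit lane; vaddvq_s32 then sums the four lanes. */
static int32_t dot16(const int8_t *a, const int8_t *b) {
  int32x4_t acc = vdotq_s32(vdupq_n_s32(0), vld1q_s8(a), vld1q_s8(b));
  return vaddvq_s32(acc);
}
#define DOT16_USES_SDOT 1

#else

/* Fallback for targets without SDOT (or AArch32 builds). A scalar loop
 * is used here only to keep the sketch portable. */
static int32_t dot16(const int8_t *a, const int8_t *b) {
  int32_t acc = 0;
  for (int i = 0; i < 16; ++i) acc += (int32_t)a[i] * (int32_t)b[i];
  return acc;
}
#define DOT16_USES_SDOT 0

#endif
```

Compilers typically define __ARM_FEATURE_DOTPROD when the target
architecture flags enable the dot product extension, for example
-march=armv8.2-a+dotprod with GCC or Clang. Because the choice is made
by the preprocessor, a given build contains only one of the two paths.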

Co-authored-by: James Greenhalgh <james.greenhalgh@arm.com>

Change-Id: I5337286b0f5f2775ad7cdbc0174785ae694363cc
vpx_dsp/arm/mem_neon.h
vpx_dsp/arm/vpx_convolve8_neon.c
vpx_dsp/arm/vpx_convolve8_neon.h