Add SSE2 support for hbd 4-tap interpolation filter.
author chiyotsai <chiyotsai@google.com>
Mon, 29 Oct 2018 23:12:05 +0000 (16:12 -0700)
committer chiyotsai <chiyotsai@google.com>
Tue, 30 Oct 2018 19:12:28 +0000 (12:12 -0700)
commit 5a51d961f2432e13ac2dc97ab75f5e56cab6c6ae
tree b1a63745668498c7f418b18fc128142225254c81
parent 8886fe7e310db25af1ef04296fa9cd3c34acd804

Unit test performance on bitdepth 10:
    | 4X4 | 8X8 |16X16|64X64|
 2D |1.582|1.461|1.425|1.572|
HORZ|1.643|1.247|1.346|1.345|
VERT|1.378|1.695|2.020|1.763|

Unit test performance on bitdepth 12:
    | 4X4 | 8X8 |16X16|64X64|
 2D |1.578|1.409|1.426|1.497|
HORZ|1.625|1.153|1.323|1.259|
VERT|1.392|1.707|2.030|1.787|

Change-Id: I6df85330ac33fcb17d46e4302b41415dda1219f5
vpx_dsp/x86/convolve_sse2.h
vpx_dsp/x86/vpx_asm_stubs.c
vpx_dsp/x86/vpx_subpixel_4t_intrin_sse2.c
vpx_dsp/x86/vpx_subpixel_8t_intrin_avx2.c
vpx_dsp/x86/vpx_subpixel_8t_intrin_ssse3.c