SSSE3 convolution optimization
author levytamar82 <levytamar82@gmail.com>
Thu, 21 Nov 2013 22:49:29 +0000 (15:49 -0700)
committer levytamar82 <levytamar82@gmail.com>
Thu, 9 Jan 2014 19:27:51 +0000 (12:27 -0700)
commit 511d218c60b9b6c1ab9383db746815e907af0359
tree a7cbf64477adac2433384293d88d08f27c373fec
parent a622ed554f7072268e4c8d0b8f26d2e8865c2b3b
SSSE3 convolution optimization

Optimizing all SSSE3 assembly for convolution:
1. vp9_filter_block1d4_h8_sse2
2. vp9_filter_block1d8_h8_sse2
3. vp9_filter_block1d16_h8_sse2
4. vp9_filter_block1d4_v8_sse2
5. vp9_filter_block1d8_v8_sse2
6. vp9_filter_block1d16_v8_sse2
The optimizations include:
- Processing 2x8 elements in one 128-bit register instead of
  8 elements in one 128-bit register.
- Removing unnecessary loads.
This optimization gives a user-level gain of 2.4% for 480p input
and 1.6% for 720p input.
This optimization is done only for 64-bit.
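
The 2x8 packing mentioned above can be illustrated with a small portable sketch. This is not the code from this commit: it is a scalar model, under the assumption that the packing refers to filling a 128-bit register with 16 bytes (8 output positions x 2 adjacent pixels) so that one multiply-add in the style of pmaddubsw (_mm_maddubs_epi16) handles two filter taps for eight outputs. The helper names (sat16, maddubs16, filter8_h8_packed, filter8_h8_ref) are hypothetical, and the 7-bit rounding assumes filter taps that sum to 128.

```c
#include <stdint.h>

/* Saturate a 32-bit value to the signed 16-bit range. */
static int16_t sat16(int32_t v) {
  return (int16_t)(v > 32767 ? 32767 : (v < -32768 ? -32768 : v));
}

/* Scalar model of pmaddubsw: 16 unsigned bytes times 16 signed bytes,
 * adjacent products added into 8 saturated signed 16-bit sums. */
static void maddubs16(const uint8_t a[16], const int8_t b[16],
                      int16_t out[8]) {
  for (int i = 0; i < 8; ++i)
    out[i] = sat16(a[2 * i] * b[2 * i] + a[2 * i + 1] * b[2 * i + 1]);
}

/* Round with 7-bit precision and clamp to 0..255. */
static uint8_t round_clamp(int32_t acc) {
  int32_t v = acc + 64;
  v = v < 0 ? 0 : v >> 7;
  return (uint8_t)(v > 255 ? 255 : v);
}

/* Hypothetical packed version: 8 outputs of an 8-tap horizontal filter.
 * Reads src[0..14]. Each of the 4 steps fills one 16-byte "register"
 * with 8 pixel pairs and applies one tap pair to all 8 outputs. */
void filter8_h8_packed(const uint8_t *src, const int8_t taps[8],
                       uint8_t dst[8]) {
  int32_t acc[8] = {0}; /* real SIMD code would keep 16-bit lanes */
  for (int p = 0; p < 4; ++p) {
    uint8_t px[16];
    int8_t k[16];
    int16_t part[8];
    for (int x = 0; x < 8; ++x) {
      px[2 * x] = src[x + 2 * p];
      px[2 * x + 1] = src[x + 2 * p + 1];
      k[2 * x] = taps[2 * p];
      k[2 * x + 1] = taps[2 * p + 1];
    }
    maddubs16(px, k, part);
    for (int x = 0; x < 8; ++x) acc[x] += part[x];
  }
  for (int x = 0; x < 8; ++x) dst[x] = round_clamp(acc[x]);
}

/* Straightforward reference: one tap and one pixel at a time. */
void filter8_h8_ref(const uint8_t *src, const int8_t taps[8],
                    uint8_t dst[8]) {
  for (int x = 0; x < 8; ++x) {
    int32_t acc = 0;
    for (int t = 0; t < 8; ++t) acc += src[x + t] * taps[t];
    dst[x] = round_clamp(acc);
  }
}
```

The two versions agree as long as no tap-pair sum saturates the 16-bit intermediate, which holds for taps such as {0, 1, -5, 126, 8, -3, 1, 0}. A real SSSE3 implementation would build the pixel pairs with byte shuffles rather than a scalar loop.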

Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969
vp9/common/x86/vp9_asm_stubs.c
vp9/common/x86/vp9_subpixel_8t_intrin_ssse3.c [new file with mode: 0644]
vp9/vp9_common.mk