SAD32xh and SAD64xh for AVX2
author levytamar82 <tamar.levy@intel.com>
Thu, 2 Oct 2014 06:47:31 +0000 (23:47 -0700)
committer levytamar82 <tamar.levy@intel.com>
Sun, 19 Oct 2014 20:59:10 +0000 (13:59 -0700)
commit 7045aec00a94bd49ed979b8dbd73bb81d58670dc
tree 9db7c0d26d8eaa4d9e462961545535fd219d384e
parent feee7d97b797dff46e9eaef0871098dee463d508

All SAD functions that process 32 or more consecutive elements are
optimized for AVX2:
vp9_sad64x64
vp9_sad64x32
vp9_sad32x64
vp9_sad32x32
vp9_sad32x16
vp9_sad64x64_avg
vp9_sad64x32_avg
vp9_sad32x64_avg
vp9_sad32x32_avg
vp9_sad32x16_avg
The functions that appeared as hotspots are vp9_sad32x32 and vp9_sad64x64.
vp9_sad32x32 was optimized by 68% and vp9_sad64x64 was optimized by 90%;
together they gave an overall ~2.3% user-level gain.
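
For readers unfamiliar with the technique, the following is a minimal sketch of how a 32-wide SAD can be computed with AVX2. It is not the code added in vp9_sad_intrin_avx2.c. The names sad_ref, sad32xh_avx2_sketch and sad32xh are hypothetical, and the sketch assumes GCC or Clang on x86 (it uses the target attribute and __builtin_cpu_supports). On other compilers or architectures it falls back to the scalar loop.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar reference: sum of absolute differences over a width x height block.
 * Hypothetical helper, used here only to check the vector version. */
static unsigned int sad_ref(const uint8_t *src, int src_stride,
                            const uint8_t *ref, int ref_stride,
                            int width, int height) {
  unsigned int sad = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x)
      sad += (unsigned int)abs(src[x] - ref[x]);
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_AVX2_SKETCH 1

/* One 256-bit load covers a full 32-byte row. _mm256_sad_epu8 yields four
 * partial sums (one per 64-bit lane), which are accumulated across rows. */
__attribute__((target("avx2")))
static unsigned int sad32xh_avx2_sketch(const uint8_t *src, int src_stride,
                                        const uint8_t *ref, int ref_stride,
                                        int height) {
  __m256i acc = _mm256_setzero_si256();
  for (int y = 0; y < height; ++y) {
    const __m256i s = _mm256_loadu_si256((const __m256i *)src);
    const __m256i r = _mm256_loadu_si256((const __m256i *)ref);
    acc = _mm256_add_epi32(acc, _mm256_sad_epu8(s, r));
    src += src_stride;
    ref += ref_stride;
  }
  /* Fold the four partial sums into one 32-bit result. */
  __m128i sum = _mm_add_epi32(_mm256_castsi256_si128(acc),
                              _mm256_extracti128_si256(acc, 1));
  sum = _mm_add_epi32(sum, _mm_srli_si128(sum, 8));
  return (unsigned int)_mm_cvtsi128_si32(sum);
}
#endif

/* Hypothetical dispatcher: use AVX2 when the CPU reports it, else scalar. */
static unsigned int sad32xh(const uint8_t *src, int src_stride,
                            const uint8_t *ref, int ref_stride, int height) {
  assert(height > 0);
#ifdef HAVE_AVX2_SKETCH
  if (__builtin_cpu_supports("avx2"))
    return sad32xh_avx2_sketch(src, src_stride, ref, ref_stride, height);
#endif
  return sad_ref(src, src_stride, ref, ref_stride, 32, height);
}
```

A 64-wide variant would presumably follow the same pattern with two loads per row, and an _avg variant would first average the reference row with a second predictor (for example with _mm256_avg_epu8) before taking the SAD. Both are assumptions about the general approach, not a description of this commit's code.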

Change-Id: Iccf86b375a2b54c5fbbe685902ead0c9a561b9fd
test/sad_test.cc
vp9/common/vp9_rtcd_defs.pl
vp9/encoder/x86/vp9_sad_intrin_avx2.c [new file with mode: 0644]
vp9/vp9cx.mk