Speed up h_predictor_8x8
author Jian Zhou <zhoujian@google.com>
Wed, 25 Nov 2015 20:28:39 +0000 (12:28 -0800)
committer Jian Zhou <zhoujian@google.com>
Fri, 4 Dec 2015 19:36:44 +0000 (11:36 -0800)
commit da3f08fac3f35a4d0a6f2d170ba5a27e9719eb73
tree 1b1e683c6c02461d4b4c1deb6887954ca53116f9
parent aa2764abdd6af72730a130975c2b86d49e2ced70

Relocate the function from SSSE3 to SSE2, unroll the loop from 4
iterations to 2, and reduce memory accesses to the left column.
Speeds up the function by more than 20% in ./test_intra_pred_speed.

Change-Id: Ib9f1846819783b6e05e2a310c930eb844b2b4d2e
test/test_intra_pred_speed.cc
vpx_dsp/vpx_dsp_rtcd_defs.pl
vpx_dsp/x86/intrapred_sse2.asm
vpx_dsp/x86/intrapred_ssse3.asm
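
The commit message above describes three changes: using only SSE2 instructions, running the loop twice instead of four times, and reading the left column less often. The sketch below illustrates that idea with C intrinsics. It is not the code from vpx_dsp/x86/intrapred_sse2.asm, and the function names are hypothetical. It assumes an x86 target with SSE2 enabled and that a horizontal predictor fills each output row with the corresponding left-neighbor pixel.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference (hypothetical name): row y is filled with left[y]. */
void h_pred_8x8_ref(uint8_t *dst, ptrdiff_t stride, const uint8_t *left) {
  for (int y = 0; y < 8; ++y) memset(dst + y * stride, left[y], 8);
}

/* SSE2 sketch (hypothetical name): one 8-byte load of `left`,
 * then two loop iterations that each write four rows. */
void h_pred_8x8_sse2_sketch(uint8_t *dst, ptrdiff_t stride,
                            const uint8_t *left) {
  /* Read all eight left pixels with a single memory access. */
  const __m128i l = _mm_loadl_epi64((const __m128i *)left);
  /* Duplicate each byte so that 16-bit word i holds left[i] twice. */
  __m128i d = _mm_unpacklo_epi8(l, l);

  for (int i = 0; i < 2; ++i) {
    /* Broadcast words 0..3 across the low 64 bits, one row each. */
    const __m128i r0 = _mm_shufflelo_epi16(d, 0x00);
    const __m128i r1 = _mm_shufflelo_epi16(d, 0x55);
    const __m128i r2 = _mm_shufflelo_epi16(d, 0xaa);
    const __m128i r3 = _mm_shufflelo_epi16(d, 0xff);
    _mm_storel_epi64((__m128i *)(dst + 0 * stride), r0);
    _mm_storel_epi64((__m128i *)(dst + 1 * stride), r1);
    _mm_storel_epi64((__m128i *)(dst + 2 * stride), r2);
    _mm_storel_epi64((__m128i *)(dst + 3 * stride), r3);
    dst += 4 * stride;
    /* Move words 4..7 into the low half for the second iteration. */
    d = _mm_srli_si128(d, 8);
  }
}

/* Compares the sketch against the scalar reference on one input. */
int h_pred_8x8_sketch_matches_ref(void) {
  const uint8_t left[8] = {1, 20, 33, 47, 58, 69, 200, 255};
  uint8_t a[8 * 16], b[8 * 16];
  memset(a, 0, sizeof(a));
  memset(b, 0, sizeof(b));
  h_pred_8x8_ref(a, 16, left);
  h_pred_8x8_sse2_sketch(b, 16, left);
  const int same = memcmp(a, b, sizeof(a)) == 0;
  assert(same); /* aborts in debug builds if the two versions diverge */
  return same;
}
```

The byte broadcast here uses only `_mm_unpacklo_epi8` and `_mm_shufflelo_epi16`, both available in SSE2, rather than the SSSE3 byte shuffle (`pshufb`). Whether the real assembly uses the same instruction sequence is an assumption of this sketch.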