Optimize inv 16x16 DCT with 10 non-zero coeffs - P2
author Jingning Han <jingning@google.com>
Thu, 9 Jan 2014 20:43:40 +0000 (12:43 -0800)
committer Jingning Han <jingning@google.com>
Thu, 9 Jan 2014 20:46:09 +0000 (12:46 -0800)
commit af31b27aae70d18bf5d52307bde1ab356b7c42b5
tree 19e97bba9b1b9723ac39ae96afbc5fd4e03ce388
parent ba6ab46cdcb1b3ae977984c9e18b122c72370eb6
Optimize inv 16x16 DCT with 10 non-zero coeffs - P2

This commit further optimizes the SSE2 operations in the second 1-D
pass of the inverse 16x16 DCT for blocks with (<10) non-zero
coefficients. The average runtime of this module drops from 779 to
725 cycles.
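
As a rough illustration of why sparse input lets a 1-D pass do less work (a scalar sketch under stated assumptions, not the SSE2 code in vp9_idct_intrin_sse2.c): when one input of a butterfly rotation is known to be zero, the rotation needs two multiplies instead of four. The names round_shift, rotate_full, rotate_b_zero and sparse_matches_full are hypothetical, and the 14-bit fixed-point precision is an assumption made for this example.

```c
#include <stdint.h>

#define DCT_CONST_BITS 14 /* fixed-point precision assumed for this sketch */

/* Round a fixed-point product and shift it back to integer range. */
static int32_t round_shift(int64_t x) {
  return (int32_t)((x + ((int64_t)1 << (DCT_CONST_BITS - 1))) >>
                   DCT_CONST_BITS);
}

/* General rotation with two inputs: four multiplies. */
static void rotate_full(int32_t a, int32_t b, int32_t c0, int32_t c1,
                        int32_t *out0, int32_t *out1) {
  *out0 = round_shift((int64_t)a * c0 - (int64_t)b * c1);
  *out1 = round_shift((int64_t)a * c1 + (int64_t)b * c0);
}

/* Same rotation when b is known to be zero: two multiplies. */
static void rotate_b_zero(int32_t a, int32_t c0, int32_t c1,
                          int32_t *out0, int32_t *out1) {
  *out0 = round_shift((int64_t)a * c0);
  *out1 = round_shift((int64_t)a * c1);
}

/* Returns 1 if the reduced form matches the general form for b = 0. */
static int sparse_matches_full(int32_t a, int32_t c0, int32_t c1) {
  int32_t f0, f1, s0, s1;
  rotate_full(a, 0, c0, c1, &f0, &f1);
  rotate_b_zero(a, c0, c1, &s0, &s1);
  return f0 == s0 && f1 == s1;
}
```

For example, sparse_matches_full(1000, 15137, 6270) returns 1; the two constants are only illustrative 14-bit cosine values. A vectorized version would apply the same idea across several columns at once.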

Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f
vp9/common/x86/vp9_idct_intrin_sse2.c