More optimizations for cost_coeffs().
author Ronald S. Bultje <rbultje@google.com>
Mon, 22 Jul 2013 23:09:09 +0000 (16:09 -0700)
committer Ronald S. Bultje <rbultje@google.com>
Mon, 22 Jul 2013 23:09:09 +0000 (16:09 -0700)
commit e20fcd9585295d3bae106548d79d501ac8668a6b
tree 7c61a984e73a1f05fbe359f308ce91b481418fcb
parent 3798d7a641d39a1afe6767b2676bedec1f5a96a2

4x4:    163 ->  123 cycles (33% faster)
8x8:    491 ->  399 cycles (23% faster)
16x16: 1889 -> 1763 cycles (7% faster)
32x32: 8311 -> 8180 cycles (1.6% faster)

Overall encoding time for the first 50 frames of bus (speed 0) at 1500 kbps
goes from 1 min 4.33 s to 1 min 3.00 s, i.e. 2.11% faster.
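
The "% faster" figures above appear to be throughput ratios, i.e. (old / new - 1) * 100, rather than the fraction of time saved; that reading reproduces every number quoted (for example 163 / 123 gives about 32.5%, rounded to 33%). A minimal sketch of that arithmetic follows. The helper names speedup_percent() and to_seconds() are hypothetical and are not part of libvpx.

```c
#include <stdio.h>

/* Hypothetical helper: percent speedup as a throughput ratio,
 * (before / after - 1) * 100. Both arguments use the same unit
 * (cycles or seconds). */
static double speedup_percent(double before, double after) {
  return (before / after - 1.0) * 100.0;
}

/* Hypothetical helper: convert a "1min4.33"-style timing to seconds. */
static double to_seconds(int minutes, double seconds) {
  return minutes * 60.0 + seconds;
}

/* Print one row in the same shape as the table in the commit message. */
static void print_speedup(const char *label, double before, double after) {
  printf("%s: %.2f -> %.2f (%.2f%% faster)\n", label, before, after,
         speedup_percent(before, after));
}
```

Calling print_speedup("4x4", 163, 123) or print_speedup("encode", to_seconds(1, 4.33), to_seconds(1, 3.00)) would report the unrounded percentages behind the 4x4 and whole-encode figures.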

Change-Id: Ib52d1dbb5649b14de769d3e7a74af67440b5284f
vp9/common/vp9_entropy.h
vp9/encoder/vp9_rdopt.c