review.tizen.org Git - profile/ivi/libvpx.git/commit

author	Timothy B. Terriberry <tterribe@xiph.org>
	Tue, 28 Sep 2010 00:18:18 +0000 (17:18 -0700)
committer	Timothy B. Terriberry <tterribe@xiph.org>
	Tue, 28 Sep 2010 01:25:45 +0000 (18:25 -0700)
commit	18dc92fd664357db31d7ef43337e2dee3a0f5062
tree	c120f473db2248c6a950e8223c2d02024ea29bc8
parent	305be4e4179214c58796de91e86badadbca29451

Add 4-tap version of 2nd-pass ARMv6 MC filter.

The existing code applied a 6-tap filter with 0's on either end.
We're already paying the branch penalty to avoid computing the two
extra columns needed as input to this filter.
We might as well save time computing the filter as well.
This reduces the inner loop from 21 instructions to 16, the number
of loads per iteration from 4 to 1, and the number of multiplies
from 7 to 4.
The gain in overall decoding performance, however, is small (less
than 1%).
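
To illustrate why dropping the outer taps is safe, here is a minimal C sketch
(not the ARMv6 assembly from this commit). The function names, the 16-bit
intermediate type, the rounding constant, and the tap values are assumptions
chosen for illustration. When the two outer taps are 0, the 6-tap sum and the
4-tap sum are identical, but the 4-tap version never reads the two outer
samples.

```c
#include <stdint.h>

/* Hypothetical helper: round, scale by 1/128, and clamp to 8 bits. */
static uint8_t round_and_clamp(int sum)
{
    sum += 64;                 /* rounding term for a 7-bit shift */
    if (sum < 0)
        return 0;              /* negative results clamp to 0 */
    sum >>= 7;
    return (uint8_t)(sum > 255 ? 255 : sum);
}

/* 6-tap filter: taps[2] applies to src[0], so it reads src[-2..3]. */
static uint8_t filter6(const int16_t *src, const int taps[6])
{
    int sum = 0;
    for (int i = 0; i < 6; i++)
        sum += taps[i] * src[i - 2];
    return round_and_clamp(sum);
}

/* 4-tap filter: taps[1] applies to src[0], so it reads only src[-1..2]. */
static uint8_t filter4(const int16_t *src, const int taps[4])
{
    int sum = 0;
    for (int i = 0; i < 4; i++)
        sum += taps[i] * src[i - 1];
    return round_and_clamp(sum);
}
```

Calling `filter6` with taps `{0, a, b, c, d, 0}` and `filter4` with taps
`{a, b, c, d}` on the same center sample gives the same result, whatever the
two outer samples contain.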

This change also means we are now valgrind-clean on ARMv6, which is
its real purpose.
The errors reported here were valgrind's fault (it does not detect
that 0 times an uninitialized value is initialized), but Julian
Seward says it would slow down valgrind considerably to make such
checks.
Speeding up libvpx instead, even by a small amount, seems a much
better idea, if only to enable proper valgrind checking of the
rest of the codec.

Change-Id: Ifb376ea195e086b60f61daf1097d8910c4d8ff16

vp8/common/arm/armv6/filter_v6.asm
vp8/common/arm/filter_arm.c