review.tizen.org Git - profile/ivi/libvpx.git/log


Yunqing Wang [Thu, 14 Oct 2010 15:06:37 +0000 (11:06 -0400)]

Improve bounds checking in vp8_diamond_search_sadx4()

To know whether all 4/8 neighbor points are within the bounds,
4 bounds checks are enough, instead of checking 4 bounds for
each point (16/32 checks). This improvement reduces the cost of
vp8_diamond_search_sadx4() by 30% and gives the encoder a 1.5%
performance gain (test options: 1 pass, good, speed=4).

Change-Id: Ie8da29d18a6ecfc9829e74ac02f6fa70e042331a
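The idea can be sketched in C as follows. This is a minimal illustration, not the libvpx code: the mv_bounds struct, the function names, and the single `step` parameter are assumptions made for the example.

```c
#include <stdbool.h>

/* Hypothetical search window in full-pixel units (names are illustrative). */
typedef struct {
    int col_min, col_max, row_min, row_max;
} mv_bounds;

/* Per-point test: 4 comparisons for every candidate point. */
static bool point_in_bounds(const mv_bounds *b, int row, int col) {
    return col >= b->col_min && col <= b->col_max &&
           row >= b->row_min && row <= b->row_max;
}

/*
 * Combined test: if the center can move `step` in every direction and
 * stay inside the window, every neighbor at most `step` away is inside
 * too, so 4 comparisons cover the whole group of candidates.
 */
static bool all_neighbors_in_bounds(const mv_bounds *b, int row, int col,
                                    int step) {
    return col - step >= b->col_min && col + step <= b->col_max &&
           row - step >= b->row_min && row + step <= b->row_max;
}
```

When the combined test passes, the per-point checks can be skipped for the whole group; when it fails, the search can fall back to point_in_bounds() for each candidate.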


Fritz Koenig [Tue, 12 Oct 2010 16:42:03 +0000 (09:42 -0700)]

GCC inline restrictions were not adequate.

=r was not restrictive enough and the compiler was not returning
ebx correctly.

Change-Id: I7606e384067bd5fb69189802f1ff64ccc5aa02d6


John Koleszar [Thu, 7 Oct 2010 05:39:16 +0000 (22:39 -0700)]

Centralize mb skip state calculation

This patch moves the scattered updates to the mb skip state
(mode_info_context->mbmi.mb_skip_coeff) to vp8_tokenize_mb. Recent
changes to the quantizer exposed a bug where if a macroblock
could be coded as a skip but isn't, the encoder would run the
loopfilter but the decoder wouldn't, causing a reference buffer
mismatch.

The loopfilter is controlled by a flag called dc_diff. The decoder
looks at the number of decoded coefficients when setting this flag.
The encoder sets this flag based on the skip state, since any
skippable macroblock should be transmitted as a skip. The coefficient
optimization pass (vp8_optimize_b()) could change the coefficients
such that a block that was not a skip becomes one. The encoder was
not updating the skip state in this situation for intra coded blocks.

The underlying issue predates it, but this bug was recently triggered
by enabling trellis quantization on the Y2 block in commit dcd29e3,
and by changing the quantizer range control in commit 305be4e.

Change-Id: I5cce5da0dbc2d22f7d79ee48149f01e868a64802
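A simplified sketch of deriving the skip state in one place, from the final coefficients, might look like this. The mb_state struct and update_skip_state() are hypothetical names, and the sketch ignores details of the real vp8_tokenize_mb (for example, any special handling of blocks whose DC lives in Y2).

```c
#include <stdbool.h>

#define BLOCKS_PER_MB 25 /* VP8: 16 luma + 8 chroma + 1 Y2 block */

/* Hypothetical per-macroblock state; field names are illustrative. */
typedef struct {
    int eobs[BLOCKS_PER_MB]; /* end-of-block position after optimization */
    bool mb_skip_coeff;
} mb_state;

/*
 * Recompute the skip flag from the coefficients that will actually be
 * written. Running this after any coefficient optimization pass keeps
 * the flag in sync with what the decoder will see.
 */
static void update_skip_state(mb_state *mb) {
    bool skip = true;
    for (int i = 0; i < BLOCKS_PER_MB; i++) {
        if (mb->eobs[i] > 0)
            skip = false;
    }
    mb->mb_skip_coeff = skip;
}
```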


John Koleszar [Tue, 12 Oct 2010 12:44:20 +0000 (05:44 -0700)]

Merge "Add const qualifiers to variance/SAD functions."


Timothy B. Terriberry [Mon, 11 Oct 2010 21:01:23 +0000 (14:01 -0700)]

Add const qualifiers to variance/SAD functions.

These functions should never change their input, and there's no
reason not to declare that.
This allows them to be passed static const data.

Change-Id: Ia49fe4b01e80e9afcb24b4844817694d4da5995c
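As an illustration of what the const qualifiers buy, here is a small SAD routine written for this note (sad_block and flat_block are not libvpx symbols):

```c
#include <stdlib.h>

/* Sum of absolute differences over a w x h block; inputs are read-only. */
static unsigned int sad_block(const unsigned char *src, int src_stride,
                              const unsigned char *ref, int ref_stride,
                              int w, int h) {
    unsigned int sad = 0;
    for (int r = 0; r < h; r++) {
        for (int c = 0; c < w; c++)
            sad += (unsigned int)abs(src[c] - ref[c]);
        src += src_stride;
        ref += ref_stride;
    }
    return sad;
}

/* A 4x4 block of zeros in read-only storage; it can be passed without a cast. */
static const unsigned char flat_block[16] = { 0 };
```

Without the const qualifiers on the parameters, passing flat_block would need a cast that discards const.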


John Koleszar [Tue, 12 Oct 2010 12:34:30 +0000 (05:34 -0700)]

Merge "Move vp8_strict_quantize_b inside EXACT_QUANT #define."


John Koleszar [Tue, 12 Oct 2010 12:33:22 +0000 (05:33 -0700)]

Merge "Remove INTRARDOPT #define and intra_rd_opt option."


Timothy B. Terriberry [Mon, 11 Oct 2010 20:49:52 +0000 (13:49 -0700)]

Move vp8_strict_quantize_b inside EXACT_QUANT #define.

There is currently no inexact version of this function, so do not
even compile it without EXACT_QUANT.
This will prevent someone from inadvertently trying to use it without
the proper EXACT_QUANT setup.

Change-Id: Ia13491e0128afb281c05c9222ee5987101e4010d


Timothy B. Terriberry [Mon, 11 Oct 2010 16:34:48 +0000 (09:34 -0700)]

Remove INTRARDOPT #define and intra_rd_opt option.

This is just eliminating some cruft.
Although a number of variables are declared only when INTRARDOPT
is defined, they are used elsewhere without that protection, and
no longer just for intra RDO.
The intra_rd_opt flag was hard-coded to 1 and never checked.

Change-Id: I83a81554ecee8053e7b4ccd8aa04e18fa60f8e4f


Scott LaVarnway [Mon, 11 Oct 2010 16:34:48 +0000 (09:34 -0700)]

Merge "Added vp8_fast_quantize_b_sse2"


John Koleszar [Mon, 11 Oct 2010 14:43:35 +0000 (07:43 -0700)]

Merge "Remove ivfenc usage message leading underscores"


John Koleszar [Mon, 11 Oct 2010 13:41:14 +0000 (09:41 -0400)]

Remove ivfenc usage message leading underscores

An earlier automatic transform changed, e.g., '\nOptions' to '\n_options',
which is incorrect in these printfs. Fix these.

Change-Id: I7e0f37931ef82b79fadddd7058ce0df5572e2ca1


Johann [Thu, 7 Oct 2010 18:13:36 +0000 (14:13 -0400)]

configure is not in src

one comment in the README said the configure script was in src.
it's not. pointed out by Aaron Sherman

Change-Id: Ife0b53e096856d46669a99eefd71ac23d0351f65


Yunqing Wang [Thu, 7 Oct 2010 16:08:08 +0000 (12:08 -0400)]

Remove unused file in encoder

Remove vp8/encoder/x86/csystemdependent.c

Change-Id: I7c590dcd07b68704d463a1452f62f29ffb1402f4


Scott LaVarnway [Thu, 7 Oct 2010 15:43:19 +0000 (11:43 -0400)]

Added vp8_fast_quantize_b_sse2

Moved vp8_fast_quantize_b_sse from quantize_mmx.asm into
quantize_sse2.asm and renamed. Updated the assembly code to
match the C version.

Change-Id: I1766d9e1ca60e173f65badc0ca0c160c2b51b200


Yaowu Xu [Wed, 6 Oct 2010 20:28:36 +0000 (13:28 -0700)]

optimize fast_quantizer c version

As the zbin and rounding constants are normalized, rounding effectively
does the zbinning, therefore the zbin operation can be removed. In
addition, the memsets on the two arrays are no longer necessary.

Change-Id: If39c353c42d7e052296cb65322e5218810b5cc4c
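A rough sketch of why no separate zero-bin test is needed: with a rounding offset and a fixed-point reciprocal, small inputs already quantize to zero. The function name, the 16-bit fixed-point scaling, and the example constants are assumptions for illustration, not the libvpx tables.

```c
#include <stdlib.h>

/*
 * Quantize one coefficient. `quant` is assumed to be a 16.16 fixed-point
 * reciprocal of the step size, and `round` the rounding offset.
 * Inputs with abs(z) + round below one step produce 0 without an
 * explicit zero-bin comparison.
 */
static int fast_quantize_coeff(int z, int round, int quant) {
    int sign = z < 0 ? -1 : 1;
    int x = abs(z);
    int y = ((x + round) * quant) >> 16;
    return sign * y;
}
```

For example, with a step size of 8 (quant = 65536 / 8 = 8192) and round = 4, inputs of magnitude 3 or less map to 0 and an input of 4 maps to 1.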


Jan Kratochvil [Tue, 5 Oct 2010 17:15:08 +0000 (19:15 +0200)]

nasm: add configure support

yasm has to be preferred as currently nasm produces marginally less
efficient code (longer opcodes). Filed for nasm as:
https://sourceforge.net/tracker/?func=detail&atid=106208&aid=3037462&group_id=6208

OTOH, a package should always be built the same way, no matter which
additional packages are or are not present on the system. As the package
should be built with nasm (as yasm may not be available), we should not
use yasm even if it happens to be available.

nasm >= approx. 2.09 is required for the nasm compilation as earlier
versions had a section alignment bug.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. A few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: Icb0fe39c64bbcc3bcd7972e392fd03f3273340df


Paul Wilkins [Tue, 5 Oct 2010 13:58:24 +0000 (06:58 -0700)]

Merge "Tune effect of motion on KF/GF boost in two pass;"


Jan Kratochvil [Mon, 4 Oct 2010 21:20:38 +0000 (23:20 +0200)]

nasm: movhps compatibility QWORD->MMWORD

Filed for nasm as:
https://sourceforge.net/tracker/?func=detail&atid=106208&aid=3081103&group_id=6208

nasm just does not accept any size parameter for movhps:
1.asm:2: error: mismatch in operand sizes

Some parts of libvpx already use MMWORD for movhps and MMWORD is
defined-out so it is compatible both with yasm and nasm.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu.

Change-Id: I4008a317ca87ec07c9ada958fcdc10a0cb589bbc


Jan Kratochvil [Sat, 31 Jul 2010 15:12:32 +0000 (17:12 +0200)]

nasm: avoid relative include paths

nasm does not automatically assume the source's directory also for its
include files.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. A few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: I386efa0cca5d401193416c11bd7363a283541645


Jan Kratochvil [Mon, 4 Oct 2010 21:18:58 +0000 (23:18 +0200)]

nasm: address labels 'rel label' vice 'wrt rip'

nasm does not support `label wrt rip', it requires `rel label'. It is
still fully compatible with yasm.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. A few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50


Jan Kratochvil [Mon, 4 Oct 2010 21:19:33 +0000 (23:19 +0200)]

nasm: match instruction length (movd/movq) to parameters

nasm requires the instruction length (movd/movq) to match its
parameters. I find it clearer to really use 64bit instructions when
we use 64bit registers in the assembly.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. A few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: Id9b1a5cdfb1bc05697e523c317a296df43d42a91


Yaowu Xu [Mon, 4 Oct 2010 17:41:20 +0000 (10:41 -0700)]

Merge "enable trellis quantization for 2nd order blocks"


Paul Wilkins [Sat, 2 Oct 2010 16:31:46 +0000 (17:31 +0100)]

Tune effect of motion on KF/GF boost in two pass;

This code adjusts the impact of the amount and speed of motion
on GF and KF boost.

Sections with lots of slow motion will tend to have a
somewhat bigger boost and sections with fast motion may
have less.

There is a knock-on effect on the selection of the active
quantizer range.

This will likely require further tuning but helps with a couple
of particularly bad edge cases.

Change-Id: Ic2449cda7305672b69acf42fc0a845b77ac98d40


Yaowu Xu [Fri, 1 Oct 2010 03:41:37 +0000 (20:41 -0700)]

enable trellis quantization for 2nd order blocks

Experimented with different values for Y2_RD_MULT in the range [1, 32].
Without adapting the value to MB coding mode/frame type/Q value,
4 works out best among all values, providing an overall 0.1% coding
gain on the test set.

Change-Id: I6b2583a8aa5db5e7e5c65c646301909c0c58f876


Johann [Fri, 1 Oct 2010 13:18:53 +0000 (06:18 -0700)]

Merge "Fix valgrind errors in the NEON loop filters."


Adrian Grange [Fri, 1 Oct 2010 09:14:01 +0000 (10:14 +0100)]

Made temporal filter default to use centered mode

If temporal filtering is enabled but a filter type is not specified,
centered filter mode is used by default.

Change-Id: I87306f267c1390074c806c506a69b4ba914d92a2


Timothy B. Terriberry [Fri, 1 Oct 2010 03:40:45 +0000 (20:40 -0700)]

Fix valgrind errors in the NEON loop filters.

Like the ARMv6 code, these functions were accessing values below
the stack pointer, which can be corrupted by signal delivery at
any time.


John Koleszar [Thu, 30 Sep 2010 17:26:31 +0000 (10:26 -0700)]

Merge "Rename mode_ref_lf_test_function"


John Koleszar [Thu, 30 Sep 2010 17:26:10 +0000 (10:26 -0700)]

Merge "Fix loopfilter delta zero transitions"


Adrian Grange [Thu, 30 Sep 2010 09:06:09 +0000 (10:06 +0100)]

Changed defaults & range checking for AltRef params

Modified the range checking of parameters used in the
AltRef temporal filter (arnr-max-frames, arnr-strength,
arnr-type) and default values for each of them.

Change-Id: Ib261028d501b9523f6e44cb4790cc52167b6e92b


John Koleszar [Wed, 29 Sep 2010 17:53:08 +0000 (13:53 -0400)]

Rename mode_ref_lf_test_function

This function graduated from being a test func to something that's on
by default. Rename it and remove some spurious comments that confuse
its status.

Change-Id: I689695a3ad29c35e9a72a43ec93766733ac6c20b


Fritz Koenig [Wed, 29 Sep 2010 17:47:01 +0000 (10:47 -0700)]

Merge "Optimizations on the loopfilters."


John Koleszar [Wed, 29 Sep 2010 17:04:04 +0000 (13:04 -0400)]

Fix loopfilter delta zero transitions

Loopfilter deltas are initialized to zero on keyframes in the decoder.
The values then persist from the previous frame unless an update bit
is set in the bitstream. This data is not included in the entropy
data saved by the 'refresh entropy' bit in the bitstream, so it is
effectively an additional contextual element beyond the 3 ref-frames
and the entropy data.

The encoder was treating this delta update bit as update-if-nonzero,
meaning that the value would be refreshed even if it hadn't changed,
and more significantly, if the correct value for the delta changed
to zero, the update wouldn't be sent, and the decoder would preserve
the last (presumably non-zero) value.

This patch updates the encoder to send an update only if the value
has changed from the previously transmitted value. It also forces the
value to be transmitted in error resilient mode, to account for lost
context in the event of lost frames.

Change-Id: I56671d5b42965d0166ac226765dbfce3e5301868
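The difference between the two update rules can be sketched as below. The delta_state struct, NUM_DELTAS, and the function names are hypothetical, and the sketch leaves out bitstream writing and the keyframe reset of the stored values.

```c
#include <stdbool.h>

#define NUM_DELTAS 4 /* illustrative count, not taken from the source */

/* Hypothetical encoder-side record of the last value sent per delta. */
typedef struct {
    signed char last_sent[NUM_DELTAS];
} delta_state;

/* Old rule as described above: send whenever the value is nonzero. */
static bool should_send_old(int value) {
    return value != 0;
}

/*
 * New rule: send only when the value differs from what the decoder
 * already holds, or always when error resilient mode is on.
 */
static bool should_send_delta(delta_state *s, int idx, signed char value,
                              bool error_resilient) {
    bool send = error_resilient || value != s->last_sent[idx];
    if (send)
        s->last_sent[idx] = value;
    return send;
}
```

Under the old rule a change from 2 to 0 is never transmitted, while the new rule sends it once and then stays silent until the value changes again.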


Paul Wilkins [Wed, 29 Sep 2010 12:22:05 +0000 (13:22 +0100)]

Change to coefficient optimization rules.

Allow coefficient optimization for good quality speed 0.

Change-Id: Id0cb363df6823c6798671584fbba097916a7df2c


Adrian Grange [Wed, 29 Sep 2010 12:13:41 +0000 (05:13 -0700)]

Merge "Moved row-specific computation of MV bounds out of col loop"


Adrian Grange [Wed, 29 Sep 2010 12:03:07 +0000 (13:03 +0100)]

Moved row-specific computation of MV bounds out of col loop

Moved the bounds computation on vertical MV component out
of the loop that processes MBs within a MB row.
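The shape of this change, sketched with made-up names: mv_limits, scan_frame(), the 16-pixel macroblock arithmetic, and the `border` parameter are illustrative assumptions rather than the encoder's actual expressions.

```c
/* Hypothetical per-macroblock motion vector limits, in pixels. */
typedef struct {
    int row_min, row_max, col_min, col_max;
} mv_limits;

/* Fill `out` (mb_rows * mb_cols entries) with limits for each macroblock. */
static void scan_frame(int mb_rows, int mb_cols, int border, mv_limits *out) {
    for (int r = 0; r < mb_rows; r++) {
        /* These depend only on the row, so compute them once per row. */
        int row_min = -(r * 16 + border);
        int row_max = (mb_rows - 1 - r) * 16 + border;

        for (int c = 0; c < mb_cols; c++) {
            mv_limits *lim = &out[r * mb_cols + c];
            lim->row_min = row_min;
            lim->row_max = row_max;
            /* Only the horizontal limits change from column to column. */
            lim->col_min = -(c * 16 + border);
            lim->col_max = (mb_cols - 1 - c) * 16 + border;
        }
    }
}
```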


Paul Wilkins [Wed, 29 Sep 2010 11:03:19 +0000 (12:03 +0100)]

Control of active min quantizer for two pass.

Create look up tables for controlling the active quantizer range.
Some initial tuning to improve quality circa 0.5% on test set.
Clean up of some stats output code

Change-Id: Ia698a8525f8b8129a503cadace3ee73fe888f543


Fritz Koenig [Tue, 28 Sep 2010 19:01:34 +0000 (12:01 -0700)]

Optimizations on the loopfilters.

- Scheduling for Atom processors
- Combining of macros to allow for better interleaving
- Change from multiplies to adds for main filter
- Use of movhps/movlps to fill xmm registers without
shifting and ORing

Change-Id: I0b3500a5f58abf7085253ec92d64c8a96723040b


Adrian Grange [Tue, 28 Sep 2010 15:52:19 +0000 (16:52 +0100)]

Enabled AltRef motion map creation

Enabled the first-pass encode to output the
map of macroblock coding modes required by
the AltRef filter.


Adrian Grange [Tue, 28 Sep 2010 15:34:44 +0000 (08:34 -0700)]

Merge "Made AltRef filter adaptive & added motion compensation"


Adrian Grange [Tue, 28 Sep 2010 14:23:41 +0000 (15:23 +0100)]

Made AltRef filter adaptive & added motion compensation

Modified AltRef temporal filter to adapt filter length based
on macroblock coding modes selected during first-pass
encode.

Also added sub-pixel motion compensation to the AltRef
filter.


Johann [Tue, 28 Sep 2010 14:10:09 +0000 (07:10 -0700)]

Merge "update gitignore"


Johann [Tue, 28 Sep 2010 13:31:11 +0000 (09:31 -0400)]

update gitignore

this was excluding all .asm files when it should have just been .asm
files in the top level directory and .asm.s files lower down. also be
more restrictive on some other items, and run the whole thing through
sort to keep it organized

Change-Id: Ia48525033226b13098a491ce89465d0377b990c2


Timothy B. Terriberry [Tue, 28 Sep 2010 00:18:18 +0000 (17:18 -0700)]

Add 4-tap version of 2nd-pass ARMv6 MC filter.

The existing code applied a 6-tap filter with 0's on either end.
We're already paying the branch penalty to avoid computing the two
extra columns needed as input to this filter.
We might as well save time computing the filter as well.
This reduces the inner loop from 21 instructions to 16, the number
of loads per iteration from 4 to 1, and the number of multiplies
from 7 to 4.
The gain in overall decoding performance, however, is small (less
than 1%).

This change also means we are now valgrind clean on ARMv6, which is
its real purpose.
The errors reported here were valgrind's fault (it does not detect
that 0 times an uninitialized value is initialized), but Julian
Seward says it would slow down valgrind considerably to make such
checks.
Speeding up libvpx instead, even by a small amount, seems a much
better idea if only to enable proper valgrind checking of the
rest of the codec.

Change-Id: Ifb376ea195e086b60f61daf1097d8910c4d8ff16
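In scalar C, the equivalence between a 6-tap filter with zero outer taps and a dedicated 4-tap version can be sketched as follows. The function names, the 7-bit rounding, and the example taps are assumptions for illustration; the actual change is in ARMv6 assembly, and clamping of the result is omitted here.

```c
/*
 * 6-tap filter centered on p[0]; reads p[-2] .. p[3].
 * Taps are assumed to sum to 128, hence the 7-bit rounding shift.
 */
static int filter6(const unsigned char *p, const int taps[6]) {
    int sum = 0;
    for (int i = 0; i < 6; i++)
        sum += p[i - 2] * taps[i];
    return (sum + 64) >> 7;
}

/*
 * 4-tap version for the case taps[0] == taps[5] == 0. It never reads
 * p[-2] or p[3], so those inputs need not be computed or initialized.
 */
static int filter4(const unsigned char *p, const int taps[6]) {
    int sum = p[-1] * taps[1] + p[0] * taps[2] +
              p[1] * taps[3] + p[2] * taps[4];
    return (sum + 64) >> 7;
}
```

Both functions return the same value whenever the outer taps are zero; the 4-tap form simply skips two loads and two multiplies, and it avoids touching the two outer inputs at all.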


Paul Wilkins [Fri, 24 Sep 2010 16:52:55 +0000 (17:52 +0100)]

Badly placed initialization of rolling rate monitors.

This affects control of the active quantizer range.

Change-Id: I30511fc81ac9f75ff20d9f1372382423d56739da


John Koleszar [Mon, 27 Sep 2010 16:48:31 +0000 (12:48 -0400)]

move reconintra_mt to decoder (fixup)

Missed the .h file in the move.

Change-Id: Ib408183fbb4d019fd46394b362f89ca6ea9d10bc


John Koleszar [Mon, 27 Sep 2010 14:00:03 +0000 (07:00 -0700)]

Merge "disable compilation of debugging code"


Johann [Mon, 27 Sep 2010 13:39:20 +0000 (06:39 -0700)]

Merge "combine max values and compare once"


Johann [Mon, 27 Sep 2010 13:36:22 +0000 (06:36 -0700)]

Merge "Fix valgrind errors in vp8_sixtap_predict8x4_armv6()."


John Koleszar [Mon, 27 Sep 2010 13:10:07 +0000 (06:10 -0700)]

Merge "darwin-icc: build for specific SDKs"


Timothy B. Terriberry [Fri, 24 Sep 2010 21:30:13 +0000 (14:30 -0700)]

Fix valgrind errors in vp8_sixtap_predict8x4_armv6().

This function was accessing values below the stack pointer, which
can be corrupted by signal delivery at any time.

Change-Id: I92945b30817562eb0340f289e74c108da72aeaca


Johann [Fri, 24 Sep 2010 16:03:31 +0000 (12:03 -0400)]

combine max values and compare once

previous implementation compared each set of values to limit and then
&'d them together, requiring a compare and & for each value.

this does the accumulation first, requiring only one compare

Change-Id: Ia5e3a1a50e47699c88470b8c41964f92a0dc1323
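A scalar sketch of the two approaches (the real change is in SIMD code; the function names and the use of plain int arrays are assumptions made for this example):

```c
#include <stdbool.h>
#include <stdlib.h>

/* Previous approach: one compare and one AND per value. */
static bool mask_compare_each(const int *diffs, int n, int limit) {
    bool ok = true;
    for (int i = 0; i < n; i++)
        ok = ok && (abs(diffs[i]) <= limit);
    return ok;
}

/* New approach: accumulate the maximum first, then compare once. */
static bool mask_compare_once(const int *diffs, int n, int limit) {
    int max_abs = 0;
    for (int i = 0; i < n; i++) {
        int a = abs(diffs[i]);
        if (a > max_abs)
            max_abs = a;
    }
    return max_abs <= limit;
}
```

The two functions agree because every value is within the limit exactly when the largest one is.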


John Koleszar [Fri, 24 Sep 2010 15:46:35 +0000 (08:46 -0700)]

Merge "move reconintra_mt to decoder (for now)"


John Koleszar [Fri, 24 Sep 2010 15:10:25 +0000 (11:10 -0400)]

disable compilation of debugging code

This patch avoids compiling some debugging code in onyx_if.c. The most
significant fix is to avoid generating code for vp8_write_yuv_frame,
which is never called. Some other code was removed by the dead code
elimination performed by the compiler, and this patch does it with the
preprocessor instead. There are advantages both ways.

Change-Id: I044fd43179d2e947553f0d6f2cad5b40907ac458


John Koleszar [Fri, 24 Sep 2010 15:39:27 +0000 (11:39 -0400)]

darwin-icc: build for specific SDKs

Add the missing -isysroot and -mmacosx-version-min flags to ICC builds.
Fixes issue #185.

Change-Id: I2fb37fcaaafef7122a61ced603569f4aa17f8bbc


Yunqing Wang [Fri, 24 Sep 2010 15:34:07 +0000 (08:34 -0700)]

Merge "Adjust multi-thread sync ranges according to image sizes"


John Koleszar [Fri, 24 Sep 2010 15:21:35 +0000 (11:21 -0400)]

move reconintra_mt to decoder (for now)

reconintra_mt.c is only required for building the decoder right now.
It could definitely be used for the encoder in the future, but it
currently depends on decoder only data structures. (onyxd_int.h,
VP8D_COMP, etc). Move it from common/ to decoder/ until the
necessary changes to the common multithread code are complete.

This patch is needed to build with --disable-vp8-decoder.

Change-Id: I568c52221a2b309234d269675cba97131ce35c86


John Koleszar [Tue, 21 Sep 2010 14:34:51 +0000 (10:34 -0400)]

configure: enable PIC for shared libs by default

Shared libs generally require PIC, so this saves a little typing at
configure time.

Change-Id: I357d70cc68434f3283fee78873052d2b7d77c777


John Koleszar [Tue, 21 Sep 2010 14:06:41 +0000 (10:06 -0400)]

configure: add --enable-small

Build with -O2 rather than -O3, to dissuade the compiler from inlining
so much. See issue #1.

Change-Id: Iacb8ddb59125d3f01c5fea846b45a1c004c9aee0


John Koleszar [Fri, 24 Sep 2010 12:39:48 +0000 (05:39 -0700)]

Merge "Add getter functions for the interface data symbols"


John Koleszar [Tue, 21 Sep 2010 14:35:52 +0000 (10:35 -0400)]

Add getter functions for the interface data symbols

Having these symbols be available as functions rather than data is
occasionally more convenient. Implemented this way rather than a
get-codec-by-id style to avoid creating a link-time dependency
between the encoder and the decoder.

Fixes issue #169

Change-Id: I319f281277033a5e7e3ee3b092b9a87cce2f463d
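The pattern can be sketched as one getter per interface object, so that asking for the encoder never references the decoder's symbol. The type and symbol names below are placeholders invented for the example, not the library's declarations.

```c
/* Placeholder interface type; the real one carries function pointers. */
typedef struct {
    const char *name;
} codec_iface;

/* Data symbol, as exposed before. */
static const codec_iface example_cx_algo = { "example encoder" };

/*
 * Getter for the same object. Each codec direction gets its own
 * function, so a caller that only uses the encoder links only against
 * the encoder's symbol. A single get-by-id function would have to
 * reference every interface it can return.
 */
const codec_iface *example_codec_cx(void) {
    return &example_cx_algo;
}
```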


Yunqing Wang [Thu, 23 Sep 2010 17:53:09 +0000 (13:53 -0400)]

Adjust multi-thread sync ranges according to image sizes

In multi-threaded decoder, set different sync ranges for
different video resolutions.

Change-Id: Iea48fd36f51919e0152c8ed3b1f10e1b723c0ca7


Johann [Wed, 22 Sep 2010 15:07:34 +0000 (11:07 -0400)]

Remove dead code

The new loopfilter was originally introduced as an experimental change.
It's permanent now.

Change-Id: I25dbedb6ceff3e9f9c04e18bb29f84c3ecb7e546


John Koleszar [Tue, 21 Sep 2010 23:48:06 +0000 (19:48 -0400)]

unset execute bit on c source

Change-Id: I6625ee41f8872908cb015ce0729e1c7a105b5217


Johann [Tue, 21 Sep 2010 19:03:37 +0000 (12:03 -0700)]

Merge "Fix typo"


Johann [Tue, 21 Sep 2010 18:56:42 +0000 (14:56 -0400)]

Fix typo

Also, move with other ppc32 options

Change-Id: I0b97413c767909c5682afc9bdd954f3d43401f6c


John Koleszar [Tue, 21 Sep 2010 16:06:59 +0000 (09:06 -0700)]

Merge "Don't reset mb clamping state during splitmv decoding"


John Koleszar [Tue, 21 Sep 2010 15:54:36 +0000 (11:54 -0400)]

Don't reset mb clamping state during splitmv decoding

The MV decoding changes in c5fb0eb introduced a bug where the
macroblock clamping state was reset for each partition, so if an
earlier partition needed clamping but a subsequent one didn't,
the MB wouldn't receive clamping. Instead, the state is only
set during splitmv decoding, never cleared.

Change-Id: I224fe258493405ee0f6a04596acdb622c475e845
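A minimal sketch of the bug and the fix, using hypothetical names (mv, mv_needs_clamp, and a single symmetric lo/hi range are simplifications for the example):

```c
#include <stdbool.h>

typedef struct {
    int row, col;
} mv;

/* True if the vector falls outside the allowed range. */
static bool mv_needs_clamp(mv v, int lo, int hi) {
    return v.row < lo || v.row > hi || v.col < lo || v.col > hi;
}

/* Buggy pattern: the flag is overwritten for each partition. */
static bool mb_needs_clamp_buggy(const mv *parts, int n, int lo, int hi) {
    bool need = false;
    for (int i = 0; i < n; i++)
        need = mv_needs_clamp(parts[i], lo, hi);
    return need;
}

/* Fixed pattern: the flag is only ever set, never cleared. */
static bool mb_needs_clamp_fixed(const mv *parts, int n, int lo, int hi) {
    bool need = false;
    for (int i = 0; i < n; i++) {
        if (mv_needs_clamp(parts[i], lo, hi))
            need = true;
    }
    return need;
}
```

With a first partition that is out of range and a second that is not, the buggy version reports that no clamping is needed, while the fixed version keeps the flag set.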


John Koleszar [Tue, 21 Sep 2010 14:13:26 +0000 (07:13 -0700)]

Merge "gitignore: initial version"


John Koleszar [Tue, 21 Sep 2010 14:02:43 +0000 (07:02 -0700)]

Merge "configure: support for ppc32-linux-gcc"


John Koleszar [Tue, 21 Sep 2010 12:36:46 +0000 (05:36 -0700)]

Merge "Add high limit check for unsigned parameters"


Yunqing Wang [Tue, 21 Sep 2010 12:00:30 +0000 (05:00 -0700)]

Merge "Restructure multi-threaded decoder"


Fritz Koenig [Mon, 20 Sep 2010 16:30:49 +0000 (09:30 -0700)]

Use movq instead of movdqu.

Movdqu is more expensive (throughput, uops) than movq. Minimal
impact for newer big cores, but ~2.25% gain on Atom.

Change-Id: I62c80bb1cc01d8a91c350c4c7719462809a4ef7f


Fritz Koenig [Mon, 20 Sep 2010 18:01:51 +0000 (11:01 -0700)]

Merge "Better choice of instruction filter mask comparison."


Johann [Mon, 20 Sep 2010 17:47:33 +0000 (10:47 -0700)]

Merge "reorder data to use wider instructions"


Johann [Mon, 20 Sep 2010 17:47:22 +0000 (10:47 -0700)]

Merge "Update NEON wide idcts"


Fritz Koenig [Wed, 15 Sep 2010 21:07:32 +0000 (14:07 -0700)]

Better choice of instruction filter mask comparison.

Use pmaxub instead of a combination of psubusb/por to
determine if any comparisons go over the limit.

Change-Id: I3f0bd7d2aabe5fee9ba6620508e2b60605abcb82


Guillermo Ballester Valor [Fri, 11 Jun 2010 18:33:49 +0000 (14:33 -0400)]

Add high limit check for unsigned parameters

The patch related to issue #55 (5a72620) fixed some warnings, but the
fix was not optimal. It was actually a trick to confuse the compiler
rather than a fix.

This patch fixes it by creating a new macro for use when only a high
limit check is needed for an unsigned value.

Change-Id: I94b322e0f7fb07604b3b1df1f9321185f48cfcb5
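The motivation can be sketched with two illustrative macros (IN_RANGE and IN_RANGE_HI are names invented here, as are the example limits). A two-sided check on an unsigned value against a lower bound of 0 is always true on the low side, which is what compilers tend to warn about.

```c
#include <stdbool.h>

/* Two-sided check, suitable for signed values. */
#define IN_RANGE(v, lo, hi) ((v) >= (lo) && (v) <= (hi))

/* High-limit-only check, for unsigned values whose lower bound is 0. */
#define IN_RANGE_HI(v, hi) ((v) <= (hi))

/* Example validators; the limits are arbitrary values for the sketch. */
static bool valid_signed_param(int v) {
    return IN_RANGE(v, -16, 16);
}

static bool valid_unsigned_param(unsigned int v) {
    /* IN_RANGE(v, 0, 63u) would include the always-true test v >= 0. */
    return IN_RANGE_HI(v, 63u);
}
```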


Johann [Thu, 9 Sep 2010 19:55:19 +0000 (15:55 -0400)]

reorder data to use wider instructions

the previous commit laid the groundwork by doing two sets of idcts
together. this moved that further by grouping the interesting data
(q[0], q+16[0]) together to allow using wider instructions. also
managed to drop a few instructions by recognizing that the constant
for sinpi8sqrt2 could be downshifted all the time which avoided a
downshift as well as workarounds for a function which only accepted
signed data

looks like a modest gain for performance: at qcif, went from ~180
fps to ~183

Change-Id: I842673f3080b8239e026cc9b50346dbccbab4adf


Yunqing Wang [Thu, 16 Sep 2010 18:08:52 +0000 (14:08 -0400)]

Restructure multi-threaded decoder

On each MB, loopfiltering is done right after MB decoding. This
combines two loops in multi-threaded code into one, which halves
the number of synchronizations.

The above-row/left-col data are saved in temp buffers for
next-row/next MB decoding.

Tests on 4-core gLucid machine showed 10% decoder performance
gain with threads=4 (tulip clip). Testing on other platforms
isn't done yet.

Change-Id: Id18ea7c1e84965dabea65d4c01ca5bc056ddeac9


John Koleszar [Thu, 16 Sep 2010 17:13:31 +0000 (13:13 -0400)]

cleanup: remove unused xprintf

These files aren't currently used, and we can get them back if we
need them.

Change-Id: I62aa3bff828e491a80c80eeb84a7c44903df29b5


John Koleszar [Thu, 16 Sep 2010 14:00:04 +0000 (10:00 -0400)]

Reduce size of tokenizer tables

This patch reduces the size of the global tables maintained by the
tokenizer to 16k from 80k-96k. See issue #177.

Change-Id: If0275d5f28389af11ac83c5d929d1157cde90fbe


Fritz Koenig [Tue, 14 Sep 2010 22:46:37 +0000 (15:46 -0700)]

Modify GET_GOT macro for performance.

GET_GOT was producing a zero length call. This resulted in
pipeline flushes occurring when returning from the assembly
functions. Masked on out-of-order cores, but evident on
Atom cores.

Change-Id: I8c375af313e8a169c77adbaf956693c0cfeb5ccd


Fritz Koenig [Tue, 14 Sep 2010 01:34:34 +0000 (18:34 -0700)]

Removed unnecessary pxor.

There is no need to make sure that the lower byte of the
register is 0 because the downshift by 11 overwrites that byte.

Change-Id: I89cbf004b2ff532a2c68e0dc399c45a49cdad5a1


Fritz Koenig [Mon, 13 Sep 2010 18:04:22 +0000 (11:04 -0700)]

Merge "Make block access to frame buffer sequential"


John Koleszar [Mon, 13 Sep 2010 13:04:55 +0000 (09:04 -0400)]

configure: support for ppc32-linux-gcc

Fixes issue 89. Thanks to josejx for the patch.

Change-Id: I7e664fed703b49f2fb3af4c5e6ce1173742000c2


John Koleszar [Mon, 13 Sep 2010 13:00:24 +0000 (09:00 -0400)]

cosmetics: expand tabs in configure

Change-Id: I88ddb0afb56ef2be8184b56fe125ad938ead7a84


Fritz Koenig [Fri, 10 Sep 2010 23:27:28 +0000 (16:27 -0700)]

Make block access to frame buffer sequential

Sequentially accessing memory from a low address to a high
address should make it easier for the processor to predict
the cache.

Change-Id: I1921ce996bdd547144fe864fea6435f527f5842d


Scott LaVarnway [Thu, 9 Sep 2010 18:51:29 +0000 (11:51 -0700)]

Merge "Improved subset block search"


Scott LaVarnway [Thu, 9 Sep 2010 18:42:48 +0000 (14:42 -0400)]

Improved subset block search

Improved the subset block search and fill. (about 3% improvement for
32 bit) Modified/merged the code in order to create
vp8_read_mb_modes_mv which can decode the modes/mvs on a macroblock
level. This will allow the decode loop (in the future) to decode
modes/mvs on a frame, row, or mb level.

Change-Id: If637d994b508792f846d39b5d44a7bf9aa5cddf3


Johann [Tue, 7 Sep 2010 18:21:27 +0000 (14:21 -0400)]

Update NEON wide idcts

Extend 93c32a55, which used SSE2 instructions to do two
idct/dequant/recons at a time, to NEON. Initial working
commit. More work needs to be put into rearranging and
interlacing the data to take advantage of quadword
operations, which is when we'll hopefully see a much
better boost.

Change-Id: I86d59d96f15e0d0f9710253e2c098ac2ff2865d1


John Koleszar [Thu, 9 Sep 2010 16:57:23 +0000 (12:57 -0400)]

Fix GF interval for non-lagged ARFs

When ARFs are enabled in non-lagged compress modes, the GF interval
was being reset to zero. Non-lagged ARF updates were enabled in commit
63ccfbd, but this incorrect GF interval caused a quality regression.

Change-Id: I615c3b493f4ce2127044f4e68d0bcb07d6b730c3


Fritz Koenig [Thu, 9 Sep 2010 15:54:21 +0000 (08:54 -0700)]

Merge branch 'master' of git://review.webmproject.org/libvpx


John Koleszar [Thu, 9 Sep 2010 12:16:39 +0000 (08:16 -0400)]

Use WebM in copyright notice for consistency

Changes 'The VP8 project' to 'The WebM project', for consistency
with other webmproject.org repositories.

Fixes issue #97.

Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba


Jim Bankoski [Thu, 22 Jul 2010 20:07:13 +0000 (16:07 -0400)]

Skip unnecessary search of identical frames

vp8_get_compressed_data() was defeating logic in
encode_frame_to_datarate() that determined the reference buffers to
search and forcing all frames to be eligible to search. In cases
where buffers have identical contents, this is unnecessary extra
work.

Change-Id: I9e667ac39128ae32dc455a3db4c62e3efce6f114


Jim Bankoski [Thu, 22 Jul 2010 20:07:13 +0000 (16:07 -0400)]

Enable ARFs for non-lagged compress

ARFs were explicitly disabled except in lagged compress mode. New
ARF logic allows for the ARF buffer to hold an older golden frame,
which does not require lagged compress.

Change-Id: I1dff82b6f53e8311f1e0514b1794ae05919d5f79


Fritz Koenig [Tue, 7 Sep 2010 17:52:54 +0000 (10:52 -0700)]

Bilinear subpixel optimizations for ssse3.

Used pmaddubsw for multiply and add of two filter taps
at once for 16x16 and 8x8 blocks.

Change-Id: Idccf2d6e094561624407b109fa7e80ba799355ea


Scott LaVarnway [Thu, 2 Sep 2010 20:17:52 +0000 (16:17 -0400)]

Reduced the size of MB_MODE_INFO

Moved partition_bmi and partition_count out of MB_MODE_INFO and
placed into MACROBLOCK.  Also reduced the size of other members
of the MB_MODE_INFO struct.  For 1080p, the memory was reduced
by 1,209,516 bytes.  The decoder performance appeared to improve
by 3% for the clip used.
Note:  The main goal for this change is to improve the decoder
performance.  The encoder will be revisited at a later date for
further structure cleanup.

Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613


John Koleszar [Thu, 2 Sep 2010 18:56:47 +0000 (14:56 -0400)]

Update CHANGELOG for v0.9.2 release

Change-Id: I184e927987544e9f34f890249b589ea13a93a330

Domain: Multimedia / Media Playback;