hkuang [Thu, 5 Feb 2015 20:44:04 +0000 (12:44 -0800)]
Merge "Mute the harmless tsan error in frame parallel decode."
James Zern [Thu, 5 Feb 2015 20:18:24 +0000 (12:18 -0800)]
Merge "configure: enable x86inc for all intel platforms"
James Zern [Thu, 5 Feb 2015 03:16:12 +0000 (19:16 -0800)]
configure: enable x86inc for all intel platforms
there are no known issues since:
10d5e09 Fix issues in 32bit PIC enabled build
related issues: #808, #924
Change-Id: I80454f95fe6b4ce630fdd434d740ce8b0d42951b
Tom Finegan [Thu, 5 Feb 2015 01:33:41 +0000 (17:33 -0800)]
Merge "Xcode: Fix includes in examples."
Tom Finegan [Thu, 5 Feb 2015 00:11:57 +0000 (16:11 -0800)]
Xcode: Fix includes in examples.
The current file's directory, ".", is treated much more literally
when building libvpx examples with Xcode than it is with make, and
clang cannot find common include files included via "./" when those
files actually reside one directory up in the tree.
Change-Id: I5f66a026282e35d80248ca4052ebb882b859172e
Yaowu Xu [Wed, 4 Feb 2015 02:03:51 +0000 (18:03 -0800)]
Remove unnecessary initialization
loop_filter_level is always reset in loop_filter_frame() later in
encoder.
Change-Id: I608e03d905a6b23e7d5025ca747e4784c665007e
Yaowu Xu [Wed, 4 Feb 2015 01:50:48 +0000 (17:50 -0800)]
Move tx_mode decision logic into select_tx_mode()
Change-Id: I7f8f78c33eb3f33344b029a27bda320f4d68c577
Yaowu Xu [Wed, 4 Feb 2015 20:52:03 +0000 (12:52 -0800)]
Merge "Adjust partitioning threshold based rtc speed"
Yaowu Xu [Wed, 4 Feb 2015 20:51:16 +0000 (12:51 -0800)]
Merge "Move calls to avoid unnecessary operations"
hkuang [Wed, 4 Feb 2015 00:40:13 +0000 (16:40 -0800)]
Mute the harmless tsan error in frame parallel decode.
Change-Id: I52565fd90461221f89134997a0782cb1b681df01
Jingning Han [Wed, 4 Feb 2015 20:09:21 +0000 (12:09 -0800)]
Merge "Unify luma and chroma inter predictors in choose_partitioning"
Jingning Han [Wed, 4 Feb 2015 20:09:14 +0000 (12:09 -0800)]
Merge "Save an extra call for setup_pred_plane function"
Jingning Han [Wed, 4 Feb 2015 20:09:01 +0000 (12:09 -0800)]
Merge "Account for chroma component costs in RTC mode decision"
Yaowu Xu [Wed, 4 Feb 2015 18:37:57 +0000 (10:37 -0800)]
Adjust partitioning threshold based rtc speed
On rtc set:
speed 7 quality improves about 0.5%
speed 8 quality improves about 1.0%
Encoding time for speed 7 changes from 67804ms to 65889ms
Encoding time for speed 8 changes from 58659ms to 56808ms
Change-Id: Iabcfb53012fc1b9f3326cdbc167e5758b8c7ad30
Jingning Han [Wed, 4 Feb 2015 18:02:14 +0000 (10:02 -0800)]
Unify luma and chroma inter predictors in choose_partitioning
Change-Id: I8bfc80f4fffb0892e93d3326394a52d1ee3c0f37
Jingning Han [Tue, 3 Feb 2015 23:56:36 +0000 (15:56 -0800)]
Save an extra call for setup_pred_plane function
Reuse the yv12_mb array to fetch the buffer pointers/strides
corresponding to the current reference frame.
Change-Id: I5276b7494158b2cccef15213be2dc189e9036851
Jingning Han [Wed, 21 Jan 2015 17:32:23 +0000 (09:32 -0800)]
Account for chroma component costs in RTC mode decision
This commit allows the encoder to account for additional chroma
plane costs in the mode decision process, if the current block
potentially contains significant color change. It improves the
visual quality at very low bit-rates.
The compression performance of dark720p is improved by 12.39% in
speed 6. For jimred at 150 kbps, the PSNR of V component (red)
increased by 0.2 dB, at the expense of about 5% increase in
encoding time. Note that for sequences where the chroma components
are fairly consistent, the encoding time increase is negligible.
On average the rtc set compression performance is improved by
1.172% in PSNR and 1.920% in SSIM.
Change-Id: Ia55b24ef23a25304f7ec9958fbf07fd6e658505c
Yunqing Wang [Sat, 31 Jan 2015 01:00:54 +0000 (17:00 -0800)]
vp9_dthread: remove frame_parallel_decoding_mode requirement
This patch continues the work to remove the frame_parallel_decoding_mode
requirement in the VP9 multi-threaded tile decoder. To do that, the
frame counts associated with each thread need to be accumulated
after the frame is decoded.
Change-Id: Idba1a756cedfed3c154aef52ed82c8da3bbf9e0c
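The accumulation step described in this commit can be sketched roughly as
follows. This is a minimal illustration only: the FrameCounts struct, its
fields, and both function names are hypothetical stand-ins, not the
decoder's actual types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for per-frame symbol counts. */
#define NUM_MODES 4
typedef struct {
  unsigned int mode[NUM_MODES];
  unsigned int skip[2];
} FrameCounts;

/* Adds one worker thread's counts into the frame-level total. */
static void accumulate_counts(FrameCounts *total, const FrameCounts *worker) {
  size_t i;
  for (i = 0; i < NUM_MODES; ++i) total->mode[i] += worker->mode[i];
  for (i = 0; i < 2; ++i) total->skip[i] += worker->skip[i];
}

/* Merges every worker's counts. Intended to run once, after all workers
 * have finished decoding their tiles for the current frame. */
static void accumulate_all(FrameCounts *total, const FrameCounts *workers,
                           size_t num_workers) {
  size_t w;
  assert(total != NULL && workers != NULL);
  for (w = 0; w < num_workers; ++w) accumulate_counts(total, &workers[w]);
}
```

Because each worker writes only to its own FrameCounts while decoding, no
locking is needed until the single merge at the end of the frame.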
Johann [Wed, 4 Feb 2015 01:12:56 +0000 (17:12 -0800)]
Merge "Remove unnecessary pointer check"
Yaowu Xu [Wed, 4 Feb 2015 01:01:37 +0000 (17:01 -0800)]
Move calls to avoid unnecessary operations
Change-Id: I236f7f75ab9a4511d1b52a6a67299b0e844a103e
Yaowu Xu [Tue, 3 Feb 2015 23:13:52 +0000 (15:13 -0800)]
Merge "adjust rtc setting and threshold"
Johann [Tue, 3 Feb 2015 21:52:14 +0000 (13:52 -0800)]
Merge "Use correct buffer size in vp8 subpixel variance"
hkuang [Tue, 3 Feb 2015 21:37:48 +0000 (13:37 -0800)]
Merge "Remove duplicate code."
Jim Bankoski [Tue, 3 Feb 2015 21:25:06 +0000 (13:25 -0800)]
Merge "make low bitrates a lot less blocky"
Johann [Tue, 3 Feb 2015 20:58:34 +0000 (12:58 -0800)]
Remove unnecessary pointer check
The original implementation had the following comment:
// Ignore mv costing if mvsadcost is NULL
However the current implementation does not allow for this.
If x exists then nmvsadcost must not be null.
This removes the only warning from -Wpointer-bool-conversion
https://code.google.com/p/webm/issues/detail?id=894
Change-Id: I1a2cee340d7972d41e1bbbe1ec8dfbe917667085
Jingning Han [Tue, 3 Feb 2015 20:25:18 +0000 (12:25 -0800)]
Merge "Assign 2nd ref frame in choose_partitioning"
Johann [Tue, 3 Feb 2015 19:19:24 +0000 (11:19 -0800)]
Merge "Fail when only an old nasm is found"
Jingning Han [Tue, 3 Feb 2015 19:17:51 +0000 (11:17 -0800)]
Assign 2nd ref frame in choose_partitioning
Avoid the use of uninitialized second reference frame for fetching
reference block.
Change-Id: I9983a0daea829700b3270dc8bf2bcc6d6ea36652
Yunqing Wang [Tue, 3 Feb 2015 18:52:02 +0000 (10:52 -0800)]
Merge "vp9_dthread: pass frame counts to decoder functions"
Yaowu Xu [Tue, 3 Feb 2015 18:36:42 +0000 (10:36 -0800)]
Merge "Add mutex initialization in encoder"
Yaowu Xu [Tue, 3 Feb 2015 17:52:21 +0000 (09:52 -0800)]
Add mutex initialization in encoder
This resolves the encoder crashes on windows.
Change-Id: I159d79014cf9279751e403936ce1f84482ae82da
Johann [Tue, 3 Feb 2015 17:40:12 +0000 (09:40 -0800)]
Merge "Ensure the error-concealment code is available"
Yunqing Wang [Fri, 30 Jan 2015 18:14:44 +0000 (10:14 -0800)]
vp9_dthread: pass frame counts to decoder functions
The current multi-threaded tile decoder requires that videos
be encoded with frame_parallel_decoding_mode = 1. This requirement
is unnecessary and should be removed. This patch includes
the first part of the work.
Change-Id: Ic7695fb3cfe13f9022582c9f0edd2aa6e2e36d28
Johann [Tue, 3 Feb 2015 17:02:50 +0000 (09:02 -0800)]
Use correct buffer size in vp8 subpixel variance
In vp8_sub_pixel_variance8x8_neon the temp2 buffer is only initialized
to kHeight8 * kWidth8. However, in the case that xoffset != 0 and
yoffset == 0, var_filter_block2d_bil_w8 is called with output_width
kHeight8PlusOne.
Thanks to cmugurel for diagnosing and yulius for the patch.
Change-Id: Ib71ffd96ffad963c92b8b7ca23f303942785b8e0
https://code.google.com/p/webrtc/issues/detail?id=4190
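The sizing problem described in this commit can be illustrated with a
small sketch. The constant names come from the commit message; the filter
function, its tap convention (two taps summing to 128), and the helper
required_bytes_w8 are assumptions made for illustration and are not the
NEON code that was patched.

```c
#include <assert.h>
#include <stdint.h>

enum { kWidth8 = 8, kHeight8 = 8, kHeight8PlusOne = 9 };

/* Hypothetical horizontal 2-tap bilinear pass over an 8-pixel-wide block.
 * It writes out_rows * kWidth8 bytes to dst and reads kWidth8 + 1 pixels
 * from each source row. The taps are assumed to sum to 128. */
static void filter_horiz_w8(const uint8_t *src, int src_stride, uint8_t *dst,
                            int out_rows, int tap0, int tap1) {
  int r, c;
  assert(tap0 + tap1 == 128);
  for (r = 0; r < out_rows; ++r) {
    for (c = 0; c < kWidth8; ++c) {
      const int sum = src[c] * tap0 + src[c + 1] * tap1 + 64;
      dst[r * kWidth8 + c] = (uint8_t)(sum >> 7);
    }
    src += src_stride;
  }
}

/* Number of bytes the destination must hold for a given row count. */
static int required_bytes_w8(int out_rows) { return out_rows * kWidth8; }
```

A destination declared as kHeight8 * kWidth8 bytes is one row short when
the pass is asked for kHeight8PlusOne rows, so the buffer has to be sized
from the row count actually requested.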
Johann [Fri, 30 Jan 2015 22:39:01 +0000 (14:39 -0800)]
Fail when only an old nasm is found
Apple ships version 0.98 of nasm through at least XCode 6. It is
incompatible with the assembly in libvpx.
https://code.google.com/p/webm/issues/detail?id=772
Change-Id: I33245a76f50a8224fe6fafa3cce9991f953fdcc8
Jim Bankoski [Tue, 3 Feb 2015 14:45:56 +0000 (06:45 -0800)]
make low bitrates a lot less blocky
Remove loop filter skip at speed 7+ because of bad visual artifacts and
up the postprocessing.
Change-Id: Ibdd0bac71aaee232d2bb2e14462733c51517768d
Yaowu Xu [Mon, 2 Feb 2015 16:05:31 +0000 (08:05 -0800)]
adjust rtc setting and threshold
1. Adjusted the threshold for coef update computation based on counts
of tx used, avoid coef update computation when count is low (<20)
2. Move sf->lpf_pick = LPF_PICK_MINIMAL_LPF to speed 8.
Change-Id: I02b44309e40fcdbf135c7934ae067a3f42502d30
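Item 1 of this commit amounts to a count-based early exit. A minimal
sketch, assuming a per-transform-size usage counter is available; the
helper name, the array layout, and the MIN_TX_COUNT macro are hypothetical,
and only the threshold value of 20 is taken from the commit message.

```c
#include <assert.h>

#define TX_SIZES 4      /* 4x4, 8x8, 16x16, 32x32 */
#define MIN_TX_COUNT 20 /* threshold quoted in the commit message */

/* Hypothetical helper: returns 1 when the coefficient probability update
 * is worth computing for this transform size, 0 when the size was used
 * too rarely (fewer than MIN_TX_COUNT times) in the frame. */
static int should_update_coefs(const unsigned int tx_count[TX_SIZES],
                               int tx_size) {
  assert(tx_size >= 0 && tx_size < TX_SIZES);
  return tx_count[tx_size] >= MIN_TX_COUNT;
}
```

A caller would test this before doing the per-size update work and skip
that transform size entirely when it returns 0.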
hkuang [Tue, 3 Feb 2015 01:08:42 +0000 (17:08 -0800)]
Merge "Fix a bug from merging frame parallel branch into "
hkuang [Mon, 2 Feb 2015 23:34:21 +0000 (15:34 -0800)]
Fix a bug from merging frame parallel branch into
The merge did not merge the fix for issue #850.
Change-Id: I0dc1377dbfcb9497fb01a13d4f78ac65bff5eb33
Johann [Mon, 2 Feb 2015 21:13:04 +0000 (13:13 -0800)]
Merge "Require webm when explicitly requested"
Alex Converse [Mon, 2 Feb 2015 20:05:56 +0000 (12:05 -0800)]
Merge "Allow larger encoder configurations."
Johann [Fri, 30 Jan 2015 23:05:14 +0000 (15:05 -0800)]
Require webm when explicitly requested
https://code.google.com/p/webm/issues/detail?id=906
Change-Id: I72841078ff81152d21d84ccf4d5548e757685a6d
Adrian Grange [Mon, 2 Feb 2015 16:11:14 +0000 (08:11 -0800)]
Merge "Abort if firstpass file does not exist"
Yaowu Xu [Mon, 2 Feb 2015 04:08:29 +0000 (20:08 -0800)]
Merge "Optimize coef update"
hkuang [Tue, 27 Jan 2015 20:26:28 +0000 (12:26 -0800)]
Try again to merge branch 'frame-parallel' into master branch.
In frame parallel decode, the libvpx decoder decodes several frames on all
CPUs in parallel. If it is not flushed, it returns a frame only when all
the CPUs are busy. If it is flushed, it returns all the frames in the
decoder. In contrast to the current serial decode mode, in which the
libvpx decoder is idle between decode calls, the frame parallel decoder
stays busy between decode calls.
Current frame parallel decode will only speed up the decoding for frame
parallel encoded videos. For non frame parallel encoded videos, frame
parallel decode is slower than serial decode due to lack of loopfilter
worker thread.
There are still some known issues that need to be addressed. For example,
decoding frame parallel videos with segmentation enabled is sometimes incorrect.
* frame-parallel:
Add error handling for frame parallel decode and unit test for that.
Fix a bug in frame parallel decode and add a unit test for that.
Add two test vectors to test frame parallel decode.
Add key frame seeking to webmdec and webm_video_source.
Implement frame parallel decode for VP9.
Increase the thread test range to cover 5, 6, 7, 8 threads.
Fix a bug in adding frame parallel unit test.
Add VP9 frame-parallel unit test.
Manually pick "Make the api behavior conform to api spec." from master branch.
Move vp9_dec_build_inter_predictors_* to decoder folder.
Add segmentation map array for current and last frame segmentation.
Include the right header for VP9 worker thread.
Move vp9_thread.* to common.
ctrl_get_reference does not need user_priv.
Seperate the frame buffers from VP9 encoder/decoder structure.
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
Conflicts:
test/codec_factory.h
test/decode_test_driver.cc
test/decode_test_driver.h
test/invalid_file_test.cc
test/test-data.sha1
test/test.mk
test/test_vectors.cc
vp8/vp8_dx_iface.c
vp9/common/vp9_alloccommon.c
vp9/common/vp9_entropymode.c
vp9/common/vp9_loopfilter_thread.c
vp9/common/vp9_loopfilter_thread.h
vp9/common/vp9_mvref_common.c
vp9/common/vp9_onyxc_int.h
vp9/common/vp9_reconinter.c
vp9/decoder/vp9_decodeframe.c
vp9/decoder/vp9_decodeframe.h
vp9/decoder/vp9_decodemv.c
vp9/decoder/vp9_decoder.c
vp9/decoder/vp9_decoder.h
vp9/encoder/vp9_encoder.c
vp9/encoder/vp9_pickmode.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_cx_iface.c
vp9/vp9_dx_iface.c
This reverts commit
a18da9760a74d9ce6fb9f875706dc639c95402f5.
Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02
James Zern [Fri, 30 Jan 2015 23:52:24 +0000 (15:52 -0800)]
vp9: rename 'near' parameters
+ nearest for consistency
near is a reserved word in windows builds so using it as a parameter
name may cause build failures with some configurations
Change-Id: Iddf1d4ecdb39843f14e95dbfd9dca55f07f81403
Jingning Han [Fri, 30 Jan 2015 23:49:14 +0000 (15:49 -0800)]
Merge "Format fixes in vp9_rd_pick_inter_mode_sb/sub8x8"
James Zern [Fri, 30 Jan 2015 21:16:02 +0000 (13:16 -0800)]
Merge "configure: echo --(disable|enable)-* cmdline options"
Johann [Fri, 30 Jan 2015 19:19:28 +0000 (11:19 -0800)]
Merge "Explicitly include vp8_rtcd.h"
Adrian Grange [Fri, 30 Jan 2015 18:42:29 +0000 (10:42 -0800)]
Abort if firstpass file does not exist
This fixes a crash if the firstpass file does not
exist when doing a two-pass encode.
Change-Id: I3a1a95d68d57125c63123d6208af7537f5a689a0
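The kind of check this fix implies can be sketched as below. This is an
illustration under assumptions, not the code from the commit: the helper
name, its return convention, and the error message are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: tries to open the first-pass stats file for
 * reading. Returns 0 on success and -1 if the file cannot be opened, so
 * the caller can abort the second pass instead of dereferencing NULL. */
static int open_firstpass_stats(const char *path, FILE **out) {
  assert(path != NULL && out != NULL);
  *out = fopen(path, "rb");
  if (*out == NULL) {
    fprintf(stderr, "First-pass stats file %s cannot be opened.\n", path);
    return -1;
  }
  return 0;
}
```

The caller is expected to stop the two-pass encode when -1 is returned
rather than continue with an empty stats buffer.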
Yaowu Xu [Thu, 29 Jan 2015 18:22:48 +0000 (10:22 -0800)]
Optimize coef update
1. move the check of search method of USE_TX_8X8 up one level to
avoid operations of build_tree_distributions()
2. count tx used and avoid computation for coef update when one size
is not used at all.
Change-Id: Ia3e54a2588aa531c41377a1bfaa64385d04a592c
Yunqing Wang [Thu, 29 Jan 2015 19:33:50 +0000 (11:33 -0800)]
Enable use_x86inc for 32bit pic enabled Darwin target
The previous patch "Fix issues in 32bit PIC enabled build" fixed
the x86inc.asm for macho32. Now we can enable use_x86inc while
building libvpx for 32bit pic enabled Darwin target, which makes
the encoder a lot faster (>2X) in this case by turning on the
existing optimizations.
Change-Id: I5f5c7add428d73f50c935c48d0a70aed2b1eb7af
Yunqing Wang [Fri, 30 Jan 2015 00:41:25 +0000 (16:41 -0800)]
Merge "Fix issues in 32bit PIC enabled build"
Alex Converse [Thu, 15 Jan 2015 21:56:55 +0000 (13:56 -0800)]
Allow larger encoder configurations.
Allow changing colorspace in the encoder and increasing frame size.
Change-Id: I8e7c3b891af29ce420a15beb4f6f9c250245b2bb
Paul Wilkins [Thu, 29 Jan 2015 21:50:52 +0000 (13:50 -0800)]
Merge "Change to update of rate control factors."
Johann [Thu, 29 Jan 2015 17:59:16 +0000 (09:59 -0800)]
Explicitly include vp8_rtcd.h
When referencing RTCD functions make sure the relevant
header file is included.
Change-Id: Ia0d7112d4aff9b4d8fa94648f0702371b7484031
https://code.google.com/p/webm/issues/detail?id=937
Marco [Thu, 29 Jan 2015 17:10:30 +0000 (09:10 -0800)]
Merge "Fix to vp9 denoiser."
James Zern [Thu, 29 Jan 2015 04:30:51 +0000 (20:30 -0800)]
configure: echo --(disable|enable)-* cmdline options
gives a better summary of what is enabled / disabled outside of the
automatic toolchain options.
fixes issue #936
Change-Id: I1bf27593a5512713aab1473cb606c58cf3084d62
Paul Wilkins [Tue, 27 Jan 2015 02:17:14 +0000 (18:17 -0800)]
Change to update of rate control factors.
Remove damping parameter and use the damping
formula introduced by Yaowu Xu in all cases.
Change-Id: I18db7e0d0f262d5140102f259ab07821d374d285
Yaowu Xu [Wed, 28 Jan 2015 23:12:42 +0000 (15:12 -0800)]
Simplify update_coef_probs()
1. reduce the size of temporary arrays on stack
2. avoid build_tree_distribution for tx size that is not used at all.
Change-Id: I0f8d7124e16a3789d3c15ad24cf02c1c12789e2c
Marco [Wed, 28 Jan 2015 18:24:25 +0000 (10:24 -0800)]
Fix to vp9 denoiser.
Prevent the use of a wrong mv for denoiser motion compensation.
Change-Id: Ifa0f9daabdbdab0900d3c17304059fe0d15de914
hkuang [Wed, 28 Jan 2015 20:00:34 +0000 (12:00 -0800)]
Remove duplicate code.
(issue #934).
Change-Id: Ic8adaaff87aae0b33d9b508f160b48e0ccdaaf4c
Alex Converse [Wed, 28 Jan 2015 19:22:36 +0000 (11:22 -0800)]
Merge "vp8enc: Prevent out of bounds memory access."
Frank Galligan [Wed, 28 Jan 2015 18:35:50 +0000 (10:35 -0800)]
Merge "Add vp9_sad32x32x4d_neon Neon intrinsic function."
Frank Galligan [Wed, 28 Jan 2015 07:01:44 +0000 (23:01 -0800)]
Merge "Add vp9_sad16x16x4d_neon Neon intrinsic function."
Frank Galligan [Wed, 28 Jan 2015 07:01:15 +0000 (23:01 -0800)]
Merge "Add vp9_sad64x64x4d_neon Neon intrinsic function."
Yunqing Wang [Tue, 27 Jan 2015 05:35:07 +0000 (21:35 -0800)]
Fix issues in 32bit PIC enabled build
This patch was to fix issue 924:
https://code.google.com/p/webm/issues/detail?id=924
The SECTION_RODATA macro was modified to support macho32 format.
The sub-pixel functions were modified to pass in 2 more parameters
to handle the global offsets for PIC build.
Change-Id: I3bfcd336bcae945edf300bca4ab40376a2628cd4
Alex Converse [Sat, 17 Jan 2015 00:02:05 +0000 (16:02 -0800)]
vp8enc: Prevent out of bounds memory access.
Prevent out of bounds access when attempting to increase frame size
Change-Id: I710c40c692802a72963c9680c2125da17f9060a9
Yaowu Xu [Tue, 27 Jan 2015 20:42:13 +0000 (12:42 -0800)]
Merge "move clear_system_state() call before using double"
Johann [Tue, 27 Jan 2015 18:49:26 +0000 (10:49 -0800)]
Merge "Fix discovery of Darwin SDKs"
Frank Galligan [Sun, 25 Jan 2015 00:28:20 +0000 (16:28 -0800)]
Add vp9_sad32x32x4d_neon Neon intrinsic function.
On Nexus 7 speed -6 saw ~18% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: I70ccdea0326750552ed946fb004507d6efe02d5c
Frank Galligan [Sat, 24 Jan 2015 23:43:36 +0000 (15:43 -0800)]
Add vp9_sad16x16x4d_neon Neon intrinsic function.
On Nexus 7 speed -6 saw ~15% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: I4b2006b644c488f42bf06d8a22ef0e6120a96bf9
Frank Galligan [Sat, 24 Jan 2015 20:11:16 +0000 (12:11 -0800)]
Add vp9_sad64x64x4d_neon Neon intrinsic function.
On Nexus 7 speed -6 saw ~30% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: Id12af7d1883243c23e6692e898aea82299633d58
Marco [Tue, 27 Jan 2015 03:47:13 +0000 (19:47 -0800)]
Merge "aq-mode=3: Update to allow for refresh on modes other than zero-mv."
Lawrence Velázquez [Sat, 24 Jan 2015 03:30:23 +0000 (22:30 -0500)]
Fix discovery of Darwin SDKs
The current method doesn't work with Xcode 4 and up, since they no
longer have a $DEVELOPER_DIR/SDKs directory. Using xcrun and xcodebuild
works all the way back to Xcode 3 on OS X 10.6 Snow Leopard, if not
earlier.
Change-Id: I7126f2fb4a8f1d6e46f921e70bbd090f00ce3d36
Yaowu Xu [Mon, 26 Jan 2015 23:29:15 +0000 (15:29 -0800)]
move clear_system_state() call before using double
Floating point is used in vp9_convert_qindex_to_q(), so the unit test
ActiveMapTest would sometimes hit a run-time error unless
clear_system_state() is called first to reset the register state.
Change-Id: I181e9395148c44a6ca8b97d6e109bd4a152143c6
Paul Wilkins [Tue, 27 Jan 2015 02:19:09 +0000 (18:19 -0800)]
Merge "Adjust active maxq for GF groups."
Yaowu Xu [Tue, 27 Jan 2015 00:52:30 +0000 (16:52 -0800)]
Merge "Fix MSVC warnings on conversion from int64 to int"
Marco [Thu, 22 Jan 2015 00:09:13 +0000 (16:09 -0800)]
aq-mode=3: Update to allow for refresh on modes other than zero-mv.
Add distortion threshold condition to refresh state of a coding block,
and allow for qp adjustment also for some intra modes and non-zero motion modes.
Also some code cleanup (remove unused variables/code).
Change-Id: I735fa2b28bc64f60e0323976b82510577b074203
Tom Finegan [Mon, 26 Jan 2015 23:10:18 +0000 (15:10 -0800)]
Merge "iosbuild.sh: Increase build speed."
Paul Wilkins [Tue, 20 Jan 2015 23:23:57 +0000 (15:23 -0800)]
Adjust active maxq for GF groups.
Currently disabled by default: enabled using
#define GROUP_ADAPTIVE_MAXQ
In this patch the active max Q is adjusted for each GF
group based on the vbr bit allocation and raw first pass
group error.
This will tend to give a lower q for easy sections
and a higher value for very hard sections. As such it is
expected to improve quality in some of the easier
sections where quality issues have been reported.
This change tends to hurt overall psnr but help
average psnr. SSIM also shows a small gain.
Average results for the derf, yt, std-hd and yt-hd test sets were
as follows (% change for average psnr, overall psnr and ssim):
derf    +0.291, -0.252, -0.021
yt      +6.466, -1.436, +0.552
std-hd  +0.490, +0.014, +0.380
yt-hd   +5.565, -1.573, +0.099
Change-Id: Icc015499cebbf2a45054a05e8e31f3dfb43f944a
Yaowu Xu [Mon, 26 Jan 2015 18:54:06 +0000 (10:54 -0800)]
Fix MSVC warnings on conversion from int64 to int
Change-Id: I7e96509ffa36899fcd2935749927a1e8aac8d025
Frank Galligan [Fri, 16 Jan 2015 03:29:46 +0000 (19:29 -0800)]
Add Neon intrinsic vp9_fdct8x8_quant_neon
On Nexus 7 speed -5 got ~2%, -6 got ~15%, -7 and -8 got ~30%
increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: I83246d63b96674d170098a572fa4fe28a05aaf51
Yaowu Xu [Sat, 24 Jan 2015 05:12:18 +0000 (21:12 -0800)]
Merge "Replace divide with look-up"
James Zern [Fri, 23 Jan 2015 22:13:51 +0000 (14:13 -0800)]
x86: correct OSXSAVE + AVX bit check
the result should have both bits set; previously this was converted from
webp incorrectly and resulted in a boolean check...
Change-Id: I2a7c7f2b491945f3a536ab4fca02247eccc892b8
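The difference between the two checks can be sketched as follows. In
CPUID leaf 1, ECX bit 27 is OSXSAVE and bit 28 is AVX. The function names
and the exact shape of the "before" check are assumptions for
illustration; the commit message does not quote the original expression.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* CPUID leaf 1, ECX: bit 27 = OSXSAVE, bit 28 = AVX. */
#define OSXSAVE_AVX_MASK (BIT(27) | BIT(28))

/* Assumed shape of the incorrect check: a plain truth test on the masked
 * value passes when EITHER bit is set. */
static int has_osxsave_and_avx_loose(uint32_t ecx) {
  assert(OSXSAVE_AVX_MASK != 0);
  return (ecx & OSXSAVE_AVX_MASK) != 0;
}

/* Corrected check: the masked value must equal the mask, so BOTH bits
 * have to be set. */
static int has_osxsave_and_avx(uint32_t ecx) {
  return (ecx & OSXSAVE_AVX_MASK) == OSXSAVE_AVX_MASK;
}
```

With only the AVX bit set, the loose form reports support while the
corrected form does not, which is the case the fix addresses.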
Jingning Han [Fri, 23 Jan 2015 19:47:15 +0000 (11:47 -0800)]
Format fixes in vp9_rd_pick_inter_mode_sb/sub8x8
Add parentheses to bit operations.
Change-Id: I095d601f0631d055adc4b3a8fde70c9cbae9e749
JackyChen [Fri, 23 Jan 2015 19:08:16 +0000 (11:08 -0800)]
Merge "SSE2 code for the filter in MFQE."
Adrian Grange [Fri, 23 Jan 2015 17:57:03 +0000 (09:57 -0800)]
Merge "Remove elevate_newmv_thresh from SPEED_FEATURES (unused)"
Yaowu Xu [Thu, 22 Jan 2015 23:27:43 +0000 (15:27 -0800)]
Replace divide with look-up
This commit replaces an integer divide with a table lookup. This
improves decoding speed and, at the same time, reduces possible
complications with a bug in AMD Family 12h processors:
"665 Integer Divide Instruction May Cause Unpredictable Behavior"
Change-Id: I678b707a538798a923850bac467e66e847e6def7
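The general technique can be sketched as below. This is a minimal
illustration under assumptions: the constants, the table, and all
function names are hypothetical, and the commit message does not say
which divide was replaced.

```c
#include <assert.h>

/* Hypothetical parameters, chosen only for illustration. */
#define COUNT_SAT 20
#define MAX_UPDATE_FACTOR 128

/* Reference version: one integer divide per call. */
static int update_factor_div(int count) {
  if (count > COUNT_SAT) count = COUNT_SAT;
  return MAX_UPDATE_FACTOR * count / COUNT_SAT;
}

/* Precomputed quotients for every possible (saturated) input. */
static int factor_lut[COUNT_SAT + 1];

/* Fills the table once, so the divide runs COUNT_SAT + 1 times in total
 * instead of once per call. */
static void init_factor_lut(void) {
  int i;
  for (i = 0; i <= COUNT_SAT; ++i)
    factor_lut[i] = MAX_UPDATE_FACTOR * i / COUNT_SAT;
}

/* Lookup version: a clamp and an array read, no divide. */
static int update_factor_lut(int count) {
  assert(count >= 0);
  if (count > COUNT_SAT) count = COUNT_SAT;
  return factor_lut[count];
}
```

The approach only works when the divisor is fixed and the dividend has a
small bounded range, so the table stays tiny. A table of literal
constants would avoid even the one-time divides in init_factor_lut().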
Johann [Fri, 23 Jan 2015 16:43:15 +0000 (08:43 -0800)]
Merge "Revert "Merge branch 'frame-parallel' to enable frame parallel decode in master branch.""
Johann [Fri, 23 Jan 2015 16:42:02 +0000 (08:42 -0800)]
Revert "Merge branch 'frame-parallel' to enable frame parallel decode in master branch."
This reverts commit
bde04ce5039cbcf86c8b34bdb4127e18d7e1d0c7
Change-Id: I053dae04c761b04a36dc239558503905a14d2470
James Zern [Fri, 23 Jan 2015 04:04:07 +0000 (20:04 -0800)]
Merge "workaround stack bashing by asm on 32-bit OpenBSD"
hkuang [Fri, 23 Jan 2015 02:19:04 +0000 (18:19 -0800)]
Merge branch 'master' of ssh://gerrit.chromium.org:29418/webm/libvpx
* 'master' of ssh://gerrit.chromium.org:29418/webm/libvpx:
Add libvpx build targets for OS X 10.10 Yosemite.
hkuang [Wed, 21 Jan 2015 22:51:08 +0000 (14:51 -0800)]
Merge branch 'frame-parallel' to enable frame parallel decode in master branch.
In frame parallel decode, the libvpx decoder decodes several frames on all
CPUs in parallel. If it is not flushed, it returns a frame only when all
the CPUs are busy. If it is flushed, it returns all the frames in the
decoder. In contrast to the current serial decode mode, in which the
libvpx decoder is idle between decode calls, the frame parallel decoder
stays busy between decode calls. VP9 frame parallel decode is >30% faster
than serial decode with tile parallel threading, which makes it easier for
devices to play 1080P VP9 videos.
* frame-parallel:
Add error handling for frame parallel decode and unit test for that.
Fix a bug in frame parallel decode and add a unit test for that.
Add two test vectors to test frame parallel decode.
Add key frame seeking to webmdec and webm_video_source.
Implement frame parallel decode for VP9.
Increase the thread test range to cover 5, 6, 7, 8 threads.
Fix a bug in adding frame parallel unit test.
Add VP9 frame-parallel unit test.
Manually pick "Make the api behavior conform to api spec." from master branch.
Move vp9_dec_build_inter_predictors_* to decoder folder.
Add segmentation map array for current and last frame segmentation.
Include the right header for VP9 worker thread.
Move vp9_thread.* to common.
ctrl_get_reference does not need user_priv.
Seperate the frame buffers from VP9 encoder/decoder structure.
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
Conflicts:
test/codec_factory.h
test/decode_test_driver.cc
test/decode_test_driver.h
test/invalid_file_test.cc
test/test-data.sha1
test/test.mk
test/test_vectors.cc
vp8/vp8_dx_iface.c
vp9/common/vp9_alloccommon.c
vp9/common/vp9_entropymode.c
vp9/common/vp9_loopfilter_thread.c
vp9/common/vp9_loopfilter_thread.h
vp9/common/vp9_mvref_common.c
vp9/common/vp9_onyxc_int.h
vp9/common/vp9_reconinter.c
vp9/decoder/vp9_decodeframe.c
vp9/decoder/vp9_decodeframe.h
vp9/decoder/vp9_decodemv.c
vp9/decoder/vp9_decoder.c
vp9/decoder/vp9_decoder.h
vp9/encoder/vp9_encoder.c
vp9/encoder/vp9_pickmode.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_cx_iface.c
vp9/vp9_dx_iface.c
Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64
Johann [Fri, 23 Jan 2015 02:11:21 +0000 (18:11 -0800)]
Merge "Add libvpx build targets for OS X 10.10 Yosemite."
Adrian Grange [Thu, 22 Jan 2015 22:53:18 +0000 (14:53 -0800)]
Remove elevate_newmv_thresh from SPEED_FEATURES (unused)
Change-Id: I78ef7f89586a329787f6bc4c58ec83af210989a3
Lawrence Velázquez [Thu, 22 Jan 2015 21:46:02 +0000 (16:46 -0500)]
Add libvpx build targets for OS X 10.10 Yosemite.
Change-Id: I5baa4405e0b52fd3b6f312bd2dc94b19e6ff3da7
Tom Finegan [Thu, 22 Jan 2015 23:18:30 +0000 (15:18 -0800)]
iosbuild.sh: Increase build speed.
Disable more stuff to speed up the build, and log default configure
args in verbose mode.
Change-Id: I40e55fc5e8d2bff0262e1d6bd4a40ee2c10d2b6d
Marco [Tue, 6 Jan 2015 01:13:13 +0000 (17:13 -0800)]
Modify variance partition selection for low resolutions.
For low spatial resolutions: bias partition selection to smaller block sizes,
and base the variance computation on 4x4 down-sampling.
Also move the threshold computations into choose_partitioning,
so they are computed once for each sb block.
On low-res clips (RTC_derf) PSNR/SSIM metrics increase by about 4-5%.
No change for resolutions above CIF.
Change-Id: I93f8ff742c8044786977bb6e31dcf8efda6dd1b0
Paul Wilkins [Thu, 22 Jan 2015 16:28:19 +0000 (08:28 -0800)]
Merge "Bug when last group before forced key frame is short."