Marco Paniconi [Fri, 13 Jan 2017 19:07:56 +0000 (19:07 +0000)]
Merge "vp9: Adjust threshold for copy partition, for speed=8."
Marco [Fri, 13 Jan 2017 18:27:51 +0000 (10:27 -0800)]
vp9: Adjust threshold for copy partition, for speed=8.
Change-Id: I4799cb2b67d911ee385e6d6992c61633ca77e69d
Jingning Han [Fri, 13 Jan 2017 18:25:16 +0000 (18:25 +0000)]
Merge "Rework 8x8 transpose SSSE3 for avg computation"
Jingning Han [Fri, 13 Jan 2017 18:25:09 +0000 (18:25 +0000)]
Merge "Rework 8x8 transpose SSSE3 for inverse 2D-DCT"
Marco Paniconi [Fri, 13 Jan 2017 06:22:52 +0000 (06:22 +0000)]
Merge "vp9: Update threshold for partition copy."
Jerome Jiang [Fri, 13 Jan 2017 01:59:22 +0000 (17:59 -0800)]
vp9: Update threshold for partition copy.
Avoid many visual artifacts. Compression quality is improved by more
than 1%. Encoding is about 4% faster for QVGA and 6% faster for VGA on
Android.
Change-Id: I4dd0a81429ddf7efdef1e80a191da5fb8de8e8af
Johann [Thu, 12 Jan 2017 23:40:14 +0000 (15:40 -0800)]
Merge remote-tracking branch 'origin/longtailedduck'
Jingning Han [Thu, 12 Jan 2017 23:15:14 +0000 (15:15 -0800)]
Rework 8x8 transpose SSSE3 for avg computation
Use same transpose process as inv_txfm_sse2 does.
Change-Id: I2db05f0b254628a11f621c4c09abb89501ba6d3c
Jingning Han [Thu, 12 Jan 2017 23:06:30 +0000 (15:06 -0800)]
Rework 8x8 transpose SSSE3 for inverse 2D-DCT
Use same transpose process as inv_txfm_sse2 does.
Change-Id: Ic4827825bd174cba57a0a80e19bf458a648e7d94
Peter Boström [Thu, 12 Jan 2017 20:21:15 +0000 (15:21 -0500)]
Add decoder getters for the last quantizer.
To be used for frame stats output of vpxdec.
Change-Id: I0739a01bd3635c4b3fedd58f3e27363ce8fb1b1e
Johann [Fri, 30 Dec 2016 00:31:22 +0000 (16:31 -0800)]
Release v1.6.1 Long Tailed Duck
Change-Id: If27447472417c7ed34238295427ddb9da0561725
Marco Paniconi [Thu, 12 Jan 2017 17:54:41 +0000 (17:54 +0000)]
Merge "vp9: Make the denoiser work with spatial SVC."
Johann Koenig [Thu, 12 Jan 2017 01:02:58 +0000 (01:02 +0000)]
Merge "Create a class for buffers used in tests"
Peter Boström [Wed, 11 Jan 2017 21:05:47 +0000 (21:05 +0000)]
Merge "Add Y,U,V channel metrics and unweighted metrics."
Jerome Jiang [Wed, 11 Jan 2017 20:50:42 +0000 (20:50 +0000)]
Merge "vp9: Turn on the partition copy for speed 8. Tune threshold."
Johann Koenig [Wed, 11 Jan 2017 20:22:27 +0000 (20:22 +0000)]
Merge "arm idct16x16: remove extra config guards"
Peter Boström [Wed, 11 Jan 2017 17:28:03 +0000 (12:28 -0500)]
Add Y,U,V channel metrics and unweighted metrics.
Renames SSIM to VpxSSIM as an upscaled weighted SSIM metric, then prints
Y, U and V channels unweighted as well as a weighted but not scaled SSIM
score that's 8/1/1 parts Y/U/V (same as VpxSSIM).
Change-Id: Iff800cc8f145314eeb1a9b4af1e11a25bec095ca
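A minimal sketch of the 8/1/1 Y/U/V weighting described in this commit. The function name is hypothetical and is not taken from libvpx; the upscaling applied by VpxSSIM is not shown because the commit message does not specify it.

```c
/* Hypothetical helper, not part of libvpx: combine per-plane SSIM scores
 * using the 8/1/1 Y/U/V weighting mentioned in the commit message.
 * The result is weighted but not scaled, so it stays in the same range
 * as the per-plane inputs. */
double weighted_ssim_yuv(double ssim_y, double ssim_u, double ssim_v) {
  return (8.0 * ssim_y + 1.0 * ssim_u + 1.0 * ssim_v) / 10.0;
}
```

With this weighting, a drop in the Y score moves the combined score eight times as much as the same drop in U or V.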
Jingning Han [Wed, 11 Jan 2017 19:28:39 +0000 (19:28 +0000)]
Merge "Rework forward 8x8 2D-DCT ssse3 implementation"
Jerome Jiang [Tue, 10 Jan 2017 20:43:22 +0000 (12:43 -0800)]
vp9: Turn on the partition copy for speed 8. Tune threshold.
For speed 8, it speeds up the encoding on android by 6% for QVGA and
7.4% for VGA with the new threshold. Overall PSNR is improved by 0.667
for rtc.
Change-Id: I4a644560b32c0b5b4e9f49ffb953d000413a3732
Johann [Wed, 11 Jan 2017 18:17:14 +0000 (10:17 -0800)]
arm idct16x16: remove extra config guards
This file is guarded by HAVE_NEON_ASM in the .mk file now.
Change-Id: I513a621c234aa90ad52e426c8ed494d8a7d4b74a
Johann [Mon, 24 Oct 2016 19:17:51 +0000 (12:17 -0700)]
Create a class for buffers used in tests
Demonstrate its use with the IDCT test.
Change-Id: Idf87fe048847c180f13818fd4df916ba4500134b
hui su [Wed, 11 Jan 2017 00:37:59 +0000 (16:37 -0800)]
Add "Large" label to VP9 target level tests
Also reduce the number of test frames.
Change-Id: Iea6fa93ca6b924535aef7bf8b388db4d0ec84c08
Marco [Wed, 21 Dec 2016 22:33:21 +0000 (14:33 -0800)]
vp9: Make the denoiser work with spatial SVC.
If enabled, the denoiser will only denoise the top spatial layer for now.
Added unittest for SVC with denoising.
Change-Id: Ifa373771c4ecfa208615eb163cc38f1c22c6664b
Jingning Han [Mon, 9 Jan 2017 22:00:29 +0000 (14:00 -0800)]
Rework forward 8x8 2D-DCT ssse3 implementation
This commit reworks the SSSE3 implementation of the forward 8x8
2D-DCT. It uses a cyclic rotation approach to the temporary xmm
registers. It reduces the average cycles from 158 to 154. The SSE2
version uses 169 cycles.
Change-Id: I1b79b9642aae0ed3fb3cefb5b70246e6de5d5caa
Marco [Tue, 10 Jan 2017 00:38:49 +0000 (16:38 -0800)]
vp9: 1 pass cbr: Adjustments to usage of gf_cbr_boost and aq=3 mode.
When aq=3 mode is on and the gf_cbr_boost is set: make sure golden frame
is always refreshed, and don't incorporate segment cost in qp setting
on the boosted golden frame.
Better performance on RTC set with gf_cbr_boost on,
for example with gf_cbr_boost=50, gains from ~0.5-3%.
Change-Id: Ie811f5e4d444ff3320bd6e2c1745b2c4c09a8460
Jerome Jiang [Tue, 10 Jan 2017 00:51:09 +0000 (00:51 +0000)]
Merge "vp9: Set less aggressive short_circuit_low_temp_var for HD at speed 8."
Jerome Jiang [Mon, 9 Jan 2017 23:04:13 +0000 (15:04 -0800)]
vp9: Set less aggressive short_circuit_low_temp_var for HD at speed 8.
Quality improved by 1.866 and 0.386 for two noisy clips (dark720p and
marcooffice720p), respectively.
Change-Id: Ib33a7672ae9ca53da156208f7cd13f07b5543e44
Jerome Jiang [Mon, 9 Jan 2017 23:53:41 +0000 (23:53 +0000)]
Merge "Fix compile warnings for target=armv7-android-gcc"
James Zern [Mon, 9 Jan 2017 23:52:29 +0000 (23:52 +0000)]
Merge "Refine 8-bit 16x16 idct NEON intrinsics"
Marco Paniconi [Mon, 9 Jan 2017 23:30:32 +0000 (23:30 +0000)]
Merge "vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used."
Marco [Mon, 9 Jan 2017 21:03:50 +0000 (13:03 -0800)]
vp9: Fix comment in speed features.
Change-Id: I65d79c06b152922d725bf559adaa508f91cd5766
Marco [Mon, 9 Jan 2017 20:46:01 +0000 (12:46 -0800)]
vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used.
Avoid the qp-clamping on gf/alt frame if gf_cbr_boost_pct is set.
Change only affects CBR mode when gf_cbr_boost_pct is set.
Change-Id: I0655ed4f2b047c8ed1ed33a070c17960ad776704
Johann Koenig [Mon, 9 Jan 2017 19:53:15 +0000 (19:53 +0000)]
Merge "postproc: vpx_mbpost_proc_down_neon"
Johann Koenig [Mon, 9 Jan 2017 19:49:02 +0000 (19:49 +0000)]
Merge "Add mips dspr2 partial idct tests"
Johann Koenig [Mon, 9 Jan 2017 19:47:47 +0000 (19:47 +0000)]
Merge "Fix mips dspr2 idct32x32 functions for large coefficient input"
Johann Koenig [Mon, 9 Jan 2017 19:47:00 +0000 (19:47 +0000)]
Merge "Fix mips dspr2 idct16x16 functions for large coefficient input"
Johann Koenig [Mon, 9 Jan 2017 19:46:18 +0000 (19:46 +0000)]
Merge "Fix mips dspr2 idct8x8 functions for large coefficient input"
Johann Koenig [Mon, 9 Jan 2017 19:45:53 +0000 (19:45 +0000)]
Merge "Fix mips dspr2 idct4x4 functions for large coefficient input"
Johann Koenig [Mon, 9 Jan 2017 19:39:13 +0000 (19:39 +0000)]
Merge "Add mips dspr2 vp9 intrapred tests"
Johann [Thu, 22 Dec 2016 18:04:42 +0000 (10:04 -0800)]
postproc: vpx_mbpost_proc_down_neon
This was much more amenable to optimization than the across filter.
Speedup of almost 2.5x
BUG=webm:1320
Change-Id: I49acc0f9cb2e7642303df90132cbc938acade4c4
Johann Koenig [Mon, 9 Jan 2017 18:17:26 +0000 (18:17 +0000)]
Merge "postproc: vpx_mbpost_proc_across_ip_neon"
Marco Paniconi [Mon, 9 Jan 2017 17:23:12 +0000 (17:23 +0000)]
Merge "vp9: 1 pass cbr mode: increase threshold for gf_cbr_boost_pct usage."
Kaustubh Raste [Mon, 9 Jan 2017 12:00:16 +0000 (17:30 +0530)]
Add mips dspr2 partial idct tests
Change-Id: Idf4003ea6f9a2a42a9f26e156bee73697acb7a37
Kaustubh Raste [Mon, 9 Jan 2017 11:51:09 +0000 (17:21 +0530)]
Fix mips dspr2 idct32x32 functions for large coefficient input
Change-Id: If9da7099f226a27a09cc9e2899eb66a1158909d2
Kaustubh Raste [Mon, 9 Jan 2017 11:05:28 +0000 (16:35 +0530)]
Fix mips dspr2 idct16x16 functions for large coefficient input
Change-Id: I9be3d3d040837f658c6314606e28db8c31092a1a
Kaustubh Raste [Mon, 9 Jan 2017 10:52:19 +0000 (16:22 +0530)]
Fix mips dspr2 idct8x8 functions for large coefficient input
Change-Id: If011dd923bbe976589735d5aa1c3167dda1a3b61
Kaustubh Raste [Mon, 9 Jan 2017 09:58:30 +0000 (15:28 +0530)]
Fix mips dspr2 idct4x4 functions for large coefficient input
Change-Id: I06730eec80ca81e0b7436d26232465b79f447e89
Kaustubh Raste [Mon, 9 Jan 2017 08:41:57 +0000 (14:11 +0530)]
Add mips dspr2 vp9 intrapred tests
Change-Id: I6be8c59ee220af0597bc2d7213f2779ac2e88db9
Linfeng Zhang [Sat, 7 Jan 2017 01:52:07 +0000 (17:52 -0800)]
Refine 8-bit 16x16 idct NEON intrinsics
Speed test shows 25% gain on vpx_idct16x16_256_add_neon(),
and vpx_idct16x16_10_add_neon() got tripled.
Change-Id: If8518d9b6a3efab74031297b8d40cd83c4a49541
Hui Su [Sat, 7 Jan 2017 00:55:41 +0000 (00:55 +0000)]
Merge "Add support for VP9 level targeting"
Johann [Wed, 21 Dec 2016 22:19:25 +0000 (14:19 -0800)]
postproc: vpx_mbpost_proc_across_ip_neon
The speedup is pretty poor. I would be concerned except the SSE2 is
worse:
Existing SSE2 improvement: 22%
New neon improvement: 35%
BUG=webm:1320
Change-Id: Ied598a261134aa6cbe69f96f58589d2bae17bf62
Marco [Fri, 6 Jan 2017 23:28:21 +0000 (15:28 -0800)]
vp9: 1 pass cbr mode: increase threshold for gf_cbr_boost_pct usage.
Increase the boost threshold below which GOLDEN update will use same
rate correction factor as INTER_NORMAL.
Improves performance when gf_cbr_boost_pct is set (between 0 and 100)
in CBR mode.
Change-Id: I9f54cc18664786a100b13a416b7137ae03bd0cab
Jerome Jiang [Fri, 6 Jan 2017 22:38:39 +0000 (22:38 +0000)]
Merge "vp9: Enable more aggressive short circuit for speed 8."
Marco Paniconi [Fri, 6 Jan 2017 22:34:49 +0000 (22:34 +0000)]
Merge "vp9: Add some controls to sample encoder: vpx_temporal_svc_encoder"
Jerome Jiang [Fri, 6 Jan 2017 21:57:27 +0000 (21:57 +0000)]
Merge "vp9: Compute source sad for every superblock when partition copy is on."
Marco [Fri, 6 Jan 2017 19:28:31 +0000 (11:28 -0800)]
vp9: Add some controls to sample encoder: vpx_temporal_svc_encoder
Add the gf boost and frame_parallel controls.
Set as default to off.
Change-Id: Id85fcb16a4fae97f51c09e9ebadb5cdcd510c2f5
Jerome Jiang [Fri, 6 Jan 2017 18:06:37 +0000 (10:06 -0800)]
vp9: Enable more aggressive short circuit for speed 8.
Set short_circuit_low_temp_var to 3 for speed 8 for all res.
No strong visual difference on all clips.
Change-Id: Ia6d9a314291ab1c14d5421bbdd769974083aeb2a
hui su [Fri, 2 Dec 2016 18:11:33 +0000 (10:11 -0800)]
Add support for VP9 level targeting
Constraints on encoder config:
-target_bandwidth is no larger than 80% of level bitrate limit
-target_bandwidth * (1 + max_over_shoot_pct) is no larger than
88% of level bitrate limit
-min_gf_interval is no smaller than level limit
-tile_columns is no larger than level limit
Constraints on rate control:
-current frame size plus previous three frames' size is no larger
than the CPB level limit
-current frame size is no larger than 50%/40%/20% of the CPB
level limit if it's a key/alt-ref/other frame.
Change-Id: I84d1a2d6d6e3c82bfd533b3309ce999cfaba2c8b
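The constraints listed in this commit can be sketched as two checks. All struct, field, and function names below are hypothetical and are not libvpx's actual types; the sketch also assumes max_over_shoot_pct is expressed in percent (10 means 10%), which the commit message does not state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not libvpx structures. */
typedef struct {
  int64_t max_bitrate;  /* level bitrate limit */
  int min_gf_interval;  /* smallest golden-frame interval the level allows */
  int max_tile_columns; /* largest tile column count the level allows */
} LevelLimits;

typedef struct {
  int64_t target_bandwidth;
  int max_over_shoot_pct; /* assumed to be in percent */
  int min_gf_interval;
  int tile_columns;
} EncoderConfig;

typedef enum { FRAME_KEY, FRAME_ALT_REF, FRAME_OTHER } FrameKind;

/* Encoder config constraints from the commit message. */
bool config_fits_level(const EncoderConfig *cfg, const LevelLimits *lvl) {
  /* target_bandwidth is no larger than 80% of the level bitrate limit. */
  if (cfg->target_bandwidth * 100 > lvl->max_bitrate * 80) return false;
  /* target_bandwidth * (1 + max_over_shoot_pct) is no larger than 88%. */
  if (cfg->target_bandwidth * (100 + cfg->max_over_shoot_pct) >
      lvl->max_bitrate * 88)
    return false;
  if (cfg->min_gf_interval < lvl->min_gf_interval) return false;
  if (cfg->tile_columns > lvl->max_tile_columns) return false;
  return true;
}

/* Rate control constraints from the commit message. */
bool frame_size_fits_cpb(int64_t frame_size, int64_t prev_three_total,
                         FrameKind kind, int64_t cpb_limit) {
  /* 50% / 40% / 20% of the CPB limit for key / alt-ref / other frames. */
  const int pct = (kind == FRAME_KEY) ? 50 : (kind == FRAME_ALT_REF) ? 40 : 20;
  /* Current frame plus the previous three must fit in the CPB limit. */
  if (frame_size + prev_three_total > cpb_limit) return false;
  return frame_size * 100 <= cpb_limit * pct;
}
```

Integer arithmetic scaled by 100 is used here so the percentage comparisons avoid floating-point rounding; this is a design choice of the sketch, not something the commit describes.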
Jerome Jiang [Thu, 5 Jan 2017 00:19:42 +0000 (16:19 -0800)]
vp9: Compute source sad for every superblock when partition copy is on.
The source sad could be used to copy the partition without going into
choose_partitioning function to speed up vp9 encoding. Computing source
sad takes little time. Speed test on Android and Linux shows little
encoding time gain (less than 1.4%).
Turned off for now since partition copy is turned off.
Change-Id: I61c9d5b8f22329760cb29a4ee30a7f9c232ce8d3
Linfeng Zhang [Fri, 6 Jan 2017 16:47:22 +0000 (16:47 +0000)]
Merge "Add high bitdepth 8x8 idct NEON intrinsics"
Linfeng Zhang [Fri, 6 Jan 2017 01:16:18 +0000 (01:16 +0000)]
Merge "Clean DC only idct NEON intrinsics"
Jerome Jiang [Wed, 4 Jan 2017 19:22:51 +0000 (11:22 -0800)]
vp9: Set short circuit to level 3 for VGA for speed 8.
Also change the threshold_32x32 to 5/8*thresholds[1] to mitigate the
quality regression this causes on VGA clips.
Change-Id: Ia1590e91e7cb22be78d5b85013387bb1be4272e3
Marco Paniconi [Wed, 4 Jan 2017 17:24:08 +0000 (17:24 +0000)]
Merge "vp9: 1 pass cbr: allow noise estimation down to 360p."
Marco [Wed, 4 Jan 2017 00:01:05 +0000 (16:01 -0800)]
vp9: 1 pass cbr: allow noise estimation down to 360p.
Also adjust some thresholds for noise level setting.
Change-Id: I7e03d7057ef2061c9447728deb9c6aff5d3da4b7
Marco [Wed, 21 Dec 2016 20:53:51 +0000 (12:53 -0800)]
vp9: SVC unittests: fix to use y4m source.
Comment out check on buffer underrun, as it currently fails
on some of the svc tests.
Also cast the update of bits_in_buffer_model_, as this can
go negative now due to the buffer underrun.
This fixes the issue in #1352.
BUG=webm:1350
BUG=webm:1352
Change-Id: Ibd4ef23921daf09e5c15b000aca904aa4573599c
Yunqing Wang [Tue, 3 Jan 2017 17:46:15 +0000 (17:46 +0000)]
Merge "Fix for out of range motion vector bug in joint motion search"
Ranjit Kumar Tulabandu [Wed, 21 Dec 2016 09:42:17 +0000 (15:12 +0530)]
Fix for out of range motion vector bug in joint motion search
Clamped the initial mv in vp9_refining_search_8p_c.
BUG=webm:1354
Change-Id: I47d302b350937e3e6e52e95c983b5fb0b4c64fba
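A minimal sketch of clamping a motion vector to a search range before refinement starts. The types and the function name are hypothetical and do not reproduce the actual code in vp9_refining_search_8p_c.

```c
/* Hypothetical types for illustration; not libvpx's MV structures. */
typedef struct {
  int row;
  int col;
} Mv;

typedef struct {
  int row_min, row_max;
  int col_min, col_max;
} MvLimits;

static int clamp_int(int value, int low, int high) {
  return value < low ? low : (value > high ? high : value);
}

/* Force the starting vector inside the allowed range so that later
 * offsets are computed from a valid position. */
void clamp_initial_mv(Mv *mv, const MvLimits *limits) {
  mv->row = clamp_int(mv->row, limits->row_min, limits->row_max);
  mv->col = clamp_int(mv->col, limits->col_min, limits->col_max);
}
```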
Yunqing Wang [Thu, 29 Dec 2016 19:16:00 +0000 (19:16 +0000)]
Merge "Make sub-pixel mv search's return value consistent with the return type"
Yunqing Wang [Thu, 29 Dec 2016 17:24:24 +0000 (17:24 +0000)]
Merge "Bug fix to avoid random crashes during ARNR filtering"
Gabriel Marin [Thu, 29 Dec 2016 06:03:43 +0000 (06:03 +0000)]
Merge "Remove superfluous conditional on 'shortcut'"
Linfeng Zhang [Wed, 28 Dec 2016 21:51:44 +0000 (13:51 -0800)]
Clean DC only idct NEON intrinsics
BUG=webm:1301
Change-Id: Iffc83854218460b3f687f3774e71d45b552382a5
Linfeng Zhang [Wed, 28 Dec 2016 00:28:53 +0000 (16:28 -0800)]
Add high bitdepth 8x8 idct NEON intrinsics
BUG=webm:1301
Change-Id: I56e3bc3aab9214e2debac93796389a7194991084
Yunqing Wang [Tue, 27 Dec 2016 19:52:39 +0000 (11:52 -0800)]
Make sub-pixel mv search's return value consistent with the return type
For out-of-range cases, returned UINT_MAX instead of INT_MAX in the
sub-pixel mv search to be consistent with the "uint32_t" return type.
Change-Id: I8e206d771228c13d89bafbbe9f14722c8ecc6a7a
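A small sketch of why the sentinel should match the return type. The function names are hypothetical stand-ins, not the libvpx search functions, and the sketch assumes a platform where unsigned int is 32 bits wide so that UINT_MAX fits a uint32_t exactly.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for a sub-pixel search: returns an error score,
 * or UINT_MAX when the motion vector is out of range. */
uint32_t search_score(int mv_in_range, uint32_t score) {
  return mv_in_range ? score : UINT_MAX;
}

/* Callers test for the same sentinel the search returns. */
int is_invalid_score(uint32_t score) { return score == UINT_MAX; }
```

If the search returned INT_MAX instead, a caller checking against UINT_MAX would treat the out-of-range case as an ordinary (large) score.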
Ranjit Kumar Tulabandu [Wed, 23 Nov 2016 13:16:44 +0000 (18:46 +0530)]
Bug fix to avoid random crashes during ARNR filtering
The function 'vp9_find_best_sub_pixel_tree_pruned_more' is modified
to return INT_MAX instead of UINT32_MAX for invalid MV cases.
yunqingwang:
patch 3: rebased on top of the tree.
patch 4: The return type of vp9_find_best_sub_pixel_tree* was changed
to uint32_t to fix ubsan warnings. Changing UINT_MAX back to INT_MAX
was not quite right. Patch 4 modified vp9_temporal_filter.c to accept
uint32_t.
(Note: Inconsistency exists in vp9_find_best_sub_pixel_tree*, which
will be fixed in a separate CL.)
Change-Id: Ib1a79dc2aa41ea6335c21669c76883cdbb7e0535
Linfeng Zhang [Tue, 27 Dec 2016 17:59:27 +0000 (17:59 +0000)]
Merge "Clean idct 8x8 neon functions"
James Zern [Fri, 23 Dec 2016 22:10:13 +0000 (14:10 -0800)]
Revert "vp9: SVC unittests: fix to use y4m source."
This reverts commit f0b491a52405abb1b3dbb6b2c74dd6a4c7a7ddb1.
This change results in unsigned integer overflows (as reported by
-fsanitize=integer) in datarate_test.cc,
for many of --gtest_filter=VP9/DatarateOnePassCbrSvc.OnePassCbrSvc*:
unsigned integer overflow: 167198 - 185560 cannot be represented in type
'unsigned long'
The encoder didn't change; only the input did, with the change to
(correctly) use Y4mVideoSource, so this revert is merely masking the issue.
BUG=webm:1352
Change-Id: Iecd9a6c83b3fca67c566732a5c92d36193cc2060
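The reported overflow can be reproduced in isolation. The sketch below uses hypothetical helper names and fixed-width 64-bit types in place of 'unsigned long'; it is not the code from datarate_test.cc.

```c
#include <stdint.h>

/* Unsigned subtraction wraps modulo 2^64 when b > a. This is well defined
 * in C, but -fsanitize=integer reports it. */
uint64_t unsigned_diff(uint64_t a, uint64_t b) { return a - b; }

/* Converting to a signed type first keeps a negative result representable.
 * Assumes both inputs fit in int64_t. */
int64_t signed_diff(uint64_t a, uint64_t b) {
  return (int64_t)a - (int64_t)b;
}
```

With the values from the report, signed_diff(167198, 185560) yields -18362, while unsigned_diff wraps to a very large positive number.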
Jerome Jiang [Wed, 21 Dec 2016 00:49:42 +0000 (16:49 -0800)]
Fix compile warnings for target=armv7-android-gcc
Fix compile warnings about implicit type conversion for
target=armv7-android-gcc in vpxenc.c.
BUG=webm:1348
Change-Id: I9fbabd843512f2a1a09f4bb934cd091e834eed9c
Marco Paniconi [Thu, 22 Dec 2016 17:26:41 +0000 (17:26 +0000)]
Merge "vp9: SVC unittests: fix to use y4m source."
James Zern [Thu, 22 Dec 2016 13:20:55 +0000 (08:20 -0500)]
libs.mk/stress.sh,curl: set --retry to 1
provide some resilience for transient errors
Change-Id: I8db3d4eb5ef3cccc235a8c4c0052199c0ce23a27
Marco [Wed, 21 Dec 2016 20:53:51 +0000 (12:53 -0800)]
vp9: SVC unittests: fix to use y4m source.
Comment out check on buffer underrun, as it currently fails
on some of the svc tests.
BUG=webm:1350
Change-Id: I73c88b800cdcc06bd2f900f7b7e2a5fd08248065
Linfeng Zhang [Wed, 21 Dec 2016 22:24:17 +0000 (14:24 -0800)]
Clean idct 8x8 neon functions
BUG=webm:1301
Change-Id: I05f47dca1fddc155c8396e627cfccf6449677307
Marco [Fri, 16 Dec 2016 00:10:30 +0000 (16:10 -0800)]
vp9: 1 pass vbr: Skip find_predictors in pickmode when source is altref.
When source frame is altref, we only do zero-mv mode, so we can skip
the find_predictors(). No change in compression.
Small speed gain, ~1%.
Only affects 1 pass vbr with lookahead altref, for ytlive with
the macro flag USE_ALTREF_FOR_ONE_PASS on.
Change-Id: I9318c5da8521f017bf54919cd652438b3a6313d1
Marco Paniconi [Wed, 21 Dec 2016 19:38:00 +0000 (19:38 +0000)]
Merge "vp9: Fix to unittest for high noise."
Marco [Wed, 21 Dec 2016 18:19:44 +0000 (10:19 -0800)]
vp9: Fix to unittest for high noise.
Source is y4m, and fix comment.
Change-Id: I1eb84977d42dd0f9009c276b56b3fdb03949bfc2
Marco Paniconi [Wed, 21 Dec 2016 03:56:10 +0000 (03:56 +0000)]
Merge "vp9: Add datarate test for denoiser, for high noise case."
Marco [Mon, 19 Dec 2016 22:07:49 +0000 (14:07 -0800)]
vp9: Add datarate test for denoiser, for high noise case.
Also breakout the denoiser tests, as the denoiser only
runs for real-time speed >=5.
Change-Id: I921b785860c35e9d1ebfad0833673a98490186c2
Jerome Jiang [Tue, 20 Dec 2016 21:46:43 +0000 (21:46 +0000)]
Merge "vp9: Add feature to copy partition from the last frame."
Gabriel Marin [Wed, 14 Dec 2016 19:07:50 +0000 (11:07 -0800)]
Remove superfluous conditional on 'shortcut'
Remove superfluous test. Produces a small improvement in instruction scheduling.
Measured a 1% to 1.5% reduction in execution time for routine vp9_optimize_b
with different compilers.
No change in behavior.
TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225
Change-Id: I2bf248d4c25fc0256147d7a8766ff9108ae9cba3
Kaustubh Raste [Tue, 20 Dec 2016 02:27:07 +0000 (02:27 +0000)]
Merge "Add mips msa vp9 intrapred tests"
Jerome Jiang [Mon, 19 Dec 2016 18:39:04 +0000 (10:39 -0800)]
vp9: Add feature to copy partition from the last frame.
Add feature to copy partition from the last frame.
The copy is only done when the SAD is below a threshold.
Feature is currently disabled, until threshold is tuned.
Feature will be initially used for Speed 8 (ARM).
Under extreme case of always copying partition for speed 8:
Encode time is reduced by 5.4% on rtc_derf and 7.8% on rtc.
Overall PSNR reduced by 2.1 on rtc_derf and 0.968 on rtc.
Change-Id: I1bcab515af3088e4d60675758f72613c2d3dc7a5
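A rough sketch of the decision this commit describes: compute the SAD between the current source block and the co-located block of the last frame, and reuse the last partition only when the SAD is below a threshold. Function names and the threshold handling are hypothetical; the actual conditions in libvpx are not given in the commit message.

```c
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between two blocks of 8-bit samples. */
uint32_t block_sad(const uint8_t *src, int src_stride, const uint8_t *last,
                   int last_stride, int width, int height) {
  uint32_t sad = 0;
  for (int r = 0; r < height; ++r) {
    for (int c = 0; c < width; ++c) {
      sad += (uint32_t)abs(src[r * src_stride + c] - last[r * last_stride + c]);
    }
  }
  return sad;
}

/* Hypothetical decision: copy the previous frame's partition only when the
 * block changed little, i.e. the SAD is strictly below the threshold. */
int can_copy_partition(uint32_t sad, uint32_t threshold) {
  return sad < threshold;
}
```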
Gabriel Marin [Mon, 19 Dec 2016 23:25:38 +0000 (23:25 +0000)]
Merge "Simplify address arithmetic in vp9_optimize_b"
James Zern [Mon, 19 Dec 2016 22:39:01 +0000 (22:39 +0000)]
Merge "vpx_idct32x32_1024_add_neon: quiet uninitialized warning"
Marco Paniconi [Mon, 19 Dec 2016 21:15:36 +0000 (21:15 +0000)]
Merge "vp9 denoiser: Fix the logic for re-evaluating zeromv after denoising."
Gabriel Marin [Wed, 14 Dec 2016 00:22:48 +0000 (16:22 -0800)]
Simplify address arithmetic in vp9_optimize_b
Simplify address arithmetic on token_costs to reduce the number of generated
instructions that are used for address arithmetic inside routine
vp9_optimize_b. It also helps improve instruction scheduling depending on
compiler and optimization level.
Measured a 9.3% reduction in retired instructions and 5.3% reduction in
execution time for this routine with GCC v4.8.4 and optimization flags -O3,
and a reduction of up to 11.6% in execution time with other compilers.
No change in behavior.
TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225
Change-Id: I6098650fb5cd2aa04e014fe6e68ca20761f3a21f
James Zern [Mon, 19 Dec 2016 18:51:59 +0000 (10:51 -0800)]
vpx_idct32x32_1024_add_neon: quiet uninitialized warning
relocate the assignment to 'in' outside of the for loop. this quiets a
spurious warning in visual studio builds since:
86e340c enable vpx_idct32x32_1024_add_neon in hbd builds
+ give the variable a more descriptive name
BUG=webm:1294
Change-Id: I5c3da5c7939621477e0fc0ad3a1b2a3045c5bffd
Marco [Sat, 17 Dec 2016 00:01:59 +0000 (16:01 -0800)]
vp9: With denoising on, only estimate noise level for higher resolutions.
Allow it for resolutions above 640x360 for now.
Change-Id: I087d0d8173f96b316164fdd4a499110ce2e7a233
Marco [Mon, 19 Dec 2016 17:22:44 +0000 (09:22 -0800)]
vp9 denoiser: Fix the logic for re-evaluating zeromv after denoising.
Correctly set interp_filter to SWITCHABLE for INTRA mode.
Also reduce threshold on noise level for re-evaluating zeromv.
Change-Id: Id32c01e193209fb380aa07204f0be3babf29f70a
Linfeng Zhang [Mon, 19 Dec 2016 17:09:26 +0000 (17:09 +0000)]
Merge "Clean hbd idct 4x4 neon functions and other"
Kaustubh Raste [Mon, 19 Dec 2016 11:56:17 +0000 (17:26 +0530)]
Add mips msa vp9 intrapred tests
Change-Id: I49b91464a87cad8692f4b1477e45e5f567b4fe87
Johann Koenig [Sat, 17 Dec 2016 01:12:34 +0000 (01:12 +0000)]
Merge "post proc test: add padding for sse2 tests"