platform/upstream/libvpx.git
7 years agoRemove macro MVC in mcomp.c
Yunqing Wang [Tue, 24 Jan 2017 00:57:34 +0000 (16:57 -0800)]
Remove macro MVC in mcomp.c

Removed the MVC macro so that mv_err_cost() is always called when
calculating the mv cost.

Change-Id: I28123e05fbfc2352128e266c985d2ab093940071

7 years agoMerge "vp9: Make the denoiser work with spatial SVC."
Marco Paniconi [Thu, 12 Jan 2017 17:54:41 +0000 (17:54 +0000)]
Merge "vp9: Make the denoiser work with spatial SVC."

7 years agoMerge "Create a class for buffers used in tests"
Johann Koenig [Thu, 12 Jan 2017 01:02:58 +0000 (01:02 +0000)]
Merge "Create a class for buffers used in tests"

7 years agoMerge "Add Y,U,V channel metrics and unweighted metrics."
Peter Boström [Wed, 11 Jan 2017 21:05:47 +0000 (21:05 +0000)]
Merge "Add Y,U,V channel metrics and unweighted metrics."

7 years agoMerge "vp9: Turn on the partition copy for speed 8. Tune threshold."
Jerome Jiang [Wed, 11 Jan 2017 20:50:42 +0000 (20:50 +0000)]
Merge "vp9: Turn on the partition copy for speed 8. Tune threshold."

7 years agoMerge "arm idct16x16: remove extra config guards"
Johann Koenig [Wed, 11 Jan 2017 20:22:27 +0000 (20:22 +0000)]
Merge "arm idct16x16: remove extra config guards"

7 years agoAdd Y,U,V channel metrics and unweighted metrics.
Peter Boström [Wed, 11 Jan 2017 17:28:03 +0000 (12:28 -0500)]
Add Y,U,V channel metrics and unweighted metrics.

Renames SSIM to VpxSSIM as an upscaled weighted SSIM metric, then prints
Y, U and V channels unweighted as well as a weighted but not scaled SSIM
score that's 8/1/1 parts Y/U/V (same as VpxSSIM).

Change-Id: Iff800cc8f145314eeb1a9b4af1e11a25bec095ca
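
Annotation: the 8/1/1 weighting described above can be sketched as follows. This is a minimal illustration, assuming each plane's SSIM has already been computed; the function name weighted_ssim is hypothetical and is not taken from the libvpx sources.

```c
#include <assert.h>

/* Hypothetical helper: combine per-plane SSIM scores using 8/1/1 parts
 * for Y/U/V. The result is weighted but not rescaled, so three perfect
 * planes still give 1.0. */
double weighted_ssim(double ssim_y, double ssim_u, double ssim_v) {
  /* A plain SSIM score never exceeds 1.0. */
  assert(ssim_y <= 1.0 && ssim_u <= 1.0 && ssim_v <= 1.0);
  return (8.0 * ssim_y + ssim_u + ssim_v) / 10.0;
}
```

The unweighted per-channel values would simply be printed as-is next to this combined score.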

7 years agoMerge "Rework forward 8x8 2D-DCT ssse3 implementation"
Jingning Han [Wed, 11 Jan 2017 19:28:39 +0000 (19:28 +0000)]
Merge "Rework forward 8x8 2D-DCT ssse3 implementation"

7 years agovp9: Turn on the partition copy for speed 8. Tune threshold.
Jerome Jiang [Tue, 10 Jan 2017 20:43:22 +0000 (12:43 -0800)]
vp9: Turn on the partition copy for speed 8. Tune threshold.

For speed 8, it speeds up the encoding on android by 6% for QVGA and
7.4% for VGA with the new threshold. Overall PSNR is improved by 0.667
for rtc.

Change-Id: I4a644560b32c0b5b4e9f49ffb953d000413a3732

7 years agoarm idct16x16: remove extra config guards
Johann [Wed, 11 Jan 2017 18:17:14 +0000 (10:17 -0800)]
arm idct16x16: remove extra config guards

This file is guarded by HAVE_NEON_ASM in the .mk file now.

Change-Id: I513a621c234aa90ad52e426c8ed494d8a7d4b74a

7 years agoCreate a class for buffers used in tests
Johann [Mon, 24 Oct 2016 19:17:51 +0000 (12:17 -0700)]
Create a class for buffers used in tests

Demonstrate its use with the IDCT test.

Change-Id: Idf87fe048847c180f13818fd4df916ba4500134b

7 years agoAdd "Large" label to VP9 target level tests
hui su [Wed, 11 Jan 2017 00:37:59 +0000 (16:37 -0800)]
Add "Large" label to VP9 target level tests

Also reduce the number of test frames.

Change-Id: Iea6fa93ca6b924535aef7bf8b388db4d0ec84c08

7 years agovp9: Make the denoiser work with spatial SVC.
Marco [Wed, 21 Dec 2016 22:33:21 +0000 (14:33 -0800)]
vp9: Make the denoiser work with spatial SVC.

If enabled, the denoiser will only denoise the top spatial layer for now.

Added unittest for SVC with denoising.

Change-Id: Ifa373771c4ecfa208615eb163cc38f1c22c6664b

7 years agoRework forward 8x8 2D-DCT ssse3 implementation
Jingning Han [Mon, 9 Jan 2017 22:00:29 +0000 (14:00 -0800)]
Rework forward 8x8 2D-DCT ssse3 implementation

This commit reworks the SSSE3 implementation of the forward 8x8
2D-DCT. It uses a cyclic rotation approach to the temporary xmm
registers. It reduces the average cycles from 158 to 154. The SSE2
version uses 169 cycles.

Change-Id: I1b79b9642aae0ed3fb3cefb5b70246e6de5d5caa

7 years agovp9: 1 pass cbr: Adjustments to usage of gf_cbr_boost and aq=3 mode.
Marco [Tue, 10 Jan 2017 00:38:49 +0000 (16:38 -0800)]
vp9: 1 pass cbr: Adjustments to usage of gf_cbr_boost and aq=3 mode.

When aq=3 mode is on and the gf_cbr_boost is set: make sure golden frame
is always refreshed, and don't incorporate segment cost in qp setting
on the boosted golden frame.

Better performance on the RTC set with gf_cbr_boost on;
for example, with gf_cbr_boost=50, gains range from ~0.5% to 3%.

Change-Id: Ie811f5e4d444ff3320bd6e2c1745b2c4c09a8460

7 years agoMerge "vp9: Set less aggressive short_circuit_low_temp_var for HD at speed 8."
Jerome Jiang [Tue, 10 Jan 2017 00:51:09 +0000 (00:51 +0000)]
Merge "vp9: Set less aggressive short_circuit_low_temp_var for HD at speed 8."

7 years agovp9: Set less aggressive short_circuit_low_temp_var for HD at speed 8.
Jerome Jiang [Mon, 9 Jan 2017 23:04:13 +0000 (15:04 -0800)]
vp9: Set less aggressive short_circuit_low_temp_var for HD at speed 8.

Quality improved by 1.866 and 0.386 for two noisy clips (dark720p and
marcooffice720p), respectively.

Change-Id: Ib33a7672ae9ca53da156208f7cd13f07b5543e44

7 years agoMerge "Fix compile warnings for target=armv7-android-gcc"
Jerome Jiang [Mon, 9 Jan 2017 23:53:41 +0000 (23:53 +0000)]
Merge "Fix compile warnings for target=armv7-android-gcc"

7 years agoMerge "Refine 8-bit 16x16 idct NEON intrinsics"
James Zern [Mon, 9 Jan 2017 23:52:29 +0000 (23:52 +0000)]
Merge "Refine 8-bit 16x16 idct NEON intrinsics"

7 years agoMerge "vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used."
Marco Paniconi [Mon, 9 Jan 2017 23:30:32 +0000 (23:30 +0000)]
Merge "vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used."

7 years agovp9: Fix comment in speed features.
Marco [Mon, 9 Jan 2017 21:03:50 +0000 (13:03 -0800)]
vp9: Fix comment in speed features.

Change-Id: I65d79c06b152922d725bf559adaa508f91cd5766

7 years agovp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used.
Marco [Mon, 9 Jan 2017 20:46:01 +0000 (12:46 -0800)]
vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used.

Avoid the qp-clamping on gf/alt frame if gf_cbr_boost_pct is set.

This change only affects CBR mode when gf_cbr_boost_pct is set.

Change-Id: I0655ed4f2b047c8ed1ed33a070c17960ad776704

7 years agoMerge "postproc: vpx_mbpost_proc_down_neon"
Johann Koenig [Mon, 9 Jan 2017 19:53:15 +0000 (19:53 +0000)]
Merge "postproc: vpx_mbpost_proc_down_neon"

7 years agoMerge "Add mips dspr2 partial idct tests"
Johann Koenig [Mon, 9 Jan 2017 19:49:02 +0000 (19:49 +0000)]
Merge "Add mips dspr2 partial idct tests"

7 years agoMerge "Fix mips dspr2 idct32x32 functions for large coefficient input"
Johann Koenig [Mon, 9 Jan 2017 19:47:47 +0000 (19:47 +0000)]
Merge "Fix mips dspr2 idct32x32 functions for large coefficient input"

7 years agoMerge "Fix mips dspr2 idct16x16 functions for large coefficient input"
Johann Koenig [Mon, 9 Jan 2017 19:47:00 +0000 (19:47 +0000)]
Merge "Fix mips dspr2 idct16x16 functions for large coefficient input"

7 years agoMerge "Fix mips dspr2 idct8x8 functions for large coefficient input"
Johann Koenig [Mon, 9 Jan 2017 19:46:18 +0000 (19:46 +0000)]
Merge "Fix mips dspr2 idct8x8 functions for large coefficient input"

7 years agoMerge "Fix mips dspr2 idct4x4 functions for large coefficient input"
Johann Koenig [Mon, 9 Jan 2017 19:45:53 +0000 (19:45 +0000)]
Merge "Fix mips dspr2 idct4x4 functions for large coefficient input"

7 years agoMerge "Add mips dspr2 vp9 intrapred tests"
Johann Koenig [Mon, 9 Jan 2017 19:39:13 +0000 (19:39 +0000)]
Merge "Add mips dspr2 vp9 intrapred tests"

7 years agopostproc: vpx_mbpost_proc_down_neon
Johann [Thu, 22 Dec 2016 18:04:42 +0000 (10:04 -0800)]
postproc: vpx_mbpost_proc_down_neon

This was much more amenable to optimization than the across filter.
Speedup of almost 2.5x

BUG=webm:1320

Change-Id: I49acc0f9cb2e7642303df90132cbc938acade4c4

7 years agoMerge "postproc: vpx_mbpost_proc_across_ip_neon"
Johann Koenig [Mon, 9 Jan 2017 18:17:26 +0000 (18:17 +0000)]
Merge "postproc: vpx_mbpost_proc_across_ip_neon"

7 years agoMerge "vp9: 1 pass cbr mode: increase threshold for gf_cbr_boost_pct usage."
Marco Paniconi [Mon, 9 Jan 2017 17:23:12 +0000 (17:23 +0000)]
Merge "vp9: 1 pass cbr mode: increase threshold for gf_cbr_boost_pct usage."

7 years agoAdd mips dspr2 partial idct tests
Kaustubh Raste [Mon, 9 Jan 2017 12:00:16 +0000 (17:30 +0530)]
Add mips dspr2 partial idct tests

Change-Id: Idf4003ea6f9a2a42a9f26e156bee73697acb7a37

7 years agoFix mips dspr2 idct32x32 functions for large coefficient input
Kaustubh Raste [Mon, 9 Jan 2017 11:51:09 +0000 (17:21 +0530)]
Fix mips dspr2 idct32x32 functions for large coefficient input

Change-Id: If9da7099f226a27a09cc9e2899eb66a1158909d2

7 years agoFix mips dspr2 idct16x16 functions for large coefficient input
Kaustubh Raste [Mon, 9 Jan 2017 11:05:28 +0000 (16:35 +0530)]
Fix mips dspr2 idct16x16 functions for large coefficient input

Change-Id: I9be3d3d040837f658c6314606e28db8c31092a1a

7 years agoFix mips dspr2 idct8x8 functions for large coefficient input
Kaustubh Raste [Mon, 9 Jan 2017 10:52:19 +0000 (16:22 +0530)]
Fix mips dspr2 idct8x8 functions for large coefficient input

Change-Id: If011dd923bbe976589735d5aa1c3167dda1a3b61

7 years agoFix mips dspr2 idct4x4 functions for large coefficient input
Kaustubh Raste [Mon, 9 Jan 2017 09:58:30 +0000 (15:28 +0530)]
Fix mips dspr2 idct4x4 functions for large coefficient input

Change-Id: I06730eec80ca81e0b7436d26232465b79f447e89

7 years agoAdd mips dspr2 vp9 intrapred tests
Kaustubh Raste [Mon, 9 Jan 2017 08:41:57 +0000 (14:11 +0530)]
Add mips dspr2 vp9 intrapred tests

Change-Id: I6be8c59ee220af0597bc2d7213f2779ac2e88db9

7 years agoRefine 8-bit 16x16 idct NEON intrinsics
Linfeng Zhang [Sat, 7 Jan 2017 01:52:07 +0000 (17:52 -0800)]
Refine 8-bit 16x16 idct NEON intrinsics

Speed test shows a 25% gain on vpx_idct16x16_256_add_neon(),
and the speed of vpx_idct16x16_10_add_neon() tripled.

Change-Id: If8518d9b6a3efab74031297b8d40cd83c4a49541

7 years agoMerge "Add support for VP9 level targeting"
Hui Su [Sat, 7 Jan 2017 00:55:41 +0000 (00:55 +0000)]
Merge "Add support for VP9 level targeting"

7 years agopostproc: vpx_mbpost_proc_across_ip_neon
Johann [Wed, 21 Dec 2016 22:19:25 +0000 (14:19 -0800)]
postproc: vpx_mbpost_proc_across_ip_neon

The speedup is pretty poor. I would be concerned except the SSE2 is
worse:
Existing SSE2 improvement: 22%
New neon improvement: 35%

BUG=webm:1320

Change-Id: Ied598a261134aa6cbe69f96f58589d2bae17bf62

7 years agovp9: 1 pass cbr mode: increase threshold for gf_cbr_boost_pct usage.
Marco [Fri, 6 Jan 2017 23:28:21 +0000 (15:28 -0800)]
vp9: 1 pass cbr mode: increase threshold for gf_cbr_boost_pct usage.

Increase the boost threshold below which GOLDEN update will use same
rate correction factor as INTER_NORMAL.

Improves performance when gf_cbr_boost_pct is set (between 0 and 100)
in CBR mode.

Change-Id: I9f54cc18664786a100b13a416b7137ae03bd0cab

7 years agoMerge "vp9: Enable more aggressive short circuit for speed 8."
Jerome Jiang [Fri, 6 Jan 2017 22:38:39 +0000 (22:38 +0000)]
Merge "vp9: Enable more aggressive short circuit for speed 8."

7 years agoMerge "vp9: Add some controls to sample encoder: vpx_temporal_svc_encoder"
Marco Paniconi [Fri, 6 Jan 2017 22:34:49 +0000 (22:34 +0000)]
Merge "vp9: Add some controls to sample encoder: vpx_temporal_svc_encoder"

7 years agoMerge "vp9: Compute source sad for every superblock when partition copy is on."
Jerome Jiang [Fri, 6 Jan 2017 21:57:27 +0000 (21:57 +0000)]
Merge "vp9: Compute source sad for every superblock when partition copy is on."

7 years agovp9: Add some controls to sample encoder: vpx_temporal_svc_encoder
Marco [Fri, 6 Jan 2017 19:28:31 +0000 (11:28 -0800)]
vp9: Add some controls to sample encoder: vpx_temporal_svc_encoder

Add the gf boost and frame_parallel controls.
Set as default to off.

Change-Id: Id85fcb16a4fae97f51c09e9ebadb5cdcd510c2f5

7 years agovp9: Enable more aggressive short circuit for speed 8.
Jerome Jiang [Fri, 6 Jan 2017 18:06:37 +0000 (10:06 -0800)]
vp9: Enable more aggressive short circuit for speed 8.

Set short_circuit_low_temp_var to 3 for speed 8 for all res.
No strong visual difference on all clips.

Change-Id: Ia6d9a314291ab1c14d5421bbdd769974083aeb2a

7 years agoAdd support for VP9 level targeting
hui su [Fri, 2 Dec 2016 18:11:33 +0000 (10:11 -0800)]
Add support for VP9 level targeting

Constraints on encoder config:
-target_bandwidth is no larger than 80% of level bitrate limit
-target_bandwidth * (1 + max_over_shoot_pct) is no larger than
88% of level bitrate limit
-min_gf_interval is no smaller than level limit
-tile_columns is no larger than level limit

Constraints on rate control:
-current frame size plus previous three frames' size is no larger
than the CPB level limit
-current frame size is no larger than 50%/40%/20% of the CPB
level limit if it's a key/alt-ref/other frame.

Change-Id: I84d1a2d6d6e3c82bfd533b3309ce999cfaba2c8b
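
Annotation: the constraints listed above can be written out as two predicate functions. This is a sketch under stated assumptions, not the encoder's code: the LevelLimits struct, the FrameKind enum and both function names are hypothetical, max_over_shoot_pct is assumed to be an integer percentage, and all sizes are assumed to share one unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-level limits; names are illustrative only. */
typedef struct {
  int64_t bitrate_limit;  /* same unit as target_bandwidth */
  int64_t cpb_limit;      /* same unit as the frame sizes */
  int min_gf_interval;
  int max_tile_columns;
} LevelLimits;

typedef enum { FRAME_KEY, FRAME_ALT_REF, FRAME_OTHER } FrameKind;

/* Returns 1 if the encoder config satisfies the four config constraints. */
int config_fits_level(const LevelLimits *lim, int64_t target_bandwidth,
                      int max_over_shoot_pct, int min_gf_interval,
                      int tile_columns) {
  assert(lim != NULL);
  /* target_bandwidth <= 80% of the level bitrate limit */
  if (target_bandwidth * 100 > lim->bitrate_limit * 80) return 0;
  /* target_bandwidth * (1 + overshoot) <= 88% of the level bitrate limit */
  if (target_bandwidth * (100 + max_over_shoot_pct) > lim->bitrate_limit * 88)
    return 0;
  if (min_gf_interval < lim->min_gf_interval) return 0;
  if (tile_columns > lim->max_tile_columns) return 0;
  return 1;
}

/* Returns 1 if the current frame satisfies both rate control constraints.
 * prev3_size is the summed size of the previous three frames. */
int frame_fits_cpb(const LevelLimits *lim, FrameKind kind, int64_t cur_size,
                   int64_t prev3_size) {
  int pct;
  assert(lim != NULL);
  switch (kind) {
    case FRAME_KEY: pct = 50; break;
    case FRAME_ALT_REF: pct = 40; break;
    default: pct = 20; break;
  }
  if (cur_size + prev3_size > lim->cpb_limit) return 0;
  return cur_size * 100 <= lim->cpb_limit * pct;
}
```

Integer cross-multiplication is used instead of floating point so that the boundary cases (exactly 80%, exactly 88%) compare exactly.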

7 years agovp9: Compute source sad for every superblock when partition copy is on.
Jerome Jiang [Thu, 5 Jan 2017 00:19:42 +0000 (16:19 -0800)]
vp9: Compute source sad for every superblock when partition copy is on.

The source sad could be used to copy the partition without going into
choose_partitioning function to speed up vp9 encoding. Computing source
sad takes little time. Speed test on Android and Linux shows little
encoding time gain (less than 1.4%).

Turned off for now since partition copy is turned off.

Change-Id: I61c9d5b8f22329760cb29a4ee30a7f9c232ce8d3

7 years agoMerge "Add high bitdepth 8x8 idct NEON intrinsics"
Linfeng Zhang [Fri, 6 Jan 2017 16:47:22 +0000 (16:47 +0000)]
Merge "Add high bitdepth 8x8 idct NEON intrinsics"

7 years agoMerge "Clean DC only idct NEON intrinsics"
Linfeng Zhang [Fri, 6 Jan 2017 01:16:18 +0000 (01:16 +0000)]
Merge "Clean DC only idct NEON intrinsics"

7 years agovp9: Set short circuit to level 3 for VGA for speed 8.
Jerome Jiang [Wed, 4 Jan 2017 19:22:51 +0000 (11:22 -0800)]
vp9: Set short circuit to level 3 for VGA for speed 8.

Set short circuit to level 3 for VGA for speed 8. Also change
threshold_32x32 to 5/8*thresholds[1] to reduce the resulting quality
regression on VGA clips.

Change-Id: Ia1590e91e7cb22be78d5b85013387bb1be4272e3

7 years agoMerge "vp9: 1 pass cbr: allow noise estimation down to 360p."
Marco Paniconi [Wed, 4 Jan 2017 17:24:08 +0000 (17:24 +0000)]
Merge "vp9: 1 pass cbr: allow noise estimation down to 360p."

7 years agovp9: 1 pass cbr: allow noise estimation down to 360p.
Marco [Wed, 4 Jan 2017 00:01:05 +0000 (16:01 -0800)]
vp9: 1 pass cbr: allow noise estimation down to 360p.

Also adjust some thresholds for noise level setting.

Change-Id: I7e03d7057ef2061c9447728deb9c6aff5d3da4b7

7 years agovp9: SVC unittests: fix to use y4m source.
Marco [Wed, 21 Dec 2016 20:53:51 +0000 (12:53 -0800)]
vp9: SVC unittests: fix to use y4m source.

Comment out check on buffer underrun, as it currently fails
on some of the svc tests.

Also cast the update of bits_in_buffer_model_, as this can
go negative now due to the buffer underrun.
This fixes the issue in #1352.

BUG=webm:1350
BUG=webm:1352

Change-Id: Ibd4ef23921daf09e5c15b000aca904aa4573599c

7 years agoMerge "Fix for out of range motion vector bug in joint motion search"
Yunqing Wang [Tue, 3 Jan 2017 17:46:15 +0000 (17:46 +0000)]
Merge "Fix for out of range motion vector bug in joint motion search"

7 years agoFix for out of range motion vector bug in joint motion search
Ranjit Kumar Tulabandu [Wed, 21 Dec 2016 09:42:17 +0000 (15:12 +0530)]
Fix for out of range motion vector bug in joint motion search

Clamped the initial mv in vp9_refining_search_8p_c.

BUG=webm:1354

Change-Id: I47d302b350937e3e6e52e95c983b5fb0b4c64fba
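
Annotation: clamping an initial motion vector to the allowed search range, as described above, might look like the sketch below. The MotionVector and MvLimits types and the function names are hypothetical stand-ins; the actual change inside vp9_refining_search_8p_c is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the encoder's motion vector and limit types. */
typedef struct { int16_t row, col; } MotionVector;
typedef struct { int row_min, row_max, col_min, col_max; } MvLimits;

static int clamp_int(int value, int low, int high) {
  return value < low ? low : (value > high ? high : value);
}

/* Force the starting point of a search back inside the valid range so that
 * later steps never read reference pixels outside the permitted area. */
void clamp_initial_mv(MotionVector *mv, const MvLimits *lim) {
  assert(mv != NULL && lim != NULL);
  assert(lim->row_min <= lim->row_max && lim->col_min <= lim->col_max);
  mv->row = (int16_t)clamp_int(mv->row, lim->row_min, lim->row_max);
  mv->col = (int16_t)clamp_int(mv->col, lim->col_min, lim->col_max);
}
```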

7 years agoMerge "Make sub-pixel mv search's return value consistent with the return type"
Yunqing Wang [Thu, 29 Dec 2016 19:16:00 +0000 (19:16 +0000)]
Merge "Make sub-pixel mv search's return value consistent with the return type"

7 years agoMerge "Bug fix to avoid random crashes during ARNR filtering"
Yunqing Wang [Thu, 29 Dec 2016 17:24:24 +0000 (17:24 +0000)]
Merge "Bug fix to avoid random crashes during ARNR filtering"

7 years agoMerge "Remove superfluous conditional on 'shortcut'"
Gabriel Marin [Thu, 29 Dec 2016 06:03:43 +0000 (06:03 +0000)]
Merge "Remove superfluous conditional on 'shortcut'"

7 years agoClean DC only idct NEON intrinsics
Linfeng Zhang [Wed, 28 Dec 2016 21:51:44 +0000 (13:51 -0800)]
Clean DC only idct NEON intrinsics

BUG=webm:1301

Change-Id: Iffc83854218460b3f687f3774e71d45b552382a5

7 years agoAdd high bitdepth 8x8 idct NEON intrinsics
Linfeng Zhang [Wed, 28 Dec 2016 00:28:53 +0000 (16:28 -0800)]
Add high bitdepth 8x8 idct NEON intrinsics

BUG=webm:1301

Change-Id: I56e3bc3aab9214e2debac93796389a7194991084

7 years agoMake sub-pixel mv search's return value consistent with the return type
Yunqing Wang [Tue, 27 Dec 2016 19:52:39 +0000 (11:52 -0800)]
Make sub-pixel mv search's return value consistent with the return type

For out-of-range cases, returned UINT_MAX instead of INT_MAX in the
sub-pixel mv search to be consistent with the "uint32_t" return type.

Change-Id: I8e206d771228c13d89bafbbe9f14722c8ecc6a7a
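
Annotation: a minimal sketch of why the sentinel should match the return type. The function names are hypothetical; the sketch assumes, as on the usual targets, that unsigned int is 32 bits wide so that UINT_MAX fits uint32_t exactly.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical search result: a distortion value, or UINT_MAX when the
 * motion vector is out of range. Using UINT_MAX (not INT_MAX) keeps the
 * sentinel consistent with the uint32_t return type. */
uint32_t subpel_search_error(int mv_in_range, uint32_t distortion) {
  /* A real distortion must not collide with the sentinel. */
  assert(!mv_in_range || distortion != UINT_MAX);
  return mv_in_range ? distortion : UINT_MAX;
}

/* Callers compare against the same sentinel the search returns. */
int search_result_is_valid(uint32_t error) {
  return error != UINT_MAX;
}
```

If the search returned INT_MAX while callers tested for UINT_MAX (or the reverse), an out-of-range result would be mistaken for a very large but valid distortion.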

7 years agoBug fix to avoid random crashes during ARNR filtering
Ranjit Kumar Tulabandu [Wed, 23 Nov 2016 13:16:44 +0000 (18:46 +0530)]
Bug fix to avoid random crashes during ARNR filtering

The function 'vp9_find_best_sub_pixel_tree_pruned_more' is modified
to return INT_MAX instead of UINT32_MAX for invalid MV cases.

yunqingwang:
patch 3: rebased on top of the tree.
patch 4: The return type of vp9_find_best_sub_pixel_tree* was changed
to uint32_t to fix ubsan warnings. Changing UINT_MAX back to INT_MAX
was not quite right. Patch 4 modified vp9_temporal_filter.c to accept
uint32_t.
(Note: Inconsistency exists in vp9_find_best_sub_pixel_tree*, which
will be fixed in a separate CL.)

Change-Id: Ib1a79dc2aa41ea6335c21669c76883cdbb7e0535

7 years agoMerge "Clean idct 8x8 neon functions"
Linfeng Zhang [Tue, 27 Dec 2016 17:59:27 +0000 (17:59 +0000)]
Merge "Clean idct 8x8 neon functions"

7 years agoRevert "vp9: SVC unittests: fix to use y4m source."
James Zern [Fri, 23 Dec 2016 22:10:13 +0000 (14:10 -0800)]
Revert "vp9: SVC unittests: fix to use y4m source."

This reverts commit f0b491a52405abb1b3dbb6b2c74dd6a4c7a7ddb1.

This change results in unsigned integer overflows (as reported by
-fsanitize=integer) in datarate_test.cc,
for many of --gtest_filter=VP9/DatarateOnePassCbrSvc.OnePassCbrSvc*:
unsigned integer overflow: 167198 - 185560 cannot be represented in type
'unsigned long'

As the encoder didn't change, but the input did with the change to
(correctly) use Y4mVideoSource, this revert is merely masking the issue.

BUG=webm:1352

Change-Id: Iecd9a6c83b3fca67c566732a5c92d36193cc2060
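
Annotation: the reported overflow comes from subtracting a larger unsigned value from a smaller one. A sketch of computing the difference in a signed type instead; the function name is hypothetical and this is not the test's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: difference of two unsigned bit counts as a signed
 * value, so that a buffer underrun shows up as a negative number instead
 * of wrapping around. */
int64_t signed_bit_difference(uint64_t bits_available, uint64_t bits_used) {
  /* Both operands must fit in int64_t for the casts to be value-preserving. */
  assert(bits_available <= INT64_MAX && bits_used <= INT64_MAX);
  return (int64_t)bits_available - (int64_t)bits_used;
}
```

With the values from the sanitizer report, the signed difference is negative, which is the underrun the test would need to handle explicitly.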

7 years agoFix compile warnings for target=armv7-android-gcc
Jerome Jiang [Wed, 21 Dec 2016 00:49:42 +0000 (16:49 -0800)]
Fix compile warnings for target=armv7-android-gcc

Fix compile warnings about implicit type conversion for
target=armv7-android-gcc in vpxenc.c.

BUG=webm:1348

Change-Id: I9fbabd843512f2a1a09f4bb934cd091e834eed9c

7 years agoMerge "vp9: SVC unittests: fix to use y4m source."
Marco Paniconi [Thu, 22 Dec 2016 17:26:41 +0000 (17:26 +0000)]
Merge "vp9: SVC unittests: fix to use y4m source."

7 years agolibs.mk/stress.sh,curl: set --retry to 1
James Zern [Thu, 22 Dec 2016 13:20:55 +0000 (08:20 -0500)]
libs.mk/stress.sh,curl: set --retry to 1

provide some resilience for transient errors

Change-Id: I8db3d4eb5ef3cccc235a8c4c0052199c0ce23a27

7 years agovp9: SVC unittests: fix to use y4m source.
Marco [Wed, 21 Dec 2016 20:53:51 +0000 (12:53 -0800)]
vp9: SVC unittests: fix to use y4m source.

Comment out check on buffer underrun, as it currently fails
on some of the svc tests.

BUG=webm:1350

Change-Id: I73c88b800cdcc06bd2f900f7b7e2a5fd08248065

7 years agoClean idct 8x8 neon functions
Linfeng Zhang [Wed, 21 Dec 2016 22:24:17 +0000 (14:24 -0800)]
Clean idct 8x8 neon functions

BUG=webm:1301

Change-Id: I05f47dca1fddc155c8396e627cfccf6449677307

7 years agovp9: 1 pass vbr: Skip find_predictors in pickmode when source is altref.
Marco [Fri, 16 Dec 2016 00:10:30 +0000 (16:10 -0800)]
vp9: 1 pass vbr: Skip find_predictors in pickmode when source is altref.

When source frame is altref, we only do zero-mv mode, so we can skip
the find_predictors(). No change in compression.
Small speed gain, ~1%.

Only affects 1 pass vbr with lookahead altref, for ytlive with
the macro flag USE_ALTREF_FOR_ONE_PASS on.

Change-Id: I9318c5da8521f017bf54919cd652438b3a6313d1

7 years agoMerge "vp9: Fix to unittest for high noise."
Marco Paniconi [Wed, 21 Dec 2016 19:38:00 +0000 (19:38 +0000)]
Merge "vp9: Fix to unittest for high noise."

7 years agovp9: Fix to unittest for high noise.
Marco [Wed, 21 Dec 2016 18:19:44 +0000 (10:19 -0800)]
vp9: Fix to unittest for high noise.

Source is y4m, and fix comment.

Change-Id: I1eb84977d42dd0f9009c276b56b3fdb03949bfc2

7 years agoMerge "vp9: Add datarate test for denoiser, for high noise case."
Marco Paniconi [Wed, 21 Dec 2016 03:56:10 +0000 (03:56 +0000)]
Merge "vp9: Add datarate test for denoiser, for high noise case."

7 years agovp9: Add datarate test for denoiser, for high noise case.
Marco [Mon, 19 Dec 2016 22:07:49 +0000 (14:07 -0800)]
vp9: Add datarate test for denoiser, for high noise case.

Also breakout the denoiser tests, as the denoiser only
runs for real-time speed >=5.

Change-Id: I921b785860c35e9d1ebfad0833673a98490186c2

7 years agoMerge "vp9: Add feature to copy partition from the last frame."
Jerome Jiang [Tue, 20 Dec 2016 21:46:43 +0000 (21:46 +0000)]
Merge "vp9: Add feature to copy partition from the last frame."

7 years agoRemove superfluous conditional on 'shortcut'
Gabriel Marin [Wed, 14 Dec 2016 19:07:50 +0000 (11:07 -0800)]
Remove superfluous conditional on 'shortcut'

Remove superfluous test. Produces a small improvement in instruction scheduling.
Measured a 1% to 1.5% reduction in execution time for routine vp9_optimize_b
with different compilers.

No change in behavior.

TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225

Change-Id: I2bf248d4c25fc0256147d7a8766ff9108ae9cba3

7 years agoMerge "Add mips msa vp9 intrapred tests"
Kaustubh Raste [Tue, 20 Dec 2016 02:27:07 +0000 (02:27 +0000)]
Merge "Add mips msa vp9 intrapred tests"

7 years agovp9: Add feature to copy partition from the last frame.
Jerome Jiang [Mon, 19 Dec 2016 18:39:04 +0000 (10:39 -0800)]
vp9: Add feature to copy partition from the last frame.

Add feature to copy partition from the last frame.
The copy is only done when the SAD is below a threshold.
Feature is currently disabled, until threshold is tuned.
Feature will be initially used for Speed 8 (ARM).

Under extreme case of always copying partition for speed 8:
Encode time is reduced by 5.4% on rtc_derf and 7.8% on rtc.
Overall PSNR reduced by 2.1 on rtc_derf and 0.968 on rtc.

Change-Id: I1bcab515af3088e4d60675758f72613c2d3dc7a5
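
Annotation: the idea of reusing the previous frame's partition when the source barely changed can be sketched as a SAD computation followed by a threshold test. Function names and the threshold are hypothetical; the encoder's actual conditions are not shown in this log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between a block of the current source frame
 * and the co-located block of the last source frame. */
unsigned int block_sad(const uint8_t *src, int src_stride,
                       const uint8_t *last, int last_stride,
                       int width, int height) {
  unsigned int sad = 0;
  int r, c;
  assert(src != NULL && last != NULL);
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) sad += (unsigned int)abs(src[c] - last[c]);
    src += src_stride;
    last += last_stride;
  }
  return sad;
}

/* Hypothetical decision: copy the last frame's partition only when the
 * block changed very little. */
int can_copy_partition(unsigned int sad, unsigned int threshold) {
  return sad < threshold;
}
```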

7 years agoMerge "Simplify address arithmetic in vp9_optimize_b"
Gabriel Marin [Mon, 19 Dec 2016 23:25:38 +0000 (23:25 +0000)]
Merge "Simplify address arithmetic in vp9_optimize_b"

7 years agoMerge "vpx_idct32x32_1024_add_neon: quiet uninitialized warning"
James Zern [Mon, 19 Dec 2016 22:39:01 +0000 (22:39 +0000)]
Merge "vpx_idct32x32_1024_add_neon: quiet uninitialized warning"

7 years agoMerge "vp9 denoiser: Fix the logic for re-evaluating zeromv after denoising."
Marco Paniconi [Mon, 19 Dec 2016 21:15:36 +0000 (21:15 +0000)]
Merge "vp9 denoiser: Fix the logic for re-evaluating zeromv after denoising."

7 years agoSimplify address arithmetic in vp9_optimize_b
Gabriel Marin [Wed, 14 Dec 2016 00:22:48 +0000 (16:22 -0800)]
Simplify address arithmetic in vp9_optimize_b

Simplify address arithmetic on token_costs to reduce the number of generated
instructions that are used for address arithmetic inside routine
vp9_optimize_b. It also helps improve instruction scheduling depending on
compiler and optimization level.

Measured a 9.3% reduction in retired instructions and 5.3% reduction in
execution time for this routine with GCC v4.8.4 and optimization flags -O3,
and a reduction of up to 11.6% in execution time with other compilers.

No change in behavior.

TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225

Change-Id: I6098650fb5cd2aa04e014fe6e68ca20761f3a21f

7 years agovpx_idct32x32_1024_add_neon: quiet uninitialized warning
James Zern [Mon, 19 Dec 2016 18:51:59 +0000 (10:51 -0800)]
vpx_idct32x32_1024_add_neon: quiet uninitialized warning

relocate the assignment to 'in' outside of the for loop. this quiets a
spurious warning in visual studio builds since:
86e340c enable vpx_idct32x32_1024_add_neon in hbd builds

+ give the variable a more descriptive name

BUG=webm:1294

Change-Id: I5c3da5c7939621477e0fc0ad3a1b2a3045c5bffd

7 years agovp9: With denoising on, only estimate noise level for higher resolns.
Marco [Sat, 17 Dec 2016 00:01:59 +0000 (16:01 -0800)]
vp9: With denoising on, only estimate noise level for higher resolns.

Allow it for resolutions above 640x360 for now.

Change-Id: I087d0d8173f96b316164fdd4a499110ce2e7a233

7 years agovp9 denoiser: Fix the logic for re-evaluating zeromv after denoising.
Marco [Mon, 19 Dec 2016 17:22:44 +0000 (09:22 -0800)]
vp9 denoiser: Fix the logic for re-evaluating zeromv after denoising.

Correctly set interp_filter to SWITCHABLE for INTRA mode.
Also reduce threshold on noise level for re-evaluating zeromv.

Change-Id: Id32c01e193209fb380aa07204f0be3babf29f70a

7 years agoMerge "Clean hbd idct 4x4 neon functions and other"
Linfeng Zhang [Mon, 19 Dec 2016 17:09:26 +0000 (17:09 +0000)]
Merge "Clean hbd idct 4x4 neon functions and other"

7 years agoAdd mips msa vp9 intrapred tests
Kaustubh Raste [Mon, 19 Dec 2016 11:56:17 +0000 (17:26 +0530)]
Add mips msa vp9 intrapred tests

Change-Id: I49b91464a87cad8692f4b1477e45e5f567b4fe87

7 years agoMerge "post proc test: add padding for sse2 tests"
Johann Koenig [Sat, 17 Dec 2016 01:12:34 +0000 (01:12 +0000)]
Merge "post proc test: add padding for sse2 tests"

7 years agoMerge "vp9: Change condition to enable recheck_zeromv_after_denoising."
Marco Paniconi [Fri, 16 Dec 2016 23:53:32 +0000 (23:53 +0000)]
Merge "vp9: Change condition to enable recheck_zeromv_after_denoising."

7 years agovp9: Change condition to enable recheck_zeromv_after_denoising.
Marco [Fri, 16 Dec 2016 19:15:57 +0000 (11:15 -0800)]
vp9: Change condition to enable recheck_zeromv_after_denoising.

When denoising is enabled, change the condition so that
recheck_zeromv_after_denoising is enabled only for very high noise
levels. The recheck is causing an issue, so restricting it to very high
noise effectively shuts it off.

Change-Id: Ic40d6025f3f398338cedd270d17c0ccd9a3daa84

7 years agopost proc test: add padding for sse2 tests
Johann [Fri, 16 Dec 2016 22:03:53 +0000 (14:03 -0800)]
post proc test: add padding for sse2 tests

Avoid valgrind warnings for reading out of bounds when the width is not
divisible by 16.

Change-Id: I5670d7cfbbce00874b98cfb7472f99c7936c2c47
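
Annotation: a common way to size such a test buffer is to round the row width up to the next multiple of 16, so that a 16-byte SIMD load at the end of a row stays inside the allocation. This is a sketch of that rounding only; the function name is hypothetical and the test's actual padding scheme may differ.

```c
#include <assert.h>

/* Round a width up to the next multiple of 16. */
int padded_width(int width) {
  assert(width >= 0);
  return (width + 15) & ~15;
}
```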

7 years agopostproc test: disable new down and across test
Johann [Fri, 16 Dec 2016 20:19:00 +0000 (12:19 -0800)]
postproc test: disable new down and across test

The new test is causing valgrind failures:
[ RUN      ] SSE2/VpxPostProcDownAndAcrossMbRowTest.CheckCvsAssembly/0
==28923== Invalid read of size 16
==28923==    at 0x724016: ??? (deblock_sse2.asm:146)

Disable during investigation. The test is new but the code is not.

Change-Id: I5521e5fd48a595e3798b833bf7e3cc97b81c1975

7 years agovp8 : use threading mutexes for tsan only.
Jim Bankoski [Fri, 16 Dec 2016 16:50:55 +0000 (08:50 -0800)]
vp8 : use threading mutexes for tsan only.

To avoid a decode performance hit of 2% when running on hyperthreaded
cores.

This patch only uses the mutexes when we are running tsan.

This is safe because 32-bit operations like read and store are atomic
on all the platforms we care about. Tsan warns about race situations,
but in this case, whichever happens first (the read before the write or
the write before the read), the worst case is that we go around the
loop one extra time. So the ordering doesn't really matter.

That said, a few other things have been tried:

for instance as per here:
webrtc/base/atomicops.h#52

In this patch they use:
__atomic_load_n(i, __ATOMIC_ACQUIRE);
__atomic_store_n(i, value, __ATOMIC_RELEASE);

This code works on gcc and clang (replacing the protected write and
read) and avoids tsan errors, incurring no penalty in performance. In
C11 it's replaced by standard atomic operations.

However, there is no equivalent in the Visual Studio versions we
support, as int32 on all Windows platforms is already atomic. To avoid
tsan-like warnings on Windows we'd need to use interlocked exchange, and
the end result doesn't gain us anything.

Change-Id: I2066e3c7f42641ebb23d53feb1f16f23f85bcf59
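
Annotation: the alternative mentioned above, wrapping the shared counter in acquire/release atomics, could look like the sketch below. It relies on the GCC/Clang __atomic builtins quoted in the message and therefore does not build with Visual Studio; the wrapper names are hypothetical and this is not what the patch adopted.

```c
#include <assert.h>
#include <stddef.h>

/* Acquire load: later reads in this thread cannot be reordered before it.
 * GCC/Clang builtin, not available in Visual Studio. */
int protected_read(int *shared) {
  assert(shared != NULL);
  return __atomic_load_n(shared, __ATOMIC_ACQUIRE);
}

/* Release store: earlier writes in this thread become visible to a thread
 * that observes the stored value with an acquire load. */
void protected_write(int *shared, int value) {
  assert(shared != NULL);
  __atomic_store_n(shared, value, __ATOMIC_RELEASE);
}
```

A decoder thread would poll protected_read() on the neighbouring row's progress counter, while the thread decoding that row publishes progress with protected_write().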

7 years agoMerge "vp9: Fix to usage of flag USE_ALTREF_FOR_ONE_PASS"
Marco Paniconi [Thu, 15 Dec 2016 19:48:16 +0000 (19:48 +0000)]
Merge "vp9: Fix to usage of flag USE_ALTREF_FOR_ONE_PASS"

7 years agopostproc: neon down and across macroblock filter
Johann [Tue, 13 Dec 2016 00:47:05 +0000 (16:47 -0800)]
postproc: neon down and across macroblock filter

Implement vpx_post_proc_down_and_across_mb_row in NEON.
Runs about 6-7x faster than C.

BUG=webm:1320

Change-Id: Ic5c7d3552a88cfcf999ec5bf2bd46fee460642c2

7 years agovp9: Fix to usage of flag USE_ALTREF_FOR_ONE_PASS
Marco [Wed, 14 Dec 2016 22:08:09 +0000 (14:08 -0800)]
vp9: Fix to usage of flag USE_ALTREF_FOR_ONE_PASS

The flag USE_ALTREF_FOR_ONE_PASS allows for alt-ref lookahead
in 1 pass vbr (from https://chromium-review.googlesource.com/#/c/365498).
This change is to make sure this macro flag only has effect if
the config flag cpi->oxcf.enable_auto_altef is also on.

No change in ytlive encoding, as USE_ALTREF_FOR_ONE_PASS is not
yet enabled.

Change-Id: I1a69681e4a15c5244581a3dab4587fca08f02e0f

7 years agoClean hbd idct 4x4 neon functions and other
Linfeng Zhang [Wed, 14 Dec 2016 18:42:01 +0000 (10:42 -0800)]
Clean hbd idct 4x4 neon functions and other

BUG=webm:1301

Change-Id: I387b7eae716a7df15c691dc6f368b07602df7342

7 years agoChange order of operation to avoid ubsan warnings
Yaowu Xu [Wed, 14 Dec 2016 17:37:14 +0000 (09:37 -0800)]
Change order of operation to avoid ubsan warnings

This commit changes the order of operations to avoid left shifts of
negative numbers.

Change-Id: I607c7eb91658c7a5ef397fc1504721d1b10e3dd6
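
Annotation: left-shifting a negative signed value is undefined behaviour in C, which is what ubsan flags. One illustrative reordering is to shift the non-negative magnitude first and negate afterwards. The function below is a hypothetical example of that pattern and is not the expression changed by this commit.

```c
#include <assert.h>
#include <limits.h>

/* Returns -(magnitude * 2^shift) without ever shifting a negative value:
 * the shift is applied to the non-negative magnitude, then the result is
 * negated. Writing (-magnitude) << shift instead would be undefined. */
int negate_scaled(int magnitude, int shift) {
  assert(shift >= 0 && shift < 31);
  assert(magnitude >= 0 && magnitude <= (INT_MAX >> shift));
  return -(magnitude << shift);
}
```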