platform/upstream/libvpx.git
3 years agovp8 rc: explicit cast to avoid VS build failure
Jerome Jiang [Thu, 16 Sep 2021 17:19:09 +0000 (10:19 -0700)]
vp8 rc: explicit cast to avoid VS build failure

Change-Id: I6a4daca12b79cf996964661e1af85aa6e258b446

3 years agoDefine the VPX_NO_RETURN macro for MSVC
Wan-Teh Chang [Fri, 10 Sep 2021 22:54:51 +0000 (15:54 -0700)]
Define the VPX_NO_RETURN macro for MSVC

Define VPX_NO_RETURN as __declspec(noreturn) for MSVC. See
https://docs.microsoft.com/en-us/cpp/cpp/noreturn?view=msvc-160

This requires moving VPX_NO_RETURN before function declarations because
__declspec(noreturn) must be placed there. Fortunately GCC's
__attribute__((noreturn)) can be placed either before or after function
declarations.

Change-Id: Id9bb0077e2a4f16ec2ca9c913dd93673a0e385cf
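
The placement rule described in the commit above can be sketched as follows. This is a minimal illustration, not the actual libvpx header: the preprocessor guards are an assumed layout, and fatal_error() and checked_div() are hypothetical names invented for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed guard layout; the real libvpx header may be organized differently. */
#if defined(_MSC_VER)
#define VPX_NO_RETURN __declspec(noreturn)
#elif defined(__GNUC__)
#define VPX_NO_RETURN __attribute__((noreturn))
#else
#define VPX_NO_RETURN
#endif

/* The macro goes before the declaration, a position both compilers accept. */
VPX_NO_RETURN void fatal_error(const char *msg);

void fatal_error(const char *msg) {
  fprintf(stderr, "fatal: %s\n", msg);
  exit(EXIT_FAILURE);
}

/* Hypothetical caller: no return statement is needed after fatal_error(). */
int checked_div(int num, int den) {
  if (den == 0) fatal_error("division by zero");
  return num / den;
}
```

Putting the macro in front keeps a single declaration style for every compiler, since only GCC tolerates the trailing position.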

3 years agoAdd vp8 support to rc lib
Jerome Jiang [Tue, 31 Aug 2021 17:22:22 +0000 (10:22 -0700)]
Add vp8 support to rc lib

For 1 layer CBR only.
Support for temporal layers comes later.

Rename the library to libvpxrc

Bug: b/188853141

Change-Id: Ib7f977b64c05b1a0596870cb7f8e6768cb483850

3 years agovp8 rc: always update correction factor
Jerome Jiang [Wed, 8 Sep 2021 23:52:51 +0000 (16:52 -0700)]
vp8 rc: always update correction factor

Change-Id: Id40b9cb5a85a15fb313a2a93f14f6768259f7c15

3 years agoAdd codec control for vp8 external rc
Jerome Jiang [Thu, 2 Sep 2021 23:15:13 +0000 (16:15 -0700)]
Add codec control for vp8 external rc

disable cyclic refresh

Change-Id: I7905602919d5780831fad840577e97730ce0afc2

3 years agovp9 rc lib: Allow aq 3 to work for SVC with unit test
Jerome Jiang [Tue, 24 Aug 2021 21:30:54 +0000 (14:30 -0700)]
vp9 rc lib: Allow aq 3 to work for SVC with unit test

Also use round to cast float to int with more accurate calculation to
avoid error accumulation which causes qp to be different after ~290
frames.

Change-Id: Iff65a8fdc67401814fd253dbf148afe9887df97f
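
To see why truncation can drift while rounding stays closer, here is a small sketch. It is not the rate control code from the commit above: all function names are hypothetical, and the rounding helper is hand-rolled (round half away from zero) to avoid depending on <math.h>.

```c
/* Truncating conversion: drops the fractional part (toward zero). */
int to_int_trunc(double x) { return (int)x; }

/* Hand-rolled round-half-away-from-zero, used here instead of round(). */
int to_int_round(double x) { return (int)(x < 0.0 ? x - 0.5 : x + 0.5); }

/* Sums a per-frame value that is converted to int on every frame. */
int accumulate(double per_frame, int frames, int use_round) {
  int total = 0;
  int i;
  for (i = 0; i < frames; ++i) {
    total += use_round ? to_int_round(per_frame) : to_int_trunc(per_frame);
  }
  return total;
}
```

With a per-frame value of 33.9 over 290 frames the exact total is 9831; the truncating sum gives 9570 (off by 261) and the rounding sum gives 9860 (off by 29).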

3 years agoMerge "vpx_ports/x86.h: sync with aom_ports/x86.h" into main
James Zern [Fri, 30 Jul 2021 00:48:08 +0000 (00:48 +0000)]
Merge "vpx_ports/x86.h: sync with aom_ports/x86.h" into main

3 years agovp9 rc: Fills VP9_COMP zero at initialization
Hirokazu Honda [Thu, 29 Jul 2021 17:42:35 +0000 (02:42 +0900)]
vp9 rc: Fills VP9_COMP zero at initialization

Change-Id: Ib1a544ce87e8fdbe23c0e54b6426ee228011b126

3 years agovpx_ports/x86.h: sync with aom_ports/x86.h
James Zern [Mon, 26 Jul 2021 23:52:56 +0000 (16:52 -0700)]
vpx_ports/x86.h: sync with aom_ports/x86.h

adds a few comments and makes the file ascii:
854b2766a Replace non-ASCII characters

Change-Id: I6c2d76b293158bcad9f1ded7a91a81bda1e700fb

3 years agoFix some instances of -Wunused-but-set-variable.
Peter Kasting [Mon, 26 Jul 2021 10:57:55 +0000 (03:57 -0700)]
Fix some instances of -Wunused-but-set-variable.

Bug: chromium:1203071
Change-Id: Ieb628f95d676ba3814b5caf8a02a884330928c77

3 years agoMerge "Remove unused old FP_MB_STATS code" into main
Yunqing Wang [Mon, 26 Jul 2021 20:13:38 +0000 (20:13 +0000)]
Merge "Remove unused old FP_MB_STATS code" into main

3 years agoMerge "Clean up allow_partition_search_skip code" into main
Yunqing Wang [Mon, 26 Jul 2021 19:19:02 +0000 (19:19 +0000)]
Merge "Clean up allow_partition_search_skip code" into main

3 years agoMerge "Disable allow_partition_search_skip feature" into main
Yunqing Wang [Sun, 25 Jul 2021 22:42:59 +0000 (22:42 +0000)]
Merge "Disable allow_partition_search_skip feature" into main

3 years agoRemove unused old FP_MB_STATS code
Yunqing Wang [Sat, 24 Jul 2021 05:45:45 +0000 (22:45 -0700)]
Remove unused old FP_MB_STATS code

Change-Id: I78ac1f8ce1598de295efd2ac1fe8244072d9b501

3 years agoClean up allow_partition_search_skip code
Yunqing Wang [Sat, 24 Jul 2021 05:34:01 +0000 (22:34 -0700)]
Clean up allow_partition_search_skip code

Change-Id: Ia05157fc3e613d93f10df5abddd77a740a0005ca

3 years agoDisable allow_partition_search_skip feature
Yunqing Wang [Fri, 23 Jul 2021 17:55:10 +0000 (10:55 -0700)]
Disable allow_partition_search_skip feature

This feature was added to help speed up still images and slideshows.
It didn't work anymore, and thus was disabled. Code cleanup will
follow.

This had negligible impact on regular test sets. Borg test result
on ugc360p set at speed 3:
  avg_psnr:  ovr_psnr:  ssim:    speed:
   -0.244    -0.278    -0.153    -0.973

Change-Id: If74edabce0c93be1361e645ffd2eec063c2db76b

3 years agoMerge "Add control to get QP for all spatial layers" into main
Jerome Jiang [Fri, 23 Jul 2021 18:20:39 +0000 (18:20 +0000)]
Merge "Add control to get QP for all spatial layers" into main

3 years agoAdd control to get QP for all spatial layers
Jerome Jiang [Wed, 21 Jul 2021 21:32:27 +0000 (14:32 -0700)]
Add control to get QP for all spatial layers

Change-Id: I77a9884351e71649c8f8632293d9515c60f6adbc

3 years agoMerge "Use round to be more accurate casting float to int" into main
Jerome Jiang [Thu, 22 Jul 2021 17:07:58 +0000 (17:07 +0000)]
Merge "Use round to be more accurate casting float to int" into main

3 years agoAdd cyclic refresh to vp9 rtc external ratecontrol
Jerome Jiang [Tue, 29 Jun 2021 21:48:35 +0000 (14:48 -0700)]
Add cyclic refresh to vp9 rtc external ratecontrol

Change-Id: Ia2a881399aa31ca0f34481b975362ddd4ad87f1c

3 years agoUse round to be more accurate casting float to int
Jerome Jiang [Thu, 15 Jul 2021 23:05:16 +0000 (16:05 -0700)]
Use round to be more accurate casting float to int

Change-Id: Ifd5961917831752b176dd75d39d6b2cba6ce72fa

3 years agoMerge "Refactor rtc rate control test" into main
Jerome Jiang [Mon, 19 Jul 2021 21:00:35 +0000 (21:00 +0000)]
Merge "Refactor rtc rate control test" into main

3 years agoRefactor rtc rate control test
Jerome Jiang [Mon, 12 Jul 2021 21:04:12 +0000 (14:04 -0700)]
Refactor rtc rate control test

Remove golden files. Run actual encoding as the ground truth.

Change-Id: I1cea001278c1e9409bb02d33823cf69192c790a4

3 years agoAvoid chroma resampling for 420mpeg2 input
Bohan Li [Thu, 15 Jul 2021 20:21:35 +0000 (13:21 -0700)]
Avoid chroma resampling for 420mpeg2 input

BUG=aomedia:3080

Change-Id: I4ed81abf4b799224085485560f675c10c318cde6

3 years agoAdd codec control for rtc external ratectrl lib
Jerome Jiang [Tue, 13 Jul 2021 18:54:34 +0000 (11:54 -0700)]
Add codec control for rtc external ratectrl lib

This will do 3 things:

Turn off low motion computation
Turn off gf update constraint on key frame frequency
Turn off content mode for cyclic refresh

Those are used to verify the external ratectrl lib works as expected.

Change-Id: Ic6e61498de82d6b3973e58df246cf5e05f838680

3 years agoCheck for addition overflows in vpx_img_set_rect()
Wan-Teh Chang [Thu, 8 Jul 2021 22:17:48 +0000 (15:17 -0700)]
Check for addition overflows in vpx_img_set_rect()

Check for x + w and y + h overflows in vpx_img_set_rect().

Move the declaration of the local variable 'data' to the block it is
used in.

Change-Id: I6bda875e1853c03135ec6ce29015bcc78bb8b7ba
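
A check for x + w and y + h overflow, as described in the commit above, might look like the sketch below. The function name rect_fits() and its signature are hypothetical; this is not the body of vpx_img_set_rect().

```c
#include <limits.h>

/* Returns 1 if the rectangle lies inside an img_w x img_h image, else 0. */
int rect_fits(unsigned int x, unsigned int y, unsigned int w, unsigned int h,
              unsigned int img_w, unsigned int img_h) {
  /* Reject before adding, so x + w and y + h cannot wrap around. */
  if (x > UINT_MAX - w || y > UINT_MAX - h) return 0;
  return x + w <= img_w && y + h <= img_h;
}
```

Without the first test, x = UINT_MAX and w = 2 would wrap to 1 and pass the bounds comparison.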

3 years agoDocument vpx_img_set_rect() more precisely
Wan-Teh Chang [Thu, 8 Jul 2021 22:08:05 +0000 (15:08 -0700)]
Document vpx_img_set_rect() more precisely

Document the side effects and return value of vpx_img_set_rect() more
precisely.

Change-Id: Id1120bc478ff090a70b4ddd23c4798026bbefe10

3 years agoMerge "Avoid overflow in calc_iframe_target_size" into main
Yaowu Xu [Thu, 8 Jul 2021 19:59:34 +0000 (19:59 +0000)]
Merge "Avoid overflow in calc_iframe_target_size" into main

3 years agoMerge "Add codec control to get loopfilter level" into main
Jerome Jiang [Fri, 2 Jul 2021 22:59:08 +0000 (22:59 +0000)]
Merge "Add codec control to get loopfilter level" into main

3 years agoAdd codec control to get loopfilter level
Jerome Jiang [Fri, 2 Jul 2021 18:28:48 +0000 (11:28 -0700)]
Add codec control to get loopfilter level

Change-Id: I70d417da900082160e7ba53315af98eceede257c

3 years agoratectrl_rtc.h: quiet MSVC int64_t->int conv warning
James Zern [Fri, 2 Jul 2021 05:16:42 +0000 (22:16 -0700)]
ratectrl_rtc.h: quiet MSVC int64_t->int conv warning

target_bandwidth is int64_t, but layer_target_bitrate[0] is an int. this
is safe in the only place it's set because target_bandwidth defaults to
1000. target_bandwidth is later used to populate the cpi's target, which
is an unsigned int so there may be further fixes/cleanups that can be
done.

Change-Id: I35dbaa2e55a0fca22e0e2680dcac9ea4c6b2815a

3 years agoAvoid overflow in calc_iframe_target_size
Jorge E. Moreira [Wed, 30 Jun 2021 18:33:51 +0000 (11:33 -0700)]
Avoid overflow in calc_iframe_target_size

The changed product was observed to attempt to multiply 1800 by 2500000,
which overflows unsigned 32 bits. Converting to unsigned 64 bits first
and testing whether the final result fits in 32 bits solves the problem.

BUG=b:179686142

Change-Id: I5d27317bf14b0311b739144c451d8e172db01945
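
The arithmetic in the commit above can be reproduced with a short sketch. The function names, the divisor parameter, and the choice to saturate when the result does not fit are assumptions made for illustration; the actual expression in calc_iframe_target_size is not shown in the log.

```c
#include <stdint.h>

/* 32-bit product: wraps modulo 2^32 when the true result is too large. */
uint32_t product_u32(uint32_t a, uint32_t b) { return a * b; }

/* Widen to 64 bits first, then check whether the final value fits in 32. */
uint32_t scaled_product_saturated(uint32_t a, uint32_t b, uint32_t divisor) {
  /* divisor is assumed to be nonzero */
  const uint64_t wide = (uint64_t)a * b / divisor;
  return wide > UINT32_MAX ? UINT32_MAX : (uint32_t)wide;
}
```

Here 1800 * 2500000 is 4,500,000,000, which exceeds 4,294,967,295, so the 32-bit product wraps to 205,032,704, while the widened version keeps the true value until the final range check.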

3 years agoMerge "vp9-rtc: Extract content dependency in cyclic refresh" into main
Marco Paniconi [Tue, 29 Jun 2021 18:34:46 +0000 (18:34 +0000)]
Merge "vp9-rtc: Extract content dependency in cyclic refresh" into main

3 years agoMerge "Disallow skipping transform and quantization" into main
Cheng Chen [Tue, 29 Jun 2021 16:48:29 +0000 (16:48 +0000)]
Merge "Disallow skipping transform and quantization" into main

3 years agoDisallow skipping transform and quantization
Cheng Chen [Thu, 17 Jun 2021 22:36:18 +0000 (15:36 -0700)]
Disallow skipping transform and quantization

The encoder has a feature to skip transform and quantization based
on model rd analysis. The model-based analysis can let the encoder
skip transform and quantization when the prediction is actually bad,
leading to badly reconstructed blocks that are visually intrusive
and look like coding errors.

We add a speed feature to guard the skipping feature.
Due to the risk of bad perceptual quality, we disallow such skipping
by default.

On hdres test set, speed 2, the coding performance difference is 0.025%,
speed difference is 1.2%, which can be considered insignificant.

BUG=webm:1729

Change-Id: I48af01ae8dcc7a76c05c695f3f3e68b866c89574

3 years agovp9-rtc: Extract content dependency in cyclic refresh
Marco Paniconi [Fri, 25 Jun 2021 06:34:36 +0000 (23:34 -0700)]
vp9-rtc: Extract content dependency in cyclic refresh

For usage in the external RC. When content_mode = 0,
the cyclic refresh has no dependency on the content
(motion, spatial variance, motion vectors, etc.).

Compared to content_mode = 1, content_mode = 0 on the rtc set
at speed 7 regresses on some clips (~3-5%), but the
overall/average bdrate loss is ~1-2%.

Comparing aq_mode=3 with content_mode = 0, vs aq_mode=3:
~14% avg/overall bdrate gain, but ~3-7% regression
on some hard motion clips (e.g., street).

Change-Id: I93117fabb8f7f89032c15baf1292b201e8c07362

3 years agoAdd constructor to VP9RateControlRtcConfig
Jerome Jiang [Thu, 24 Jun 2021 20:13:50 +0000 (13:13 -0700)]
Add constructor to VP9RateControlRtcConfig

Also add max_inter_bitrate_pct

Change-Id: Ie2c0e7f1397ca0bb55214251906412cdf24e42e2

3 years agoMerge "rc: turn off gf constrain for external RC" into main
Jerome Jiang [Tue, 22 Jun 2021 22:13:38 +0000 (22:13 +0000)]
Merge "rc: turn off gf constrain for external RC" into main

3 years agorc: turn off gf constrain for external RC
Jerome Jiang [Tue, 22 Jun 2021 00:22:51 +0000 (17:22 -0700)]
rc: turn off gf constrain for external RC

Added a new flag in rate control which turns off the gf interval
constraint on key frame frequency for external RC.

It remains on for libvpx.

Change-Id: I18bb0d8247a421193f023619f906d0362b873b31

3 years agoMerge "test-data.sha1: add missing sha sums" into main
James Zern [Tue, 22 Jun 2021 03:02:58 +0000 (03:02 +0000)]
Merge "test-data.sha1: add missing sha sums" into main

3 years agoMerge changes I9f0852a0,Ieecb98a7 into main
Angie Chiang [Tue, 22 Jun 2021 01:44:02 +0000 (01:44 +0000)]
Merge changes I9f0852a0,Ieecb98a7 into main

* changes:
  Add use_simple_encode_api to oxcf
  Fix flaky assertions in SimpleEncode

3 years agoAdd use_simple_encode_api to oxcf
Angie Chiang [Fri, 18 Jun 2021 03:23:30 +0000 (20:23 -0700)]
Add use_simple_encode_api to oxcf

Use this flag to change the encoder behavior when
SimpleEncode APIs are used

BUG=webm:1733

Change-Id: I9f0852a03ff99faa01cdd8eee8ab71718cc58632

3 years agoFix flaky assertions in SimpleEncode
Angie Chiang [Fri, 18 Jun 2021 23:09:41 +0000 (16:09 -0700)]
Fix flaky assertions in SimpleEncode

Bug: webm:1731

Change-Id: Ieecb98a7ac19e6291acd5d51432dc6a3789e9552

3 years agotest-data.sha1: add missing sha sums
James Zern [Mon, 21 Jun 2021 20:33:44 +0000 (13:33 -0700)]
test-data.sha1: add missing sha sums

for rc_interface_test_one_layer_vbr and
rc_interface_test_one_layer_vbr_periodic_key added in:
1f45e7b07 vp9 rc: add vbr to rtc rate control library

Change-Id: I8bfa3698284c8ff289e830f7b8fa1ca42b752563

3 years agoMerge "vp9 rc: add vbr to rtc rate control library" into main
Jerome Jiang [Fri, 18 Jun 2021 23:25:53 +0000 (23:25 +0000)]
Merge "vp9 rc: add vbr to rtc rate control library" into main

3 years agovp9 rc: add vbr to rtc rate control library
Jerome Jiang [Tue, 15 Jun 2021 19:54:13 +0000 (12:54 -0700)]
vp9 rc: add vbr to rtc rate control library

Change-Id: I3d2565572c2b905966d60bcaa6e5e6f057b1bd51

3 years agonormalize vp9_calc_[ip]frame declarations and definitions
James Zern [Fri, 18 Jun 2021 18:56:27 +0000 (11:56 -0700)]
normalize vp9_calc_[ip]frame declarations and definitions

fixes warnings under visual studio:

vp9\encoder\vp9_ratectrl.c(2012): warning C4028: formal parameter 1
different from declaration
vp9\encoder\vp9_ratectrl.c(2027): warning C4028: formal parameter 1
different from declaration

Change-Id: Ia0740db597fb7a259f90d362b483f58662f9f584

3 years agovp9: Adjust logic for gf update in 1 pass vbr
Marco Paniconi [Thu, 17 Jun 2021 19:00:33 +0000 (12:00 -0700)]
vp9: Adjust logic for gf update in 1 pass vbr

This reduces some regression when external RC
is used, for which avg_frame_low_motion is not
set/updated (=0).

Change-Id: I2408e62bd97592e892cefa0f183357c641aa5eea

3 years agoInitialize VP9EncoderConfig profile and bit depth
Chunbo Hua [Wed, 16 Jun 2021 08:51:44 +0000 (01:51 -0700)]
Initialize VP9EncoderConfig profile and bit depth

Change-Id: I5c42013a08677cdef8d47f348458118338ff0138

3 years agoChange the data path in svc rate control test
Jerome Jiang [Tue, 15 Jun 2021 21:55:29 +0000 (14:55 -0700)]
Change the data path in svc rate control test

Change-Id: Iba58e2aa2578964b5c8b48ab0acbee9b44bcdada

3 years agovp9-rtc: Refactor 1 pass vbr rate control
Marco Paniconi [Mon, 14 Jun 2021 22:02:52 +0000 (15:02 -0700)]
vp9-rtc: Refactor 1 pass vbr rate control

This refactoring is needed to allow the
RC_rtc library to support VBR.

Change-Id: I863a4a65096fed06b02307098febf7976360e0f3

3 years agoUpdate some comments for rc_target_bitrate
James Zern [Fri, 11 Jun 2021 23:34:41 +0000 (16:34 -0700)]
Update some comments for rc_target_bitrate

this mirrors the change from libaom:
5b150b150 Update some comments for rc_target_bitrate

Change-Id: Iaabee5924e0320609a29dc8ab71327923fb4c5d2

3 years agosimple_encode: fix some -Wsign-compare warnings
James Zern [Wed, 9 Jun 2021 22:07:15 +0000 (15:07 -0700)]
simple_encode: fix some -Wsign-compare warnings

Bug: webm:1731
Change-Id: I1db777c0c3a8784fb3dcf7cd39f78ebf833ab915

3 years agosimple_encode_test: fix input file path
James Zern [Sun, 6 Jun 2021 02:30:04 +0000 (19:30 -0700)]
simple_encode_test: fix input file path

this allows the file to be located in LIBVPX_TEST_DATA_PATH similar to
other test sources.

Bug: webm:1731
Change-Id: I51606635d91871e7c179aa8d20d4841b0d60b6ad

3 years agoL2E: properly init two pass rc parameters
Cheng Chen [Thu, 27 May 2021 22:38:28 +0000 (15:38 -0700)]
L2E: properly init two pass rc parameters

Two pass rc parameters are only initialized in the second pass
in vp9 normal two pass encoding.
However, the simple_encode API queries the keyframe group, arf group,
and number of coding frames without going through the two pass
route.
Since recent libvpx rc changes, parameters in the TWO_PASS
struct have a great influence on the determination of the above
information.
We therefore need to properly init two pass rc parameters in
the simple_encode related environment.

Change-Id: Ie14b86d6e7ebf171b638d2da24a7fdcf5a15c3d9

3 years agoFix simple encode
Cheng Chen [Mon, 24 May 2021 22:53:06 +0000 (15:53 -0700)]
Fix simple encode

Properly init and delete cpi struct in simple encode functions.

Change-Id: I6e66bcac852cbb3dec9b754ba3fb01a348ac98b8

3 years agoFixed redundant wording for decoder algorithm interface
Chunbo Hua [Wed, 26 May 2021 09:02:07 +0000 (02:02 -0700)]
Fixed redundant wording for decoder algorithm interface

Change-Id: Id56e03dc9cf6d4e70c4681896f29893a9b4c76f2

3 years agoMerge changes I2e86b005,I971c6261,I87fe4dad
James Zern [Tue, 25 May 2021 03:00:47 +0000 (03:00 +0000)]
Merge changes I2e86b005,I971c6261,I87fe4dad

* changes:
  Use 'ptrdiff_t' instead of 'int' for pointer offset parameters
  Implement vpx_convolve8_avg_vert_neon using SDOT instruction
  Merge transpose and permute in Neon SDOT vertical convolution

3 years agoMerge "img_alloc_helper: make align var unsigned"
James Zern [Tue, 25 May 2021 02:37:05 +0000 (02:37 +0000)]
Merge "img_alloc_helper: make align var unsigned"

3 years agoUse 'ptrdiff_t' instead of 'int' for pointer offset parameters
Jonathan Wright [Mon, 24 May 2021 10:42:09 +0000 (11:42 +0100)]
Use 'ptrdiff_t' instead of 'int' for pointer offset parameters

A number of the load/store functions in mem_neon.h use type 'int' for
the 'stride' pointer offset parameter. This causes Clang to generate
the following warning every time these functions are called with a
wider type passed in for 'stride':

warning: implicit conversion loses integer precision: 'ptrdiff_t'
(aka 'long') to 'int' [-Wshorten-64-to-32]

This patch changes all such instances of 'int' to 'ptrdiff_t'.

Bug: b/181236880
Change-Id: I2e86b005219e1fbb54f7cf2465e918b7c077f7ee
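
The type change in the commit above can be illustrated without Neon. The helper below is hypothetical and only shows a 'stride' parameter declared as ptrdiff_t, so callers holding a ptrdiff_t value pass it through with no narrowing conversion.

```c
#include <stddef.h>
#include <stdint.h>

/* Sums one byte per row; stride may be negative or wider than int. */
int sum_column(const uint8_t *buf, ptrdiff_t stride, int rows) {
  int sum = 0;
  int r;
  for (r = 0; r < rows; ++r) {
    sum += buf[(ptrdiff_t)r * stride];
  }
  return sum;
}
```

With an 'int' parameter the same call would trigger -Wshorten-64-to-32 on LP64 targets whenever the argument is a ptrdiff_t.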

3 years agoImplement vpx_convolve8_avg_vert_neon using SDOT instruction
Jonathan Wright [Sun, 23 May 2021 12:35:15 +0000 (13:35 +0100)]
Implement vpx_convolve8_avg_vert_neon using SDOT instruction

Add an alternative AArch64 implementation of
vpx_convolve8_avg_vert_neon for targets that implement the Armv8.4-A
SDOT (signed dot product) instruction.

The existing MLA-based implementation of vpx_convolve8_avg_vert_neon
is retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.

Bug: b/181236880
Change-Id: I971c626116155e1384bff4c76fd3420312c7a15b
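
The selection between an SDOT path and a fallback, keyed on __ARM_FEATURE_DOTPROD, follows a pattern like the sketch below. It computes a plain 16-element signed dot product rather than a convolution, the function name dot16_s8() is hypothetical, and the scalar branch stands in for the MLA-based Neon code that libvpx retains.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#endif

/* Dot product of two 16-element int8 vectors. */
int32_t dot16_s8(const int8_t *a, const int8_t *b) {
#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
  /* SDOT path: four 4-element dot products accumulate into 32-bit lanes. */
  const int8x16_t va = vld1q_s8(a);
  const int8x16_t vb = vld1q_s8(b);
  const int32x4_t acc = vdotq_s32(vdupq_n_s32(0), va, vb);
  return vaddvq_s32(acc); /* add across the vector */
#else
  /* Fallback path (scalar stand-in for the existing implementation). */
  int32_t sum = 0;
  int i;
  for (i = 0; i < 16; ++i) sum += (int32_t)a[i] * b[i];
  return sum;
#endif
}
```

Because the choice is made at compile time, builds that do not enable the dot product extension, and AArch32 builds, never see the SDOT intrinsics.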

3 years agoMerge transpose and permute in Neon SDOT vertical convolution
Jonathan Wright [Sat, 22 May 2021 21:07:25 +0000 (22:07 +0100)]
Merge transpose and permute in Neon SDOT vertical convolution

The original dot-product implementation of vpx_convolve8_vert_neon
used a separate transpose before and after the convolution operation.
This patch merges the first transpose with the TBL permute (necessary
before using SDOT to compute the convolution) to significantly reduce
the amount of data re-arrangement. This new approach also allows for
more effective data re-use between loop iterations.

Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>

Bug: b/181236880
Change-Id: I87fe4dadd312c3ad6216943b71a5410ddf4a1b5b

3 years agoImplement vpx_convolve8_avg_horiz_neon using SDOT instruction
Jonathan Wright [Mon, 17 May 2021 09:53:07 +0000 (10:53 +0100)]
Implement vpx_convolve8_avg_horiz_neon using SDOT instruction

Add an alternative AArch64 implementation of
vpx_convolve8_avg_horiz_neon for targets that implement the Armv8.4-A
SDOT (signed dot product) instruction.

The existing MLA-based implementation of vpx_convolve8_avg_horiz_neon
is retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.

Bug: b/181236880
Change-Id: Ib435107c47c485f325248da87ba5618d68b0c8ed

3 years agoOptimize remaining mse and sse functions in variance_neon.c
Jonathan Wright [Wed, 12 May 2021 15:05:56 +0000 (16:05 +0100)]
Optimize remaining mse and sse functions in variance_neon.c

Implement sum of squared difference calculations in vpx_mse16x16_neon
and vpx_get4x4sse_cs_neon using the ABD and UDOT instructions -
instead of widening subtracts followed by a sequence of MLAs.

The existing implementation is retained for use on CPUs that do not
implement the Armv8.4-A UDOT instruction. This commit also updates
the variable names used in the existing implementations to be more
descriptive.

Bug: b/181236880
Change-Id: Id4ad8ea7c808af1ac9bb5f1b63327ab487e4b1c7

3 years agoImplement vertical convolution using Neon SDOT instruction
Jonathan Wright [Tue, 20 Apr 2021 11:03:56 +0000 (12:03 +0100)]
Implement vertical convolution using Neon SDOT instruction

Add an alternative AArch64 implementation of vpx_convolve8_vert_neon
for targets that implement the Armv8.4-A SDOT (signed dot product)
instruction.

The existing MLA-based implementation of vpx_convolve8_vert_neon is
retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.

Bug: b/181236880
Change-Id: Iebb8c77aba1d45b553b5112f3d87071fef3076f0

3 years agoImplement Neon variance functions using UDOT instruction
Jonathan Wright [Tue, 11 May 2021 12:17:44 +0000 (13:17 +0100)]
Implement Neon variance functions using UDOT instruction

Accelerate Neon variance functions by implementing the sum of squares
calculation using the Armv8.4-A UDOT instruction instead of 4 MLAs.

The previous implementation is retained for use on CPUs that do not
implement the Armv8.4-A dot product instructions.

Bug: b/181236880
Change-Id: I9ab3d52634278b9b6f0011f39390a1195210bc75

3 years agoUse ABD and UDOT to implement Neon sad_4d functions
Jonathan Wright [Mon, 10 May 2021 11:22:03 +0000 (12:22 +0100)]
Use ABD and UDOT to implement Neon sad_4d functions

Implementing sad16_neon using ABD, UDOT instead of ABAL, ABAL2 saves
a cycle and removes resource contention for a single SIMD pipe on
modern out-of-order Arm CPUs. The UDOT accumulation into 32-bit
elements also allows for a faster reduction at the end of each SAD
function.

The existing implementation is retained for CPUs that do not
implement the Armv8.4-A UDOT instruction, and CPUs executing in
AArch32 mode.

Bug: b/181236880
Change-Id: Ibd0da46e86751d2f808c7b1e424f82b046a1aa6f

3 years agoOptimize Neon reductions in sum_neon.h using ADDV instruction
Jonathan Wright [Fri, 7 May 2021 12:25:51 +0000 (13:25 +0100)]
Optimize Neon reductions in sum_neon.h using ADDV instruction

Use the AArch64-only ADDV and ADDLV instructions to accelerate
reductions that add across a Neon vector in sum_neon.h. This commit
also refactors the inline functions to return a scalar instead of a
vector - allowing for optimization of the surrounding code at each
call site.

Bug: b/181236880
Change-Id: Ieed2a2dd3c74f8a52957bf404141ffc044bd5d79

3 years agoimg_alloc_helper: make align var unsigned
James Zern [Sat, 8 May 2021 02:35:25 +0000 (19:35 -0700)]
img_alloc_helper: make align var unsigned

quiets an integer sanitizer warning:
vpx/src/vpx_image.c:101:25: runtime error: implicit conversion from
type 'int' of value -2 (32-bit, signed) to type 'unsigned int' changed
the value to 4294967294 (32-bit, unsigned)

Change-Id: Ifeac31cc80811081c1ba10aadaa94dc36cd46efa
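
The sanitizer message above comes from inverting a signed mask and mixing it with an unsigned value: with a signed 'int' mask of 1, ~mask is -2, which becomes 4294967294 when converted to unsigned int. A sketch of the unsigned alternative follows; the function name and exact expression are assumptions, not a copy of vpx_image.c line 101.

```c
/* Rounds w up to a multiple of (1 << shift) using an unsigned mask. */
unsigned int align_up(unsigned int w, unsigned int shift) {
  const unsigned int align = (1u << shift) - 1; /* e.g. shift 1 gives mask 1 */
  return (w + align) & ~align; /* ~align stays unsigned, no conversion */
}
```

The computed result is the same in both versions; making the mask unsigned only removes the implicit sign-changing conversion that the sanitizer flags.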

3 years agoManually unroll the inner loop of Neon sad16x_4d()
Jonathan Wright [Thu, 6 May 2021 14:11:52 +0000 (15:11 +0100)]
Manually unroll the inner loop of Neon sad16x_4d()

Manually unrolling the inner loop is sufficient to stop the compiler
getting confused and emitting inefficient code.

Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>

Bug: b/181236880
Change-Id: I860768ce0e6c0e0b6286d3fc1b94f0eae95d0a1a

3 years agoOptimize Neon SAD reductions using wider ADDP instruction
Jonathan Wright [Thu, 6 May 2021 13:51:05 +0000 (14:51 +0100)]
Optimize Neon SAD reductions using wider ADDP instruction

Implement AArch64-only paths for each of the Neon SAD reduction
functions, making use of a wider pairwise addition instruction only
available on AArch64.

This change removes the need for shuffling between high and low
halves of Neon vectors - resulting in a faster reduction that requires
fewer instructions.

Bug: b/181236880
Change-Id: I1c48580b4aec27222538eeab44e38ecc1f2009dc

3 years agoMerge "Implement horizontal convolution using Neon SDOT instruction"
James Zern [Wed, 5 May 2021 19:57:10 +0000 (19:57 +0000)]
Merge "Implement horizontal convolution using Neon SDOT instruction"

3 years agoImplement horizontal convolution using Neon SDOT instruction
Jonathan Wright [Sun, 11 Apr 2021 14:20:36 +0000 (15:20 +0100)]
Implement horizontal convolution using Neon SDOT instruction

Add an alternative AArch64 implementation of vpx_convolve8_horiz_neon
for targets that implement the Armv8.4-A SDOT (signed dot product)
instruction.

The existing MLA-based implementation of vpx_convolve8_horiz_neon is
retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.

Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>

Change-Id: I5337286b0f5f2775ad7cdbc0174785ae694363cc

3 years agovp9_denoiser_neon,horizontal_add_s8x16: use vaddlv w/aarch64
James Zern [Tue, 4 May 2021 19:13:17 +0000 (12:13 -0700)]
vp9_denoiser_neon,horizontal_add_s8x16: use vaddlv w/aarch64

this reduces the number of instructions to compute the sum

Change-Id: Icae4d4fb3e343d5b6e5a095c60ac6d171b3e7d54

3 years agotest.mk: enable vp9_denoiser_test w/NEON
James Zern [Tue, 4 May 2021 19:10:21 +0000 (12:10 -0700)]
test.mk: enable vp9_denoiser_test w/NEON

this file uses GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST so it's
safe to enable unconditionally. the filter check fell out of sync with
the code; there are sse2 and neon implementations of the filter.

Change-Id: I2a3336ccef3fb524ca5d9b8f88279240c9a276aa

3 years agoMerge "Add assert for zero_motion_factor range"
Paul Wilkins [Thu, 29 Apr 2021 16:31:52 +0000 (16:31 +0000)]
Merge "Add assert for zero_motion_factor range"

3 years agoAdd assert for zero_motion_factor range
Paul Wilkins [Thu, 29 Apr 2021 10:06:16 +0000 (11:06 +0100)]
Add assert for zero_motion_factor range

Change clamp to an assert so we are warned if changes to input
ranges or defaults in the future lead to an invalid value.

Change-Id: Idb4e0729f477a519bfff3083cdce3891e2fc6faa
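
The difference between clamping and asserting can be sketched as below. The limits and function names are hypothetical placeholders; the log does not state the real zero_motion_factor range.

```c
#include <assert.h>

/* Hypothetical limits, chosen only for this example. */
#define HYPOTHETICAL_MIN_FACTOR 0.5
#define HYPOTHETICAL_MAX_FACTOR 2.0

/* Clamp style: silently forces an out-of-range value back into range. */
double clamp_factor(double f) {
  if (f < HYPOTHETICAL_MIN_FACTOR) return HYPOTHETICAL_MIN_FACTOR;
  if (f > HYPOTHETICAL_MAX_FACTOR) return HYPOTHETICAL_MAX_FACTOR;
  return f;
}

/* Assert style: an invalid value aborts in debug builds instead of hiding. */
double checked_factor(double f) {
  assert(f >= HYPOTHETICAL_MIN_FACTOR && f <= HYPOTHETICAL_MAX_FACTOR);
  return f;
}
```

The assert only fires when NDEBUG is not defined, so it acts as a development-time warning rather than a runtime guard in release builds.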

3 years agoBump ABI version
Cheng Chen [Wed, 28 Apr 2021 20:54:07 +0000 (13:54 -0700)]
Bump ABI version

Due to recent changes to command line options for rate control
parameters.

Change-Id: I1de7cb4ff2850a3ed19ec216dd9d07f64a118e92

3 years agoMerge changes Iebe9842f,I174b67a5,I80ed1a16
James Zern [Wed, 28 Apr 2021 17:30:48 +0000 (17:30 +0000)]
Merge changes Iebe9842f,I174b67a5,I80ed1a16

* changes:
  vpx_convolve_neon: prefer != 0 to > 0 in tests
  vpx_convolve_avg_neon: prefer != 0 to > 0 in tests
  vpx_convolve_copy_neon: prefer != 0 to > 0 in tests

3 years agoMerge "vp8: enc: Fix valid range for under/over_shoot pct"
James Zern [Wed, 28 Apr 2021 02:27:54 +0000 (02:27 +0000)]
Merge "vp8: enc: Fix valid range for under/over_shoot pct"

3 years agovpx_convolve_neon: prefer != 0 to > 0 in tests
James Zern [Wed, 28 Apr 2021 01:02:35 +0000 (18:02 -0700)]
vpx_convolve_neon: prefer != 0 to > 0 in tests

this produces better assembly code; the horizontal convolve is called
with an adjusted intermediate_height where it may over process some rows
so the checks in those functions remain.

Change-Id: Iebe9842f2a13a4960d9a5addde9489452f5ce33a
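
The two loop shapes mentioned in the commit above can be contrasted with a small sketch. Both functions are hypothetical and use no Neon code; they only show when a '!= 0' test is safe and when '> 0' has to stay.

```c
#include <stdint.h>
#include <string.h>

/* Exact-count loop: h must be > 0, and the counter lands exactly on zero. */
void copy_rows(uint8_t *dst, const uint8_t *src, int w, int h) {
  do {
    memcpy(dst, src, (size_t)w);
    src += w;
    dst += w;
  } while (--h != 0);
}

/* Four rows per pass: h may step past zero, so the test must stay '> 0'.
 * Returns the number of rows processed, including any overshoot. */
int rows_processed_in_fours(int h) {
  int done = 0;
  do {
    done += 4;
    h -= 4;
  } while (h > 0);
  return done;
}
```

A '!= 0' test in the second function would never terminate for a height such as 6, which is why loops that may over-process rows keep the '> 0' form.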

3 years agovpx_convolve_avg_neon: prefer != 0 to > 0 in tests
James Zern [Wed, 28 Apr 2021 01:02:35 +0000 (18:02 -0700)]
vpx_convolve_avg_neon: prefer != 0 to > 0 in tests

this produces better assembly code

Change-Id: I174b67a595d7efeb60c921f066302043b1c7d84e

3 years agovpx_convolve_copy_neon: prefer != 0 to > 0 in tests
James Zern [Wed, 28 Apr 2021 01:02:35 +0000 (18:02 -0700)]
vpx_convolve_copy_neon: prefer != 0 to > 0 in tests

this produces better assembly code

Change-Id: I80ed1a165512e941b35a4965faa0c44403357e91

3 years agoAdd limits to Vizier input parameters.
Paul Wilkins [Mon, 26 Apr 2021 14:06:54 +0000 (15:06 +0100)]
Add limits to Vizier input parameters.

Imposed provisional upper and lower limits to each parameter
that can be adjusted in the Vizier ML experiment.

Also in some cases applied secondary limits on the
range of the final "used" values.

Defaults and limits may well require further tuning after
subsequent rounds of experimentation.

Re-factor get_sr_decay_rate().

Change-Id: I28e804ce3d3710f30cd51a203348e4ab23ef06c0

3 years agosync CONTRIBUTING.md w/libwebm
James Zern [Fri, 23 Apr 2021 23:50:48 +0000 (16:50 -0700)]
sync CONTRIBUTING.md w/libwebm

Change-Id: I63ffea52d079b0d50002526e209ae3fb64811bac

3 years agovp8: enc: Fix valid range for under/over_shoot pct
Sreerenj Balachandran [Wed, 21 Apr 2021 18:34:03 +0000 (11:34 -0700)]
vp8: enc: Fix valid range for under/over_shoot pct

The overshoot_pct & undershoot_pct attributes for rate control
are expressed as a percentage of the target bitrate, so the range
should be 0-100.

Change-Id: I67af3c8be7ab814c711c2eaf30786f1e2fa4f5a3

3 years agoFurther normalization of Vizier parameters.
Paul Wilkins [Tue, 20 Apr 2021 16:26:22 +0000 (17:26 +0100)]
Further normalization of Vizier parameters.

Further changes to normalize the Vizier command line parameters.
The intent is that the default behavior for any given parameter
is signaled by the value 1.0 (expressed on the command line as a
rational).

The final values used in the two pass code are obtained by multiplying
the passed in factors by the default values if use_vizier_rc_params is 1.
Where use_vizier_rc_params is 0 the values are explicitly set to
the defaults.

This patch also changes the default value of each parameter to 1.0
even if not set explicitly. This should ensure safe/default behavior
if the user sets use_vizier_rc_params to 1 but does not set all the
individual parameters.

Change-Id: Ied08b3c22df18f42f446a4cc9363473cad097f69
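
The normalization scheme in the commit above reduces to one multiplication per parameter. The sketch below uses a hypothetical rational_t type and resolve_param() function, and the fallback for a zero denominator is an added assumption.

```c
/* Hypothetical rational type for a factor given as num/den. */
typedef struct {
  int num;
  int den;
} rational_t;

/* Returns the value the two pass code would use for one parameter. */
double resolve_param(int use_vizier_rc_params, rational_t factor,
                     double default_value) {
  if (!use_vizier_rc_params) return default_value; /* factors are ignored */
  if (factor.den == 0) return default_value; /* assumed fallback */
  return default_value * ((double)factor.num / factor.den);
}
```

A factor of 1/1 therefore reproduces the default exactly, which is what makes an unset parameter safe when use_vizier_rc_params is 1.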

3 years agoPass vizier rd parameter values
Cheng Chen [Thu, 15 Apr 2021 05:15:54 +0000 (22:15 -0700)]
Pass vizier rd parameter values

Add command line options for three rd parameters.
They are controlled by --use_vizier_rc_params, together with
other rc parameters.
If not set from command line, current default values will be used.

Change-Id: Ie1b9a98a50326551cc1d5940c4b637cb01a61aa0

3 years agoMerge "Set vizier rc parameters"
Cheng Chen [Wed, 14 Apr 2021 16:31:40 +0000 (16:31 +0000)]
Merge "Set vizier rc parameters"

3 years agoSet vizier rc parameters
Cheng Chen [Tue, 13 Apr 2021 18:59:40 +0000 (11:59 -0700)]
Set vizier rc parameters

If --use-vizier-rc-params=1 is passed, the rc parameters are overwritten
by the passed-in values. If --use-vizier-rc-params=0, the rc parameters
keep their default values.

Change-Id: I7a3e806e0918f49e8970997379a6e99af6bb7cac

3 years agoMerge "Removed unused constant"
Paul Wilkins [Tue, 13 Apr 2021 19:04:49 +0000 (19:04 +0000)]
Merge "Removed unused constant"

3 years agoRemoved unused constant
Paul Wilkins [Mon, 12 Apr 2021 12:51:44 +0000 (13:51 +0100)]
Removed unused constant

Deleted #define that is no longer referenced.

Change-Id: If0b132c5a40dd8910f535fffdee7d2d1c7df4748

3 years agovpx_image: clear user provided vpx_image_t early
James Zern [Fri, 9 Apr 2021 00:34:16 +0000 (17:34 -0700)]
vpx_image: clear user provided vpx_image_t early

this avoids uninitialized values and potential misuse of them which
could lead to a crash should the function fail

this is the same fix that was applied in libaom:
d0cac70b5 Fix a free on invalid ptr when img allocation fails

Bug: webm:1722
Change-Id: If7a8d08c4b010f12e2e1d848613c0fa7328f1f9c
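
Why clearing the caller's struct first matters can be shown with a stand-in type. Everything below is hypothetical (image_t, image_init(), image_free()); it is not the vpx_image_t allocation code, only the same ordering idea.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a user-provided image descriptor. */
typedef struct {
  unsigned char *data; /* owned buffer, or NULL */
  unsigned int w, h;
} image_t;

void image_free(image_t *img) {
  free(img->data); /* free(NULL) is a no-op */
  img->data = NULL;
}

/* Returns 0 on success, -1 on failure. */
int image_init(image_t *img, unsigned int w, unsigned int h) {
  /* Clear first, so every failure path below leaves a safe struct. */
  memset(img, 0, sizeof(*img));
  if (w == 0 || h == 0) return -1;
  img->data = (unsigned char *)calloc((size_t)w * h, 1);
  if (img->data == NULL) return -1;
  img->w = w;
  img->h = h;
  return 0;
}
```

If the memset came after the validation, a caller that frees the struct after a failed init would hand an uninitialized pointer to free().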

3 years agoMerge "Fix compilation for CONFIG_RATE_CTRL"
Cheng Chen [Wed, 7 Apr 2021 19:03:53 +0000 (19:03 +0000)]
Merge "Fix compilation for CONFIG_RATE_CTRL"

3 years agoMerge "Delete unused constants."
Paul Wilkins [Wed, 7 Apr 2021 12:53:07 +0000 (12:53 +0000)]
Merge "Delete unused constants."

3 years agoMerge "Change zm_factor for Vizier."
Paul Wilkins [Wed, 7 Apr 2021 12:52:47 +0000 (12:52 +0000)]
Merge "Change zm_factor for Vizier."

3 years agoFix compilation for CONFIG_RATE_CTRL
Cheng Chen [Fri, 2 Apr 2021 21:27:16 +0000 (14:27 -0700)]
Fix compilation for CONFIG_RATE_CTRL

Recently, some function signatures have been changed.
This change fixes compilation error if --enable-rate-ctrl is used.

Change-Id: Ib8e9cb5e181ba1d4a6969883e377f3dd93e9289a

3 years agoAdjust end to end psnr value
Cheng Chen [Wed, 7 Apr 2021 00:20:39 +0000 (17:20 -0700)]
Adjust end to end psnr value

A recent change leads to slight difference of encoding results:
d3aaac367 Change calculation of rd multiplier,
which is caught by Jenkins nightly test.

Adjust the threshold to silence the test failure.

BUG=webm:1725

Change-Id: I7e8b3a26b72c831ae4d88d0fca681b354314739d

3 years agoChange zm_factor for Vizier.
Paul Wilkins [Tue, 6 Apr 2021 19:05:48 +0000 (20:05 +0100)]
Change zm_factor for Vizier.

Changes the exposed zm_factor parameter.

This patch alters the meaning of the zm_factor
parameter that will be exposed for the Vizier project.

The previous power factor was hard to interpret in terms
of its meaning and effect and has been replaced by a linear factor.
Given that the initial Vizier results suggested a lower zero motion
effect for all formats, the default impact has been reduced.

The patch as it stands gives a modest improvement for PSNR
but is slightly down on some sets for SSIM.

(overall psnr, ssim % bdrate change: -ve is better)

lowres    -0.111, 0.001
ugc360p   -0.282, -0.068
midres2   -0.183, 0.059
hdres2    -0.042, 0.172

Change-Id: Id6566433ceed8470d5fad1f30282daed56de385d

3 years agoDelete unused constants.
Paul Wilkins [Tue, 6 Apr 2021 19:08:01 +0000 (20:08 +0100)]
Delete unused constants.

Delete some #defines that are no longer needed.

Change-Id: I9e4e4df10716598b0d62b0c70f538d4b78a32296