James Zern [Sat, 6 Nov 2021 17:42:46 +0000 (10:42 -0700)]
vp8 encoder: fix some integer overflows
cap the bitrate to 1000Mbps to avoid many instances of bitrate * 3 / 2
overflowing.
this adds coverage for 2048x2048 in the default test for VP8 with TODOs
for issues at that resolution for VP9 and at max resolution for both.
Bug: b/189602769
Bug: chromium:1264506
Bug: webm:1748
Bug: webm:1749
Bug: webm:1750
Bug: webm:1751
Change-Id: Iedee4dd8d3609c2504271f94d22433dfcd828429
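The capping idea in this commit can be sketched as follows. This is only an illustration: it assumes the bitrate is held in kbps in a 32-bit int, and EXAMPLE_MAX_BITRATE_KBPS and both function names are hypothetical, not libvpx identifiers.

```c
/* Hypothetical cap: 1000 Mbps expressed in kbps (assumed unit). */
#define EXAMPLE_MAX_BITRATE_KBPS 1000000

/* Clamp a user-supplied bitrate into [0, EXAMPLE_MAX_BITRATE_KBPS]. */
static int example_cap_bitrate(int bitrate_kbps) {
  if (bitrate_kbps < 0) return 0;
  if (bitrate_kbps > EXAMPLE_MAX_BITRATE_KBPS) return EXAMPLE_MAX_BITRATE_KBPS;
  return bitrate_kbps;
}

/* With the cap applied first, capped * 3 is at most 3000000, so the
 * multiplication cannot overflow a 32-bit int. */
static int example_scaled_bitrate(int bitrate_kbps) {
  const int capped = example_cap_bitrate(bitrate_kbps);
  return capped * 3 / 2;
}
```

Capping once at the input keeps every later `bitrate * 3 / 2` style expression in range without touching each call site.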
James Zern [Mon, 8 Nov 2021 20:57:12 +0000 (12:57 -0800)]
vp8,calc_pframe_target_size: fix integer overflow
this is similar to the fix for calc_iframe_target_size:
5f345a924 Avoid overflow in calc_iframe_target_size
Bug: chromium:1264506
Change-Id: I2f0e161cf9da59ca0724692d581f1594c8098ebb
James Zern [Sat, 6 Nov 2021 17:42:46 +0000 (10:42 -0700)]
vp8_update_rate_correction_factors: fix integer overflow
the intermediate value in the correction_factor calculation may exceed
integer bounds
Bug: b/189602769
Change-Id: I75726b12f3095663911d78333f3ea26eb6dee21e
James Zern [Thu, 4 Nov 2021 20:01:01 +0000 (20:01 +0000)]
Merge "update tools/cpplint.py" into main
James Zern [Wed, 3 Nov 2021 23:23:06 +0000 (16:23 -0700)]
update tools/cpplint.py
https://github.com/google/styleguide.git
100755 blob 4a82bde4f95cef8103520bc2c019483397ec51f4 cpplint/cpplint.py
Bug: aomedia:3178
Change-Id: I9e11d647096fc2082b18d74731026dabb52639bb
James Zern [Wed, 3 Nov 2021 00:19:10 +0000 (17:19 -0700)]
tools_common.h: add VPX_TOOLS_FORMAT_PRINTF
and use it to set the format attribute for printf like functions. this
allows the examples to be built with -Wformat-nonliteral without
producing warnings.
Bug: webm:1744
Change-Id: I26b4c41c9a42790053b1ae0e4a678af8f2cd1d82
Fixed: webm:1744
James Zern [Tue, 2 Nov 2021 23:29:52 +0000 (16:29 -0700)]
vpx_codec_internal.h: add LIBVPX_FORMAT_PRINTF
and use it to set the format attribute for the printf like function
vpx_internal_error(). this allows the main library to be built with
-Wformat-nonliteral without producing warnings; the examples will be
handled in a followup.
Bug: webm:1744
Change-Id: Iebc322e24db35d902c5a2b1ed767d2e10e9c91b9
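A sketch of how a format-attribute macro of this kind is typically written and applied. EXAMPLE_FORMAT_PRINTF and example_format_message() are hypothetical stand-ins; the actual definition of LIBVPX_FORMAT_PRINTF is not shown in this log and may differ.

```c
#include <stdarg.h>
#include <stdio.h>

/* Expands to the GCC/Clang printf format attribute, or to nothing on
 * compilers that do not support it. */
#if defined(__GNUC__)
#define EXAMPLE_FORMAT_PRINTF(fmt_index, first_arg_index) \
  __attribute__((__format__(__printf__, fmt_index, first_arg_index)))
#else
#define EXAMPLE_FORMAT_PRINTF(fmt_index, first_arg_index)
#endif

/* The format string is parameter 3 and the variadic arguments start at 4. */
static int example_format_message(char *buf, size_t size, const char *fmt, ...)
    EXAMPLE_FORMAT_PRINTF(3, 4);

static int example_format_message(char *buf, size_t size, const char *fmt,
                                  ...) {
  va_list ap;
  int written;
  va_start(ap, fmt);
  /* Because of the attribute, the compiler knows fmt was checked at the
   * call site, so -Wformat-nonliteral does not warn here. */
  written = vsnprintf(buf, size, fmt, ap);
  va_end(ap);
  return written;
}
```

With the attribute in place the compiler also checks each caller's arguments against the format string.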
James Zern [Fri, 15 Oct 2021 20:08:58 +0000 (20:08 +0000)]
Merge "vp8_yv12_realloc_frame_buffer: move allocation check" into main
James Zern [Tue, 12 Oct 2021 18:57:39 +0000 (11:57 -0700)]
test/Android.mk: import LICENSE indicators from AOSP
https://android-review.googlesource.com/c/platform/external/libvpx/+/1853628
https://android.googlesource.com/platform/external/libvpx/+/e40f8afb1e51d3bd13d662c1881e3cfb616fa2b8
Change-Id: I15f185ab7c7661f4456c4ad7296fdda01dfb8d53
James Zern [Mon, 11 Oct 2021 20:32:51 +0000 (20:32 +0000)]
Merge "Android.mk: import LICENSE indicators from AOSP" into main
James Zern [Sat, 9 Oct 2021 17:33:37 +0000 (10:33 -0700)]
Android.mk: import LICENSE indicators from AOSP
https://android-review.googlesource.com/c/platform/external/libvpx/+/1588942
https://android.googlesource.com/platform/external/libvpx/+/099828b5c770ef8630741721be4b6c25a8394204
Change-Id: Ieca1c882f82bcbc7546944b43af7fab358f925d2
James Zern [Fri, 8 Oct 2021 23:24:23 +0000 (16:24 -0700)]
vp8_yv12_realloc_frame_buffer: move allocation check
to before the memset used under msan to avoid any spurious reports in
OOM conditions
Change-Id: I0c4ee92829bbcb356e94f503a4615caf891bb49d
Jerome Jiang [Thu, 7 Oct 2021 17:47:25 +0000 (10:47 -0700)]
Merge branch 'smew' into main
Bug: webm:1732
Change-Id: Id782a897d8005d316dc5b72859657c219edabf30
Jerome Jiang [Tue, 5 Oct 2021 22:57:34 +0000 (15:57 -0700)]
Update AUTHORS and version info in libs.mk
Bug: webm:1732
Change-Id: I29ce77c7d02bd2f5cb0ef8412333df032744b668
James Zern [Fri, 1 Oct 2021 20:46:02 +0000 (13:46 -0700)]
{vp8,vp9}_set_roi_map: fix validation with INT_MIN
previously ranges were checked with abs() whose behavior is undefined
with INT_MIN. this fixes a crash when the original value is returned and
is later used as an offset into a table.
Bug: webm:1742
Change-Id: I345970b75c46699587a4fbc4a059e59277f4c2c8
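The problem with abs() can be avoided by comparing against both bounds directly. The helper below is a hypothetical illustration, not the code from this change; it assumes range is non-negative.

```c
#include <limits.h>

/* Returns 1 when -range <= value <= range, 0 otherwise.
 * Assumes range >= 0, so -range is always representable.
 * Unlike abs(value) <= range, this is well defined for value == INT_MIN,
 * where abs() has undefined behavior and commonly returns INT_MIN. */
static int example_delta_in_range(int value, int range) {
  return value >= -range && value <= range;
}
```

For instance, `example_delta_in_range(INT_MIN, 63)` is rejected, whereas an `abs(INT_MIN) <= 63` check may pass on implementations where abs(INT_MIN) yields a negative result.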
Jerome Jiang [Mon, 4 Oct 2021 17:43:28 +0000 (17:43 +0000)]
Merge changes If2ef4400,I345970b7 into main
* changes:
vpx_roi_map: add delta range info
{vp8,vp9}_set_roi_map: fix validation with INT_MIN
James Zern [Fri, 1 Oct 2021 22:42:50 +0000 (15:42 -0700)]
vpx_roi_map: add delta range info
Change-Id: If2ef4400562075b4e7abadc01638a46c0c7f1859
James Zern [Fri, 1 Oct 2021 20:46:02 +0000 (13:46 -0700)]
{vp8,vp9}_set_roi_map: fix validation with INT_MIN
previously ranges were checked with abs() whose behavior is undefined
with INT_MIN. this fixes a crash when the original value is returned and
is later used as an offset into a table.
Bug: webm:1742
Change-Id: I345970b75c46699587a4fbc4a059e59277f4c2c8
Marco Paniconi [Fri, 1 Oct 2021 22:25:12 +0000 (22:25 +0000)]
Merge "vp8: Condition decimation drop logic on drop_frames_allowed" into main
Marco Paniconi [Fri, 1 Oct 2021 20:16:56 +0000 (13:16 -0700)]
vp8: Condition decimation drop logic on drop_frames_allowed
This lets the user ensure that a frame is encoded
when drop_frames is turned off (on the fly), regardless of
the state of the buffer.
Change-Id: Ia7b39b93fe3721dd586bdbede72c525db87b6890
Marco Paniconi [Fri, 1 Oct 2021 18:54:53 +0000 (11:54 -0700)]
vp8: For screen mode: clip buffer from below
Condition already existed for screen content mode,
but only when frame-dropper was off. Remove the
frame drop condition.
Change-Id: Ie7357041f5ca05b01e78b4bd3b40da060382591b
Jerome Jiang [Mon, 27 Sep 2021 22:52:53 +0000 (15:52 -0700)]
CHANGELOG for Smew v1.11.0
Bug: webm:1732
Change-Id: I6038f401cf1dfdcaca85b81d0b8b2c04967b44dd
Jerome Jiang [Mon, 20 Sep 2021 20:37:43 +0000 (13:37 -0700)]
Cap duration to avoid overflow
Bug: webm:1728
Change-Id: Id13475660fa921e8ddcc89847e978da4c8d85886
(cherry picked from commit 09775194ffdb84b4979f3988e7ef301575b661df)
Wan-Teh Chang [Fri, 10 Sep 2021 22:54:51 +0000 (15:54 -0700)]
Define the VPX_NO_RETURN macro for MSVC
Define VPX_NO_RETURN as __declspec(noreturn) for MSVC. See
https://docs.microsoft.com/en-us/cpp/cpp/noreturn?view=msvc-160
This requires moving VPX_NO_RETURN before function declarations because
__declspec(noreturn) must be placed there. Fortunately GCC's
__attribute__((noreturn)) can be placed either before or after function
declarations.
Change-Id: Id9bb0077e2a4f16ec2ca9c913dd93673a0e385cf
(cherry picked from commit 8a6fbc0b4eb8538e213782bcdc3969a08b44e73b)
Jerome Jiang [Fri, 24 Sep 2021 21:56:00 +0000 (14:56 -0700)]
vp8 rc: Clear system state at the end of calls
Clear system state at the end of rc calls to make sure the state is
consistent before and after
Change-Id: I59fe9c99485b1a8603c20db37961339b7575455f
Jerome Jiang [Thu, 23 Sep 2021 22:10:57 +0000 (22:10 +0000)]
Merge "vp8 rc: support temporal layers" into main
Jerome Jiang [Thu, 16 Sep 2021 17:16:44 +0000 (10:16 -0700)]
vp8 rc: support temporal layers
Change-Id: I2c7d5de0e17b072cb763f1659b1badce4fe0b82b
Jerome Jiang [Wed, 22 Sep 2021 17:27:26 +0000 (17:27 +0000)]
Merge "Cap duration to avoid overflow" into main
Jerome Jiang [Mon, 20 Sep 2021 20:37:43 +0000 (13:37 -0700)]
Cap duration to avoid overflow
Bug: webm:1728
Change-Id: Id13475660fa921e8ddcc89847e978da4c8d85886
Jerome Jiang [Thu, 16 Sep 2021 17:19:09 +0000 (10:19 -0700)]
vp8 rc: explicit cast to avoid VS build failure
Change-Id: I6a4daca12b79cf996964661e1af85aa6e258b446
Wan-Teh Chang [Fri, 10 Sep 2021 22:54:51 +0000 (15:54 -0700)]
Define the VPX_NO_RETURN macro for MSVC
Define VPX_NO_RETURN as __declspec(noreturn) for MSVC. See
https://docs.microsoft.com/en-us/cpp/cpp/noreturn?view=msvc-160
This requires moving VPX_NO_RETURN before function declarations because
__declspec(noreturn) must be placed there. Fortunately GCC's
__attribute__((noreturn)) can be placed either before or after function
declarations.
Change-Id: Id9bb0077e2a4f16ec2ca9c913dd93673a0e385cf
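A sketch of the macro placement described above. EXAMPLE_NO_RETURN and the two functions are hypothetical; the actual VPX_NO_RETURN definition is not reproduced in this log.

```c
#include <stdio.h>
#include <stdlib.h>

/* MSVC spells the annotation __declspec(noreturn); GCC and Clang use
 * __attribute__((noreturn)). Other compilers get an empty macro. */
#if defined(_MSC_VER)
#define EXAMPLE_NO_RETURN __declspec(noreturn)
#elif defined(__GNUC__)
#define EXAMPLE_NO_RETURN __attribute__((noreturn))
#else
#define EXAMPLE_NO_RETURN
#endif

/* The macro goes before the declaration, the one position that both
 * spellings accept. */
EXAMPLE_NO_RETURN static void example_fatal(const char *msg);

static void example_fatal(const char *msg) {
  fprintf(stderr, "fatal: %s\n", msg);
  exit(EXIT_FAILURE);
}

/* Since example_fatal() never returns, the compiler does not warn about
 * a path that falls through to the division. */
static int example_checked_div(int a, int b) {
  if (b == 0) example_fatal("division by zero");
  return a / b;
}
```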
Jerome Jiang [Tue, 31 Aug 2021 17:22:22 +0000 (10:22 -0700)]
Add vp8 support to rc lib
For 1 layer CBR only.
Support for temporal layers comes later.
Rename the library to libvpxrc
Bug: b/188853141
Change-Id: Ib7f977b64c05b1a0596870cb7f8e6768cb483850
Jerome Jiang [Wed, 8 Sep 2021 23:52:51 +0000 (16:52 -0700)]
vp8 rc: always update correction factor
Change-Id: Id40b9cb5a85a15fb313a2a93f14f6768259f7c15
Jerome Jiang [Thu, 2 Sep 2021 23:15:13 +0000 (16:15 -0700)]
Add codec control for vp8 external rc
disable cyclic refresh
Change-Id: I7905602919d5780831fad840577e97730ce0afc2
Jerome Jiang [Tue, 24 Aug 2021 21:30:54 +0000 (14:30 -0700)]
vp9 rc lib: Allow aq 3 to work for SVC with unit test
Also use round to cast float to int with more accurate calculation to
avoid error accumulation which causes qp to be different after ~290
frames.
Change-Id: Iff65a8fdc67401814fd253dbf148afe9887df97f
James Zern [Fri, 30 Jul 2021 00:48:08 +0000 (00:48 +0000)]
Merge "vpx_ports/x86.h: sync with aom_ports/x86.h" into main
Hirokazu Honda [Thu, 29 Jul 2021 17:42:35 +0000 (02:42 +0900)]
vp9 rc: Fills VP9_COMP zero at initialization
Change-Id: Ib1a544ce87e8fdbe23c0e54b6426ee228011b126
James Zern [Mon, 26 Jul 2021 23:52:56 +0000 (16:52 -0700)]
vpx_ports/x86.h: sync with aom_ports/x86.h
adds a few comments and makes the file ascii:
854b2766a Replace non-ASCII characters
Change-Id: I6c2d76b293158bcad9f1ded7a91a81bda1e700fb
Peter Kasting [Mon, 26 Jul 2021 10:57:55 +0000 (03:57 -0700)]
Fix some instances of -Wunused-but-set-variable.
Bug: chromium:1203071
Change-Id: Ieb628f95d676ba3814b5caf8a02a884330928c77
Yunqing Wang [Mon, 26 Jul 2021 20:13:38 +0000 (20:13 +0000)]
Merge "Remove unused old FP_MB_STATS code" into main
Yunqing Wang [Mon, 26 Jul 2021 19:19:02 +0000 (19:19 +0000)]
Merge "Clean up allow_partition_search_skip code" into main
Yunqing Wang [Sun, 25 Jul 2021 22:42:59 +0000 (22:42 +0000)]
Merge "Disable allow_partition_search_skip feature" into main
Yunqing Wang [Sat, 24 Jul 2021 05:45:45 +0000 (22:45 -0700)]
Remove unused old FP_MB_STATS code
Change-Id: I78ac1f8ce1598de295efd2ac1fe8244072d9b501
Yunqing Wang [Sat, 24 Jul 2021 05:34:01 +0000 (22:34 -0700)]
Clean up allow_partition_search_skip code
Change-Id: Ia05157fc3e613d93f10df5abddd77a740a0005ca
Yunqing Wang [Fri, 23 Jul 2021 17:55:10 +0000 (10:55 -0700)]
Disable allow_partition_search_skip feature
This feature was added to help speed up still images and slideshows.
It no longer worked, and thus was disabled. Code cleanup will
follow.
This had negligible impact on regular test sets. Borg test result
on ugc360p set at speed 3:
avg_psnr:  ovr_psnr:  ssim:    speed:
-0.244     -0.278     -0.153   -0.973
Change-Id: If74edabce0c93be1361e645ffd2eec063c2db76b
Jerome Jiang [Fri, 23 Jul 2021 18:20:39 +0000 (18:20 +0000)]
Merge "Add control to get QP for all spatial layers" into main
Jerome Jiang [Wed, 21 Jul 2021 21:32:27 +0000 (14:32 -0700)]
Add control to get QP for all spatial layers
Change-Id: I77a9884351e71649c8f8632293d9515c60f6adbc
Jerome Jiang [Thu, 22 Jul 2021 17:07:58 +0000 (17:07 +0000)]
Merge "Use round to be more accurate casting float to int" into main
Jerome Jiang [Tue, 29 Jun 2021 21:48:35 +0000 (14:48 -0700)]
Add cyclic refresh to vp9 rtc external ratecontrol
Change-Id: Ia2a881399aa31ca0f34481b975362ddd4ad87f1c
Jerome Jiang [Thu, 15 Jul 2021 23:05:16 +0000 (16:05 -0700)]
Use round to be more accurate casting float to int
Change-Id: Ifd5961917831752b176dd75d39d6b2cba6ce72fa
Jerome Jiang [Mon, 19 Jul 2021 21:00:35 +0000 (21:00 +0000)]
Merge "Refactor rtc rate control test" into main
Jerome Jiang [Mon, 12 Jul 2021 21:04:12 +0000 (14:04 -0700)]
Refactor rtc rate control test
Remove golden files. Run actual encoding as the ground truth.
Change-Id: I1cea001278c1e9409bb02d33823cf69192c790a4
Bohan Li [Thu, 15 Jul 2021 20:21:35 +0000 (13:21 -0700)]
Avoid chroma resampling for 420mpeg2 input
BUG=aomedia:3080
Change-Id: I4ed81abf4b799224085485560f675c10c318cde6
Jerome Jiang [Tue, 13 Jul 2021 18:54:34 +0000 (11:54 -0700)]
Add codec control for rtc external ratectrl lib
This will do 3 things:
- Turn off low motion computation
- Turn off the gf update constraint on key frame frequency
- Turn off content mode for cyclic refresh
These are used to verify that the external ratectrl lib works as expected.
Change-Id: Ic6e61498de82d6b3973e58df246cf5e05f838680
Wan-Teh Chang [Thu, 8 Jul 2021 22:17:48 +0000 (15:17 -0700)]
Check for addition overflows in vpx_img_set_rect()
Check for x + w and y + h overflows in vpx_img_set_rect().
Move the declaration of the local variable 'data' to the block it is
used in.
Change-Id: I6bda875e1853c03135ec6ce29015bcc78bb8b7ba
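A sketch of an overflow-safe form of the x + w and y + h checks. example_rect_fits() is a hypothetical helper with illustrative parameters; it is not the body of vpx_img_set_rect().

```c
#include <limits.h>

/* Returns 1 when the rectangle (x, y, w, h) lies inside an image of size
 * img_w x img_h. Each sum is only evaluated after checking that it cannot
 * wrap around, so a huge x or y cannot sneak past the bounds test. */
static int example_rect_fits(unsigned int x, unsigned int y, unsigned int w,
                             unsigned int h, unsigned int img_w,
                             unsigned int img_h) {
  return x <= UINT_MAX - w && x + w <= img_w &&
         y <= UINT_MAX - h && y + h <= img_h;
}
```

Without the first comparison in each pair, x = UINT_MAX with w = 2 wraps to 1 and would be accepted by a plain `x + w <= img_w` test.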
Wan-Teh Chang [Thu, 8 Jul 2021 22:08:05 +0000 (15:08 -0700)]
Document vpx_img_set_rect() more precisely
Document the side effects and return value of vpx_img_set_rect() more
precisely.
Change-Id: Id1120bc478ff090a70b4ddd23c4798026bbefe10
Yaowu Xu [Thu, 8 Jul 2021 19:59:34 +0000 (19:59 +0000)]
Merge "Avoid overflow in calc_iframe_target_size" into main
Jerome Jiang [Fri, 2 Jul 2021 22:59:08 +0000 (22:59 +0000)]
Merge "Add codec control to get loopfilter level" into main
Jerome Jiang [Fri, 2 Jul 2021 18:28:48 +0000 (11:28 -0700)]
Add codec control to get loopfilter level
Change-Id: I70d417da900082160e7ba53315af98eceede257c
James Zern [Fri, 2 Jul 2021 05:16:42 +0000 (22:16 -0700)]
ratectrl_rtc.h: quiet MSVC int64_t->int conv warning
target_bandwidth is int64_t, but layer_target_bitrate[0] is an int. this
is safe in the only place it's set because target_bandwidth defaults to
1000. target_bandwidth is later used to populate the cpi's target, which
is an unsigned int so there may be further fixes/cleanups that can be
done.
Change-Id: I35dbaa2e55a0fca22e0e2680dcac9ea4c6b2815a
Jorge E. Moreira [Wed, 30 Jun 2021 18:33:51 +0000 (11:33 -0700)]
Avoid overflow in calc_iframe_target_size
The changed product was observed to attempt to multiply 1800 by 2500000,
which overflows unsigned 32 bits. Converting to unsigned 64 bits first
and testing whether the final result fits in 32 bits solves the problem.
BUG=b:179686142
Change-Id: I5d27317bf14b0311b739144c451d8e172db01945
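The widening approach can be sketched as below. The function name, the >> 4 scaling and the choice to clamp when the result does not fit are assumptions made for illustration; only the operands 1800 and 2500000 come from the commit message.

```c
#include <stdint.h>

/* Multiply in 64 bits, scale, then test whether the final result fits in
 * 32 bits. 1800 * 2500000 = 4500000000 exceeds UINT32_MAX (4294967295),
 * so a 32-bit multiplication would wrap. */
static uint32_t example_boosted_target(uint32_t boost,
                                       uint32_t per_frame_bandwidth) {
  const uint64_t target = ((uint64_t)boost * per_frame_bandwidth) >> 4;
  /* Clamping is one possible policy for an out-of-range result. */
  return target > UINT32_MAX ? UINT32_MAX : (uint32_t)target;
}
```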
Marco Paniconi [Tue, 29 Jun 2021 18:34:46 +0000 (18:34 +0000)]
Merge "vp9-rtc: Extract content dependency in cyclic refresh" into main
Cheng Chen [Tue, 29 Jun 2021 16:48:29 +0000 (16:48 +0000)]
Merge "Disallow skipping transform and quantization" into main
Cheng Chen [Thu, 17 Jun 2021 22:36:18 +0000 (15:36 -0700)]
Disallow skipping transform and quantization
The encoder has a feature to skip transform and quantization based
on model rd analysis. The model-based analysis can let the encoder
skip transform and quantization even when the prediction is bad,
leading to bad reconstructed blocks that are visually intrusive and
look like coding errors.
We add a speed feature to guard the skipping feature.
Due to the risk of bad perceptual quality, we disallow such skipping
by default.
On hdres test set, speed 2, the coding performance difference is 0.025%,
speed difference is 1.2%, which can be considered insignificant.
BUG=webm:1729
Change-Id: I48af01ae8dcc7a76c05c695f3f3e68b866c89574
Marco Paniconi [Fri, 25 Jun 2021 06:34:36 +0000 (23:34 -0700)]
vp9-rtc: Extract content dependency in cyclic refresh
For usage in the external RC. When content_mode = 0,
the cyclic refresh has no dependency on the content
(motion, spatial variance, motion vectors, etc.).
Compared to content_mode = 1 on the rtc set at speed 7,
content_mode = 0 regresses on some clips (~3-5%), but the
overall/average bdrate loss is ~1-2%.
Comparing aq_mode=3 with content_mode = 0, vs aq_mode=3:
~14% avg/overall bdrate gain, but ~3-7% regression
on some hard motion clips (e.g., street).
Change-Id: I93117fabb8f7f89032c15baf1292b201e8c07362
Jerome Jiang [Thu, 24 Jun 2021 20:13:50 +0000 (13:13 -0700)]
Add constructor to VP9RateControlRtcConfig
Also add max_inter_bitrate_pct
Change-Id: Ie2c0e7f1397ca0bb55214251906412cdf24e42e2
Jerome Jiang [Tue, 22 Jun 2021 22:13:38 +0000 (22:13 +0000)]
Merge "rc: turn off gf constrain for external RC" into main
Jerome Jiang [Tue, 22 Jun 2021 00:22:51 +0000 (17:22 -0700)]
rc: turn off gf constrain for external RC
Added a new flag in rate control which turns off the gf interval
constraint on key frame frequency for external RC.
It remains on for libvpx.
Change-Id: I18bb0d8247a421193f023619f906d0362b873b31
James Zern [Tue, 22 Jun 2021 03:02:58 +0000 (03:02 +0000)]
Merge "test-data.sha1: add missing sha sums" into main
Angie Chiang [Tue, 22 Jun 2021 01:44:02 +0000 (01:44 +0000)]
Merge changes I9f0852a0,Ieecb98a7 into main
* changes:
Add use_simple_encode_api to oxcf
Fix flaky assertions in SimpleEncode
Angie Chiang [Fri, 18 Jun 2021 03:23:30 +0000 (20:23 -0700)]
Add use_simple_encode_api to oxcf
Use this flag to change the encoder behavior when
SimpleEncode APIs are used
BUG=webm:1733
Change-Id: I9f0852a03ff99faa01cdd8eee8ab71718cc58632
Angie Chiang [Fri, 18 Jun 2021 23:09:41 +0000 (16:09 -0700)]
Fix flaky assertions in SimpleEncode
Bug: webm:1731
Change-Id: Ieecb98a7ac19e6291acd5d51432dc6a3789e9552
James Zern [Mon, 21 Jun 2021 20:33:44 +0000 (13:33 -0700)]
test-data.sha1: add missing sha sums
for rc_interface_test_one_layer_vbr and
rc_interface_test_one_layer_vbr_periodic_key added in:
1f45e7b07 vp9 rc: add vbr to rtc rate control library
Change-Id: I8bfa3698284c8ff289e830f7b8fa1ca42b752563
Jerome Jiang [Fri, 18 Jun 2021 23:25:53 +0000 (23:25 +0000)]
Merge "vp9 rc: add vbr to rtc rate control library" into main
Jerome Jiang [Tue, 15 Jun 2021 19:54:13 +0000 (12:54 -0700)]
vp9 rc: add vbr to rtc rate control library
Change-Id: I3d2565572c2b905966d60bcaa6e5e6f057b1bd51
James Zern [Fri, 18 Jun 2021 18:56:27 +0000 (11:56 -0700)]
normalize vp9_calc_[ip]frame declarations and definitions
fixes warnings under visual studio:
vp9\encoder\vp9_ratectrl.c(2012): warning C4028: formal parameter 1
different from declaration
vp9\encoder\vp9_ratectrl.c(2027): warning C4028: formal parameter 1
different from declaration
Change-Id: Ia0740db597fb7a259f90d362b483f58662f9f584
Marco Paniconi [Thu, 17 Jun 2021 19:00:33 +0000 (12:00 -0700)]
vp9: Adjust logic for gf update in 1 pass vbr
This reduces some regression when external RC
is used, for which avg_frame_low_motion is not
set/updated (=0).
Change-Id: I2408e62bd97592e892cefa0f183357c641aa5eea
Chunbo Hua [Wed, 16 Jun 2021 08:51:44 +0000 (01:51 -0700)]
Initialize VP9EncoderConfig profile and bit depth
Change-Id: I5c42013a08677cdef8d47f348458118338ff0138
Jerome Jiang [Tue, 15 Jun 2021 21:55:29 +0000 (14:55 -0700)]
Change the data path in svc rate control test
Change-Id: Iba58e2aa2578964b5c8b48ab0acbee9b44bcdada
Marco Paniconi [Mon, 14 Jun 2021 22:02:52 +0000 (15:02 -0700)]
vp9-rtc: Refactor 1 pass vbr rate control
This refactoring is needed to allow the
RC_rtc library to support VBR.
Change-Id: I863a4a65096fed06b02307098febf7976360e0f3
James Zern [Fri, 11 Jun 2021 23:34:41 +0000 (16:34 -0700)]
Update some comments for rc_target_bitrate
this mirrors the change from libaom:
5b150b150 Update some comments for rc_target_bitrate
Change-Id: Iaabee5924e0320609a29dc8ab71327923fb4c5d2
James Zern [Wed, 9 Jun 2021 22:07:15 +0000 (15:07 -0700)]
simple_encode: fix some -Wsign-compare warnings
Bug: webm:1731
Change-Id: I1db777c0c3a8784fb3dcf7cd39f78ebf833ab915
James Zern [Sun, 6 Jun 2021 02:30:04 +0000 (19:30 -0700)]
simple_encode_test: fix input file path
this allows the file to be located in LIBVPX_TEST_DATA_PATH similar to
other test sources.
Bug: webm:1731
Change-Id: I51606635d91871e7c179aa8d20d4841b0d60b6ad
Cheng Chen [Thu, 27 May 2021 22:38:28 +0000 (15:38 -0700)]
L2E: properly init two pass rc parameters
Two pass rc parameters are only initialized in the second pass
in vp9 normal two pass encoding.
However, the simple_encode API queries the keyframe group, arf group,
and number of coding frames without going through the two pass
route.
Since recent libvpx rc changes, parameters in the TWO_PASS
struct have a great influence on the determination of the above
information.
We therefore need to properly init two pass rc parameters in
the simple_encode related environment.
Change-Id: Ie14b86d6e7ebf171b638d2da24a7fdcf5a15c3d9
Cheng Chen [Mon, 24 May 2021 22:53:06 +0000 (15:53 -0700)]
Fix simple encode
Properly init and delete cpi struct in simple encode functions.
Change-Id: I6e66bcac852cbb3dec9b754ba3fb01a348ac98b8
Chunbo Hua [Wed, 26 May 2021 09:02:07 +0000 (02:02 -0700)]
Fixed redundant wording for decoder algorithm interface
Change-Id: Id56e03dc9cf6d4e70c4681896f29893a9b4c76f2
James Zern [Tue, 25 May 2021 03:00:47 +0000 (03:00 +0000)]
Merge changes I2e86b005,I971c6261,I87fe4dad
* changes:
Use 'ptrdiff_t' instead of 'int' for pointer offset parameters
Implement vpx_convolve8_avg_vert_neon using SDOT instruction
Merge transpose and permute in Neon SDOT vertical convolution
James Zern [Tue, 25 May 2021 02:37:05 +0000 (02:37 +0000)]
Merge "img_alloc_helper: make align var unsigned"
Jonathan Wright [Mon, 24 May 2021 10:42:09 +0000 (11:42 +0100)]
Use 'ptrdiff_t' instead of 'int' for pointer offset parameters
A number of the load/store functions in mem_neon.h use type 'int' for
the 'stride' pointer offset parameter. This causes Clang to generate
the following warning every time these functions are called with a
wider type passed in for 'stride':
warning: implicit conversion loses integer precision: 'ptrdiff_t'
(aka 'long') to 'int' [-Wshorten-64-to-32]
This patch changes all such instances of 'int' to 'ptrdiff_t'.
Bug: b/181236880
Change-Id: I2e86b005219e1fbb54f7cf2465e918b7c077f7ee
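The effect of the parameter type can be shown with a small portable helper. example_sum_column() is hypothetical and stands in for the Neon load/store helpers in mem_neon.h, which are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Sums one column of a strided 8-bit image. Taking 'stride' as ptrdiff_t
 * lets callers pass a ptrdiff_t (or a negative stride) without an
 * implicit narrowing conversion to int. */
static uint32_t example_sum_column(const uint8_t *src, ptrdiff_t stride,
                                   int rows) {
  uint32_t sum = 0;
  int r;
  for (r = 0; r < rows; ++r) {
    sum += src[(ptrdiff_t)r * stride];
  }
  return sum;
}
```

A caller holding the stride as ptrdiff_t can now pass it straight through, which is the situation that triggered -Wshorten-64-to-32 when the parameter was an int.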
Jonathan Wright [Sun, 23 May 2021 12:35:15 +0000 (13:35 +0100)]
Implement vpx_convolve8_avg_vert_neon using SDOT instruction
Add an alternative AArch64 implementation of
vpx_convolve8_avg_vert_neon for targets that implement the Armv8.4-A
SDOT (signed dot product) instruction.
The existing MLA-based implementation of vpx_convolve8_avg_vert_neon
is retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.
Bug: b/181236880
Change-Id: I971c626116155e1384bff4c76fd3420312c7a15b
Jonathan Wright [Sat, 22 May 2021 21:07:25 +0000 (22:07 +0100)]
Merge transpose and permute in Neon SDOT vertical convolution
The original dot-product implementation of vpx_convolve8_vert_neon
used a separate transpose before and after the convolution operation.
This patch merges the first transpose with the TBL permute (necessary
before using SDOT to compute the convolution) to significantly reduce
the amount of data re-arrangement. This new approach also allows for
more effective data re-use between loop iterations.
Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>
Bug: b/181236880
Change-Id: I87fe4dadd312c3ad6216943b71a5410ddf4a1b5b
Jonathan Wright [Mon, 17 May 2021 09:53:07 +0000 (10:53 +0100)]
Implement vpx_convolve8_avg_horiz_neon using SDOT instruction
Add an alternative AArch64 implementation of
vpx_convolve8_avg_horiz_neon for targets that implement the Armv8.4-A
SDOT (signed dot product) instruction.
The existing MLA-based implementation of vpx_convolve8_avg_horiz_neon
is retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.
Bug: b/181236880
Change-Id: Ib435107c47c485f325248da87ba5618d68b0c8ed
Jonathan Wright [Wed, 12 May 2021 15:05:56 +0000 (16:05 +0100)]
Optimize remaining mse and sse functions in variance_neon.c
Implement sum of squared difference calculations in vpx_mse16x16_neon
and vpx_get4x4sse_cs_neon using the ABD and UDOT instructions -
instead of widening subtracts followed by a sequence of MLAs.
The existing implementation is retained for use on CPUs that do not
implement the Armv8.4-A UDOT instruction. This commit also updates
the variable names used in the existing implementations to be more
descriptive.
Bug: b/181236880
Change-Id: Id4ad8ea7c808af1ac9bb5f1b63327ab487e4b1c7
Jonathan Wright [Tue, 20 Apr 2021 11:03:56 +0000 (12:03 +0100)]
Implement vertical convolution using Neon SDOT instruction
Add an alternative AArch64 implementation of vpx_convolve8_vert_neon
for targets that implement the Armv8.4-A SDOT (signed dot product)
instruction.
The existing MLA-based implementation of vpx_convolve8_vert_neon is
retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.
Bug: b/181236880
Change-Id: Iebb8c77aba1d45b553b5112f3d87071fef3076f0
Jonathan Wright [Tue, 11 May 2021 12:17:44 +0000 (13:17 +0100)]
Implement Neon variance functions using UDOT instruction
Accelerate Neon variance functions by implementing the sum of squares
calculation using the Armv8.4-A UDOT instruction instead of 4 MLAs.
The previous implementation is retained for use on CPUs that do not
implement the Armv8.4-A dot product instructions.
Bug: b/181236880
Change-Id: I9ab3d52634278b9b6f0011f39390a1195210bc75
Jonathan Wright [Mon, 10 May 2021 11:22:03 +0000 (12:22 +0100)]
Use ABD and UDOT to implement Neon sad_4d functions
Implementing sad16_neon using ABD, UDOT instead of ABAL, ABAL2 saves
a cycle and removes resource contention for a single SIMD pipe on
modern out-of-order Arm CPUs. The UDOT accumulation into 32-bit
elements also allows for a faster reduction at the end of each SAD
function.
The existing implementation is retained for CPUs that do not
implement the Armv8.4-A UDOT instruction, and CPUs executing in
AArch32 mode.
Bug: b/181236880
Change-Id: Ibd0da46e86751d2f808c7b1e424f82b046a1aa6f
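A rough sketch of the ABD + UDOT pattern for one 16-byte row, with a scalar fallback so that it also builds where the dot-product extension is unavailable. example_sad16() is a hypothetical function, not the libvpx sad16_neon implementation, and the fallback here is plain C rather than the retained ABAL-based Neon path.

```c
#include <stdint.h>
#include <stdlib.h>

#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>

/* ABD yields 16 absolute differences; UDOT against a vector of ones adds
 * each group of four bytes into a 32-bit lane; ADDV sums the four lanes. */
static uint32_t example_sad16(const uint8_t *a, const uint8_t *b) {
  const uint8x16_t abs_diff = vabdq_u8(vld1q_u8(a), vld1q_u8(b));
  const uint32x4_t sum =
      vdotq_u32(vdupq_n_u32(0), abs_diff, vdupq_n_u8(1));
  return vaddvq_u32(sum);
}
#else
/* Portable fallback computing the same sum of absolute differences. */
static uint32_t example_sad16(const uint8_t *a, const uint8_t *b) {
  uint32_t sum = 0;
  int i;
  for (i = 0; i < 16; ++i) {
    sum += (uint32_t)abs((int)a[i] - (int)b[i]);
  }
  return sum;
}
#endif
```

Accumulating into 32-bit lanes is what makes the final reduction a single across-vector add.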
Jonathan Wright [Fri, 7 May 2021 12:25:51 +0000 (13:25 +0100)]
Optimize Neon reductions in sum_neon.h using ADDV instruction
Use the AArch64-only ADDV and ADDLV instructions to accelerate
reductions that add across a Neon vector in sum_neon.h. This commit
also refactors the inline functions to return a scalar instead of a
vector - allowing for optimization of the surrounding code at each
call site.
Bug: b/181236880
Change-Id: Ieed2a2dd3c74f8a52957bf404141ffc044bd5d79
James Zern [Sat, 8 May 2021 02:35:25 +0000 (19:35 -0700)]
img_alloc_helper: make align var unsigned
quiets an integer sanitizer warning:
vpx/src/vpx_image.c:101:25: runtime error: implicit conversion from
type 'int' of value -2 (32-bit, signed) to type 'unsigned int' changed
the value to 4294967294 (32-bit, unsigned)
Change-Id: Ifeac31cc80811081c1ba10aadaa94dc36cd46efa
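The warning comes from complementing a signed mask and then mixing it with an unsigned value. A sketch of the all-unsigned form follows; example_align_up() is a hypothetical helper rather than the code in vpx_image.c, and it assumes shift < 32 and that value + mask does not wrap.

```c
/* Rounds 'value' up to a multiple of (1 << shift). With an int mask of 1,
 * ~mask is -2 and is implicitly converted to 4294967294 when combined
 * with an unsigned operand; keeping the mask unsigned avoids that
 * sign-changing conversion. */
static unsigned int example_align_up(unsigned int value, unsigned int shift) {
  const unsigned int mask = (1u << shift) - 1;
  return (value + mask) & ~mask;
}
```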
Jonathan Wright [Thu, 6 May 2021 14:11:52 +0000 (15:11 +0100)]
Manually unroll the inner loop of Neon sad16x_4d()
Manually unrolling the inner loop is sufficient to stop the compiler
getting confused and emitting inefficient code.
Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>
Bug: b/181236880
Change-Id: I860768ce0e6c0e0b6286d3fc1b94f0eae95d0a1a
Jonathan Wright [Thu, 6 May 2021 13:51:05 +0000 (14:51 +0100)]
Optimize Neon SAD reductions using wider ADDP instruction
Implement AArch64-only paths for each of the Neon SAD reduction
functions, making use of a wider pairwise addition instruction only
available on AArch64.
This change removes the need for shuffling between high and low
halves of Neon vectors - resulting in a faster reduction that requires
fewer instructions.
Bug: b/181236880
Change-Id: I1c48580b4aec27222538eeab44e38ecc1f2009dc