platform/upstream/libvpx.git
James Zern [Tue, 18 Apr 2023 05:01:10 +0000 (22:01 -0700)]
mr_dissim: clear -Wshadow warning

Bug: webm:1793
Change-Id: I73ced43aba45215264134f917fd69ab0b1f10d01
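
For context on the warning these commits clear: -Wshadow fires when a declaration in an inner scope hides a variable of the same name from an enclosing scope. The sketch below is illustrative only; the function and variable names are hypothetical and are not taken from mr_dissim.c or any other libvpx file. The usual fix, shown here, is to rename or remove the inner declaration.

```c
#include <assert.h>

/* Hypothetical example, not libvpx code.
 *
 * A version that -Wshadow would flag:
 *
 *   const int x = v[i];
 *   if (x < 0) {
 *     int x = -v[i];   // declaration of 'x' shadows a previous local
 *     sum += x;
 *   }
 *
 * The fixed version gives the inner value its own name, so no declaration
 * hides another. */
int sum_abs_fixed(const int *v, int n) {
  int sum = 0;
  int i;
  for (i = 0; i < n; ++i) {
    const int x = v[i];
    const int abs_x = x < 0 ? -x : x; /* distinct name: nothing is hidden */
    sum += abs_x;
  }
  return sum;
}
```

Shadowed code is still valid C, which is why the warning is opt-in; the risk it guards against is updating or reading the wrong variable.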

Jerome Jiang [Fri, 14 Apr 2023 17:25:07 +0000 (17:25 +0000)]
Merge "Add VP8RateControlRTC::GetLoopfilterLevel" into main

James Zern [Fri, 14 Apr 2023 17:22:47 +0000 (17:22 +0000)]
Merge "libs.mk: Fix wrong scope end comments" into main

L. E. Segovia [Mon, 10 Apr 2023 22:08:54 +0000 (19:08 -0300)]
libs.mk: Fix wrong scope end comments

I believe the following comments are wrongly scoped, possibly left over
from previous changesets. This confused me when reading the test suite
Makefile in order to port it to Meson.

Change-Id: Ice3c7ba50c6909a9c7dfd4001afa1e1ddfa4b5ce

Jerome Jiang [Thu, 13 Apr 2023 20:52:51 +0000 (16:52 -0400)]
Add VP8RateControlRTC::GetLoopfilterLevel

New linear model to calculate loopfilter level from frame qp.

Linear regression was done on qvga, vga, and hd clips.

Bug: b/275304642
Change-Id: I552b312212bb4de21b53b762d139aa9588c64ae2
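
A minimal sketch of what fitting and applying such a linear model could look like. The commit does not give the regression coefficients or the training data, so none are reproduced here: fit_line and level_from_qp are hypothetical helpers, and any sample points passed to them are the caller's. The only codec fact relied on is that VP8 loop filter levels range from 0 to 63.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinary least-squares fit of y = slope * x + intercept.
 * Returns 0 on success, -1 if the fit is undefined (fewer than two points,
 * or all x values equal). Hypothetical helper, not the libvpx code. */
int fit_line(const double *x, const double *y, size_t n, double *slope,
             double *intercept) {
  double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0, denom;
  size_t i;
  if (n < 2) return -1;
  for (i = 0; i < n; ++i) {
    sx += x[i];
    sy += y[i];
    sxx += x[i] * x[i];
    sxy += x[i] * y[i];
  }
  denom = (double)n * sxx - sx * sx;
  if (denom == 0.0) return -1;
  *slope = ((double)n * sxy - sx * sy) / denom;
  *intercept = (sy - *slope * sx) / (double)n;
  return 0;
}

/* Evaluate the model for a frame qp, round to nearest, and clamp to the
 * VP8 loop filter level range [0, 63]. */
int level_from_qp(int qp, double slope, double intercept) {
  const double v = slope * qp + intercept;
  if (v <= 0.0) return 0;
  if (v >= 63.0) return 63;
  return (int)(v + 0.5);
}
```

Clamping after evaluation keeps the result valid even where the fitted line leaves the legal range at the extremes of qp.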

James Zern [Thu, 13 Apr 2023 20:59:43 +0000 (20:59 +0000)]
Merge "vp9_frame_scale_ssse3: clear -Wshadow warnings" into main

James Zern [Thu, 13 Apr 2023 18:35:49 +0000 (18:35 +0000)]
Merge changes I2a26c929,I0b7f0136,Ib65a2dff into main

* changes:
  vpxenc: clear -Wshadow warnings
  vpxdec: clear -Wshadow warnings
  svc_encodeframe: clear -Wshadow warnings

James Zern [Thu, 13 Apr 2023 18:35:21 +0000 (18:35 +0000)]
Merge changes I571a9d64,I22db73cb into main

* changes:
  dct_test: clear -Wshadow warnings
  convolve_test: clear -Wshadow warning

James Zern [Thu, 13 Apr 2023 18:35:10 +0000 (18:35 +0000)]
Merge "vp9_pickmode: clear -Wshadow warnings" into main

James Zern [Wed, 12 Apr 2023 21:52:44 +0000 (14:52 -0700)]
vpxenc: clear -Wshadow warnings

Bug: webm:1793
Change-Id: I2a26c9297016d3fa2c32e8974ef3d7dab1e524c4

James Zern [Wed, 12 Apr 2023 21:52:33 +0000 (14:52 -0700)]
vpxdec: clear -Wshadow warnings

Bug: webm:1793
Change-Id: I0b7f013682229cde50df7c62db9dab6eab0fd341

James Zern [Wed, 12 Apr 2023 21:02:15 +0000 (14:02 -0700)]
svc_encodeframe: clear -Wshadow warnings

Bug: webm:1793
Change-Id: Ib65a2dff124034d8e653572f8ada65984e55ed70

James Zern [Wed, 12 Apr 2023 21:00:11 +0000 (14:00 -0700)]
dct_test: clear -Wshadow warnings

Bug: webm:1793
Change-Id: I571a9d641b2f7f4b9d7c473ca815d4ea10b9f9af

James Zern [Wed, 12 Apr 2023 20:56:39 +0000 (13:56 -0700)]
convolve_test: clear -Wshadow warning

Bug: webm:1793
Change-Id: I22db73cb756c6c680b73684caef1e08bb6e729d8

James Zern [Wed, 12 Apr 2023 20:54:02 +0000 (13:54 -0700)]
vp9_frame_scale_ssse3: clear -Wshadow warnings

Bug: webm:1793
Change-Id: I85608ac7bb6d3a61649ba342c13c3bf6a39a5dea

James Zern [Wed, 12 Apr 2023 20:46:26 +0000 (13:46 -0700)]
vp9_temporal_filter: clear -Wshadow warnings

Bug: webm:1793
Change-Id: Ia681ce636ae99f95b875ee1b0189bc6fa66a7608

James Zern [Wed, 12 Apr 2023 20:41:46 +0000 (13:41 -0700)]
vp9_svc_layercontext: clear -Wshadow warnings

Bug: webm:1793
Change-Id: I63669de9835713ec70dafa88ca8f2c2459e59698

James Zern [Wed, 12 Apr 2023 20:31:35 +0000 (13:31 -0700)]
vp9_pickmode: clear -Wshadow warnings

Bug: webm:1793
Change-Id: I26c063818144d11c4c91165c3fcbf6f258453cc7

James Zern [Wed, 12 Apr 2023 02:27:03 +0000 (19:27 -0700)]
vp9_speed_features: clear -Wshadow warning

Bug: webm:1793
Change-Id: I9f509c4461631e358f80b98afbb745ce88e9d7a2

James Zern [Wed, 12 Apr 2023 02:23:27 +0000 (19:23 -0700)]
vp9_ratectrl: clear -Wshadow warnings

Bug: webm:1793
Change-Id: I2476a9d8e1d62414fdbe6feee87d5167058f499b

James Zern [Wed, 12 Apr 2023 02:16:28 +0000 (19:16 -0700)]
vp9_mbgraph: clear -Wshadow warnings

Bug: webm:1793
Change-Id: Ibffb62775f09922d37f7d0460aa2751e74c36738

James Zern [Tue, 11 Apr 2023 18:40:00 +0000 (18:40 +0000)]
Merge "vp9_quantize_avx2,highbd_get_max_lane_eob: fix mask" into main

Yunqing Wang [Tue, 11 Apr 2023 18:35:10 +0000 (18:35 +0000)]
Merge "Add assert to ensure NEARESTMV or NEWMV modes are not skipped" into main

Yunqing Wang [Tue, 11 Apr 2023 18:31:25 +0000 (18:31 +0000)]
Merge "Avoid redundant start MV SAD calculation" into main

Cherma Rajan A [Tue, 11 Apr 2023 09:20:18 +0000 (14:50 +0530)]
Add assert to ensure NEARESTMV or NEWMV modes are not skipped

Added an assert for the prune_single_mode_based_on_mv_diff_mode_rate
speed feature. This ensures NEARMV or ZEROMV modes are pruned
only when NEARESTMV and NEWMV modes are not terminated early.

Change-Id: Id8b03eef6d1ef3f16714a9cbfde0c171c0c6fe0b

Deepa K G [Mon, 3 Apr 2023 17:51:56 +0000 (23:21 +0530)]
Avoid redundant start MV SAD calculation

Avoided repeated calculation of start MV
SAD during full pixel motion search.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.162
 0       MIDRES2      0.246
 0        HDRES2      0.325
 0       Average      0.245

Change-Id: I2b4786901f254ce32ee8ca8a3d56f1c9f112f1d4

James Zern [Mon, 10 Apr 2023 20:29:02 +0000 (13:29 -0700)]
vp9_quantize_avx2,highbd_get_max_lane_eob: fix mask

Pack nz_mask with zero. After the result is permuted this has the effect
of ignoring the upper half of the iscan register, which is only loaded
with 128 bits. Depending on the optimization level and the load used, the
upper half of the ymm register may contain undefined values, which can
produce an incorrect eob. If this is large enough it can cause a crash.

Bug: chromium:1431729
Change-Id: I4ebae9fa39f228bdd29dcc19935f3f07759d75f5

Yunqing Wang [Mon, 10 Apr 2023 18:50:09 +0000 (18:50 +0000)]
Merge "Add AVX2 intrinsic for variance function for block width 8" into main

Yunqing Wang [Mon, 10 Apr 2023 17:01:19 +0000 (17:01 +0000)]
Merge "Prune single ref modes based on mv difference and mode rate" into main

James Zern [Fri, 7 Apr 2023 22:19:18 +0000 (22:19 +0000)]
Merge "Optimize Armv8.0 Neon SAD4D 16xh, 32xh, and 64xh functions" into main

James Zern [Thu, 6 Apr 2023 20:00:47 +0000 (13:00 -0700)]
vp9_dx_iface: clear -Wshadow warnings

Bug: webm:1793
Change-Id: Ice6cd08f145e5813e24345d03e0913e5eda5289f

James Zern [Thu, 6 Apr 2023 20:00:07 +0000 (13:00 -0700)]
vp9_encoder: clear -Wshadow warning

Bug: webm:1793
Change-Id: Id390c61f82b9f15063d0310a2c252b02b479d9c5

James Zern [Thu, 6 Apr 2023 19:57:23 +0000 (12:57 -0700)]
vpx_subpixel_8t_intrin_avx2: clear -Wshadow warning

Bug: webm:1793
Change-Id: Icba4ad242dcd0cad736b9a203829361c5bd1ca3f

James Zern [Thu, 6 Apr 2023 17:50:00 +0000 (17:50 +0000)]
Merge "Optimize Neon paths of high bitdepth SAD and SAD4d for 8xh blocks" into main

Jonathan Wright [Thu, 6 Apr 2023 15:14:51 +0000 (16:14 +0100)]
Optimize Armv8.0 Neon SAD4D 16xh, 32xh, and 64xh functions

Add a widening 4D reduction function operating on uint16x8_t vectors
and use it to optimize the final reduction in Armv8.0 Neon standard
bitdepth 16xh, 32xh and 64xh SAD4D computations.

Also simplify the Armv8.0 Neon version of the sad64xhx4d_neon helper
function since VP9 block sizes are not large enough to require
widening to 32-bit accumulators before the final reduction.

Change-Id: I32b0a283d7688d8cdf21791add9476ed24c66a28
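
To make "widening 4D reduction" concrete, here is a scalar stand-in, not the Neon code from the change. It assumes four accumulators of eight 16-bit lanes each, stored back to back, and sums each into a 32-bit total. reduce_4d_u16x8 is a hypothetical name; a Neon version would use widening pairwise adds rather than a loop.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of a widening 4D reduction. `acc` holds four groups of eight
 * 16-bit lanes (32 values in total); out[i] receives the sum of group i.
 * The totals are 32-bit because eight lanes of up to 65535 each can reach
 * 524280, which does not fit in 16 bits. */
void reduce_4d_u16x8(const uint16_t *acc, uint32_t out[4]) {
  int i, lane;
  for (i = 0; i < 4; ++i) {
    uint32_t sum = 0;
    for (lane = 0; lane < 8; ++lane) sum += acc[i * 8 + lane];
    out[i] = sum;
  }
}
```

The per-lane accumulation can stay in 16 bits as long as each lane's running sum fits; only this final cross-lane step needs the wider type.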

Jonathan Wright [Tue, 4 Apr 2023 13:52:52 +0000 (14:52 +0100)]
Optimize 4D Neon reduction for 4xh and 8xh SAD4D blocks

Add a 4D reduction function operating on uint16x8_t vectors and use
it to optimize the final reduction in standard bitdepth 4xh and 8xh
SAD4D computations. Similar 4D reduction optimizations have already
been implemented for all other standard bitdepth block sizes, and all
high bitdepth block sizes.[1]

[1] https://chromium-review.googlesource.com/c/webm/libvpx/+/4224681

Change-Id: I0aa0b6e0f70449776f316879cafc4b830e86ea51

Anupam Pandey [Tue, 28 Mar 2023 09:18:46 +0000 (14:48 +0530)]
Add AVX2 intrinsic for variance function for block width 8

Added AVX2 intrinsic optimization for the following functions
1. vpx_variance8x4
2. vpx_variance8x8
3. vpx_variance8x16

This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.698
 0       MIDRES2      0.577
 0        HDRES2      0.469
 0       Average      0.582

Change-Id: Iae8fdf9344fd012cda4955ed140633141d60ba86

James Zern [Thu, 30 Mar 2023 22:44:51 +0000 (22:44 +0000)]
Merge changes Idaf49de6,I6d7d96ff,I0d64c923 into main

* changes:
  svc_datarate_test: clear -Wshadow warning
  vp9_mcomp.c: clear -Wshadow warnings
  vp9_rc_get_second_pass_params: clear -Wshadow warning

Cherma Rajan A [Wed, 8 Mar 2023 12:20:06 +0000 (17:50 +0530)]
Prune single ref modes based on mv difference and mode rate

This patch introduces a speed feature to prune the single reference
modes NEARMV and ZEROMV based on motion vector difference and
mode rate w.r.t. previously evaluated single reference modes
corresponding to the same reference frame.

                Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      1.686        -0.0039    -0.0105   -0.0098
 0       MIDRES2      1.026        -0.0234     0.0029    0.0120
 0        HDRES2      0.000         0.0000     0.0000    0.0000
 0       Average      0.889        -0.0091    -0.0025    0.0007

STATS_CHANGED

Change-Id: I387acd3a73d8256904a7ce684b198d251cf3dd04

George Steed [Tue, 28 Mar 2023 14:49:37 +0000 (14:49 +0000)]
Avoid vshr and vget_{low,high} in Neon d135 predictor impl

The shift instructions have marginally worse performance on some
micro-architectures, and the vget_{low,high} instructions are
unnecessary.

This commit improves performance of the d135 predictors by 1.5% geomean
averaged across a range of compilers and micro-architectures.

Change-Id: Ied4c3eecc12fc973841696459d868ce403ed4e6c

George Steed [Mon, 27 Mar 2023 08:47:58 +0000 (08:47 +0000)]
Use sum_neon.h helpers in Neon DC predictors

Use sum_neon.h helpers for horizontal reductions in Neon DC predictors,
enabling use of dedicated Neon reduction instructions on AArch64. Some
of the surrounding code is also optimized to remove redundant broadcast
instructions in the dc_store helpers.

Performance is largely unchanged on both the standard and the high
bit-depth predictors. The main improvement appears to be the 16x16
standard-bitdepth dc predictor, which improves by 10-15% when
benchmarked on Neoverse N1.

Change-Id: Ibfcc6ecf4b1b2f87ce1e1f63c314d0cc35a0c76f

James Zern [Wed, 29 Mar 2023 20:52:46 +0000 (20:52 +0000)]
Merge changes Ie4ffa298,If5ec220a,I670dc379 into main

* changes:
  Avoid LD2/ST2 instructions in highbd v predictors in Neon
  Avoid interleaving loads/stores in Neon for highbd dc predictor
  Avoid LD2/ST2 instructions in vpx_dc_predictor_32x32_neon

Jerome Jiang [Wed, 29 Mar 2023 18:24:18 +0000 (18:24 +0000)]
Merge "svc: Fix a case where target bandwidth is 0" into main

Jerome Jiang [Wed, 29 Mar 2023 17:06:19 +0000 (13:06 -0400)]
svc: Fix a case where target bandwidth is 0

Bug: webrtc:15033
Change-Id: Iea2997c2ce8982f106a1eed3ec4f7dd1c6e83666

Salome Thirot [Mon, 27 Mar 2023 13:31:40 +0000 (14:31 +0100)]
Optimize Neon paths of high bitdepth SAD and SAD4d for 8xh blocks

For these block sizes there is no need to widen to 32 bits until the
final reduction, so use a single vabaq instead of vabd + vpadalq.

Change-Id: I9c19d620f7bb8b3a6b0bedd37789c03bb628b563
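
The arithmetic behind this can be checked with a scalar model. The sketch below assumes samples of at most 12 bits and a block height of at most 16 rows (8x16 being the tallest 8-wide VP9 block), and that an 8-wide row maps to one vector of eight 16-bit lanes, so each lane receives one absolute difference per row. sad_lane_u16 is a hypothetical helper that mimics a single lane of an absolute-difference-accumulate; it is not the Neon implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 16-bit lane accumulating absolute differences down a
 * column. `stride` is in samples. The accumulator wraps modulo 65536 if the
 * true sum does not fit, which is exactly what must not happen. */
uint16_t sad_lane_u16(const uint16_t *src, const uint16_t *ref, int stride,
                      int height) {
  uint16_t acc = 0;
  int r;
  for (r = 0; r < height; ++r) {
    const uint16_t s = src[r * stride];
    const uint16_t p = ref[r * stride];
    acc = (uint16_t)(acc + (s > p ? s - p : p - s));
  }
  return acc;
}
```

Under those assumptions the worst case per lane is 4095 * 16 = 65520, which is below 65535, so the 16-bit accumulator cannot overflow before the final reduction.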

George Steed [Wed, 22 Mar 2023 11:49:33 +0000 (11:49 +0000)]
Avoid LD2/ST2 instructions in highbd v predictors in Neon

The interleaving load/store instructions (LD2/LD3/LD4 and ST2/ST3/ST4)
are useful if we are dealing with interleaved data (e.g. real/imag
components of complex numbers), but for loading or storing larger
quantities of data it is preferable to use the normal load/store
instructions.

This patch replaces such occurrences in the two larger block sizes:
vpx_highbd_v_predictor_16x16_neon and vpx_highbd_v_predictor_32x32_neon.

Change-Id: Ie4ffa298a2466ceaf893566fd0aefe3f66f439e4
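
A scalar model may help show why the interleaving forms are unnecessary here. The helpers below are hypothetical and only mimic the data movement of the instructions, assuming eight 16-bit lanes per register: a de-interleaving load followed by the matching interleaving store puts every element back where it started, so for a plain copy the result is identical to two ordinary loads and stores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 8 }; /* eight 16-bit lanes per 128-bit register */

/* Scalar model of a 2-way de-interleaving load (LD2): even-indexed elements
 * land in `a`, odd-indexed elements in `b`. */
void load_deinterleave2(const uint16_t *src, uint16_t *a, uint16_t *b) {
  int i;
  for (i = 0; i < LANES; ++i) {
    a[i] = src[2 * i];
    b[i] = src[2 * i + 1];
  }
}

/* Scalar model of the matching interleaving store (ST2). */
void store_interleave2(uint16_t *dst, const uint16_t *a, const uint16_t *b) {
  int i;
  for (i = 0; i < LANES; ++i) {
    dst[2 * i] = a[i];
    dst[2 * i + 1] = b[i];
  }
}

/* Scalar model of two ordinary loads followed by two ordinary stores. */
void copy_contiguous2(uint16_t *dst, const uint16_t *src) {
  uint16_t a[LANES], b[LANES];
  memcpy(a, src, sizeof(a));
  memcpy(b, src + LANES, sizeof(b));
  memcpy(dst, a, sizeof(a));
  memcpy(dst + LANES, b, sizeof(b));
}
```

Since both paths produce the same bytes, the choice between them is purely a performance question, which is what the commit addresses.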

George Steed [Wed, 22 Mar 2023 08:44:26 +0000 (08:44 +0000)]
Avoid interleaving loads/stores in Neon for highbd dc predictor

The interleaving load/store instructions (LD2/LD3/LD4 and ST2/ST3/ST4)
are useful if we are dealing with interleaved data (e.g. real/imag
components of complex numbers), but for loading or storing larger
quantities of data it is preferable to use two or more of the
normal load/store instructions.

This patch replaces such occurrences in the two larger block sizes:
vpx_highbd_dc_predictor_16x16_neon, vpx_highbd_dc_predictor_32x32_neon,
and related helper functions.

Speedups over the original Neon code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 | 16x16 |    1.25
Neoverse N1 |  LLVM 15 | 32x32 |    1.13
Neoverse N1 |   GCC 12 | 16x16 |    1.56
Neoverse N1 |   GCC 12 | 32x32 |    1.52
Neoverse V1 |  LLVM 15 | 16x16 |    1.63
Neoverse V1 |  LLVM 15 | 32x32 |    1.08
Neoverse V1 |   GCC 12 | 16x16 |    1.59
Neoverse V1 |   GCC 12 | 32x32 |    1.37

Change-Id: If5ec220aba9dd19785454eabb0f3d6affec0cc8b

George Steed [Tue, 21 Mar 2023 14:31:50 +0000 (14:31 +0000)]
Avoid LD2/ST2 instructions in vpx_dc_predictor_32x32_neon

The LD2 and ST2 instructions are useful if we are dealing with
interleaved data (e.g. real/imag components of complex numbers), but for
loading or storing larger quantities of data it is preferable to use
two of the normal load/store instructions.

This patch replaces such occurrences in vpx_dc_predictor_32x32_neon and
related functions.

With Clang-15 this speeds up the function by 10-30%, and with GCC-12 by
40-60%, depending on the micro-architecture benchmarked.

Change-Id: I670dc37908aa238f360104efd74d6c2108ecf945

Yunqing Wang [Tue, 28 Mar 2023 22:14:51 +0000 (22:14 +0000)]
Merge "Add AVX2 for convolve vertical filter for block width 4" into main

James Zern [Tue, 28 Mar 2023 20:14:12 +0000 (20:14 +0000)]
Merge changes If83ff1ad,I8fb00a15,Iaad58e77,Iac166d60 into main

* changes:
  Randomize second half of above_row_ in intrapred tests for Neon
  Allow non-uniform above array in d63 predictor Neon impl
  Allow non-uniform above array in d45 predictor Neon impl
  Allow non-uniform above array in highbd d45 predictor Neon impl

James Zern [Tue, 28 Mar 2023 18:36:01 +0000 (18:36 +0000)]
Merge "update libwebm to libwebm-1.0.0.29-9-g1930e3c" into main

Jerome Jiang [Tue, 28 Mar 2023 14:09:16 +0000 (10:09 -0400)]
svc: Fix a case where target bandwidth is 0

Bug: webrtc:15033
Change-Id: I28636de66842671b03284408186c4c18254109a5

George Steed [Fri, 17 Mar 2023 20:00:24 +0000 (20:00 +0000)]
Randomize second half of above_row_ in intrapred tests for Neon

The existing tests duplicate `above_row_[block_size - 1]` after the
first `block_size` elements, which can lead to tests incorrectly passing
due to differing behaviour when calculating the average for the last
elements of the output.

This change adjusts the above array setup to be fully random instead,
allowing us to catch such issues here rather than in other larger tests
like the external MD5 tests.

It doesn't appear that other architectures are fully clean with this
change so restrict it to just Neon for now until they are fixed.

Bug: webm:1797
Change-Id: If83ff1adbf1e8d30f2a92474d7186c65840a5d0b
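
A toy illustration of how such a test setup can hide a mismatch. This is not libvpx's predictor code: the block size, both functions, and the 3-tap rounded average are stand-ins chosen for the sketch. One function reads into the second half of the above array; the other never does and silently assumes every element there equals above[BS - 1]. The two agree whenever the input is padded that way and disagree once it is not.

```c
#include <assert.h>
#include <stdint.h>

#define BS 4 /* toy block size; `above` holds 2 * BS elements */

/* 3-tap rounded average. */
static int avg3(int a, int b, int c) { return (a + 2 * b + c + 2) >> 2; }

/* Reference: reads up to above[BS + 1], i.e. into the second half. */
void smooth_row_ref(const uint8_t *above, uint8_t *out) {
  int x;
  for (x = 0; x < BS; ++x) {
    out[x] = (uint8_t)avg3(above[x], above[x + 1], above[x + 2]);
  }
}

/* Variant that substitutes above[BS - 1] for anything past the first half. */
void smooth_row_assume_padded(const uint8_t *above, uint8_t *out) {
  int x;
  for (x = 0; x < BS; ++x) {
    const int b = x + 1 < BS ? above[x + 1] : above[BS - 1];
    const int c = x + 2 < BS ? above[x + 2] : above[BS - 1];
    out[x] = (uint8_t)avg3(above[x], b, c);
  }
}
```

With a padded input such as {10, 20, 30, 40, 40, 40, 40, 40} a comparison test passes; with distinct values in the second half the outputs diverge, which is the kind of discrepancy the randomized setup is meant to expose.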

George Steed [Fri, 17 Mar 2023 19:55:17 +0000 (19:55 +0000)]
Allow non-uniform above array in d63 predictor Neon impl

The existing standard bitdepth implementation doesn't appear to fail any
of the predictor or MD5 tests, but it does rely on the predictor tests
filling the second `bs` elements of the `above` input array with copies
of `above[bs - 1]` in order to match the C implementation.

This patch adjusts the Neon implementation to correctly match the C
implementation in the case where the elements of the `above` array all
differ.

The geomean of performance for the predictor is approximately a 2%
slowdown compared to the previous vectorized implementation. This is
still considerably faster than the unspecialized naive C implementation.

Bug: webm:1797
Change-Id: I8fb00a154288d54b24a72a7ff63c816bdcf3aca3

George Steed [Fri, 17 Mar 2023 17:59:26 +0000 (17:59 +0000)]
Allow non-uniform above array in d45 predictor Neon impl

The existing implementation doesn't appear to fail any of the predictor
or MD5 tests, but it does rely on the predictor tests filling the second
`bs` elements of the `above` input array with copies of `above[bs - 1]`
in order to match the C implementation.

This patch adjusts the Neon implementation to correctly match the C
implementation in the case where the elements of the `above` array all
differ.

Performance of the predictor is mostly unchanged, except for the 32x32
block size where it appears to have gotten about 40% faster when
compiled with clang-15.

Bug: webm:1797
Change-Id: Iaad58e77c5467307a3c80d6989b7cf2988e09311

George Steed [Thu, 9 Mar 2023 23:46:31 +0000 (23:46 +0000)]
Allow non-uniform above array in highbd d45 predictor Neon impl

The existing implementation doesn't appear to fail any of the predictor
or MD5 tests, but it does rely on the predictor tests filling the second
`bs` elements of the `above` input array with copies of `above[bs - 1]`
in order to match the C implementation.

This patch adjusts the Neon implementation to correctly match the C
implementation in the case where the elements of the `above` array all
differ.

Performance of the predictor is mostly unchanged, except for the 16x16
block size where it appears to have gotten marginally faster across most
compiler/micro-architecture combinations.

Bug: webm:1797
Change-Id: Iac166d6047316c0382e0f2790ce780fc99674b43

Anupam Pandey [Tue, 21 Mar 2023 07:30:25 +0000 (13:00 +0530)]
Add AVX2 for convolve vertical filter for block width 4

Introduced AVX2 intrinsic to compute convolve vertical for
w = 4 case. This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.364
 0       MIDRES2      0.236
 0        HDRES2      0.162
 0       Average      0.254

Change-Id: I413f58aa6333a6f2421d4c10d49dec01e55b2098

James Zern [Tue, 7 Mar 2023 23:29:37 +0000 (15:29 -0800)]
vp9_rdopt,block_rd_txfm: fix clang-tidy warning

argument name 'recon' in comment does not match parameter name
'out_recon'.

https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html

+ normalize similar calls, using /*var=*/NULL to better match the style
  guidelines

https://google.github.io/styleguide/cppguide.html#Function_Argument_Comments

Change-Id: I089591317f7138965735f737c1536a8b16fcd4e4
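
A small sketch of the argument-comment convention the check enforces. block_cost and cost_only are hypothetical functions; only the parameter name out_recon is taken from the warning text above. The comment must spell the parameter name exactly and end with `=`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function: sums absolute coefficient values and, if
 * `out_recon` is non-NULL, also copies the coefficients there. */
int block_cost(const int *coeffs, int n, int *out_recon) {
  int i, cost = 0;
  for (i = 0; i < n; ++i) {
    cost += coeffs[i] < 0 ? -coeffs[i] : coeffs[i];
    if (out_recon != NULL) out_recon[i] = coeffs[i];
  }
  return cost;
}

int cost_only(const int *coeffs, int n) {
  /* The comment names the parameter exactly, so the check accepts it.
   * Writing recon= instead would reproduce the reported mismatch. */
  return block_cost(coeffs, n, /*out_recon=*/NULL);
}
```

The comment costs nothing at run time; its value is that a bare NULL at a call site no longer needs a trip to the declaration to understand.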

James Zern [Fri, 24 Mar 2023 02:28:48 +0000 (19:28 -0700)]
svc_datarate_test: clear -Wshadow warning

rename class member from ref_frame_config to the correct style:
ref_frame_config_.

Bug: webm:1793
Change-Id: Idaf49de6d724014adee75f81efe974b2031241ba

James Zern [Wed, 22 Feb 2023 23:47:10 +0000 (15:47 -0800)]
vp9_mcomp.c: clear -Wshadow warnings

Bug: webm:1793
Change-Id: I6d7d96ffb3e388eac94d1d41563f7079a8297c85

James Zern [Wed, 22 Feb 2023 23:36:03 +0000 (15:36 -0800)]
vp9_rc_get_second_pass_params: clear -Wshadow warning

Bug: webm:1793
Change-Id: I0d64c9234b4bdcfb49a06566dc41df26f5862c1f

James Zern [Fri, 24 Mar 2023 18:04:19 +0000 (18:04 +0000)]
Merge changes Ide512788,I77c7abae into main

* changes:
  vp9_scan.h: rename scan_order struct to ScanOrder
  vp9_encodeframe.c: clear -Wshadow warnings

James Zern [Wed, 22 Feb 2023 23:16:43 +0000 (15:16 -0800)]
vp9_scan.h: rename scan_order struct to ScanOrder

This matches the style guide and fixes some -Wshadow warnings related to
variables with the same name. Something similar was done in libaom in:
03f6fdcfca Fix warnings reported by -Wshadow: Part1b: scan_order struct
           and variable

Bug: webm:1793
Change-Id: Ide5127886b7fd7778e6d8a983bfba6edda21ff28

James Zern [Wed, 22 Feb 2023 21:53:49 +0000 (13:53 -0800)]
vp9_encodeframe.c: clear -Wshadow warnings

Bug: webm:1793
Change-Id: I77c7abae7bbb1e1f4972cd31e3a67d62477b896e

James Zern [Fri, 24 Mar 2023 02:02:12 +0000 (19:02 -0700)]
update libwebm to libwebm-1.0.0.29-9-g1930e3c

changelog:
https://chromium.googlesource.com/webm/libwebm/+log/ee0bab576..1930e3ca2

Bug: webm:1792
Change-Id: I5c5c30c767d357528f102ff38957655e2ec0c645

Wan-Teh Chang [Mon, 20 Mar 2023 23:05:11 +0000 (16:05 -0700)]
Fix comment typos (likely copy-and-paste errors)

Fix comment typos for vpx_codec_destroy() and vpx_codec_enc_init_ver().

Based on the change made in libaom:
https://aomedia.googlesource.com/aom/+/365a968684
365a968684 Fix comment typos (likely copy-and-paste errors)

Change-Id: I39edae835ed0752b569e8e7328d0709c59724ac2

James Zern [Thu, 23 Mar 2023 21:40:13 +0000 (21:40 +0000)]
Merge "Add Neon implementations of vpx_highbd_avg_<w>x<h>_c" into main

James Zern [Thu, 23 Mar 2023 17:22:28 +0000 (17:22 +0000)]
Merge "test.mk: use CONFIG_VP(8|9)_ENCODER for vp8/vp9-only tests" into main

James Zern [Thu, 23 Mar 2023 17:21:57 +0000 (17:21 +0000)]
Merge "svc_encodeframe.c: fix -Wstringop-truncation" into main

Jerome Jiang [Wed, 22 Mar 2023 20:48:44 +0000 (20:48 +0000)]
Merge "Revert "Add codec control to get tpl stats"" into main

Jerome Jiang [Wed, 22 Mar 2023 20:18:39 +0000 (20:18 +0000)]
Revert "Add codec control to get tpl stats"

This reverts commit 9c15fb62b3dfe1c698dc28f9efedb022b0ef8eb8.

Reason for revert:

vpxenc should only use public interface

Original change's description:
> Add codec control to get tpl stats
>
> Add command line flag to vpxenc to export tpl stats
>
> Bug: b/273736974
> Change-Id: I6980096531b0c12fbf7a307fdef4c562d0c29e32

Bug: b/273736974
Change-Id: Ifa8951bb34e5936bbfc33086b22e9fc36d379bc9

Wan-Teh Chang [Wed, 22 Mar 2023 16:09:24 +0000 (16:09 +0000)]
Merge "Change UpdateRateControl() to return bool" into main

Salome Thirot [Fri, 10 Mar 2023 16:30:36 +0000 (16:30 +0000)]
Add Neon implementations of vpx_highbd_avg_<w>x<h>_c

Add Neon implementation of vpx_highbd_avg_4x4_c and vpx_highbd_avg_8x8_c
as well as the corresponding tests.

Change-Id: Ib1b06af5206774347690c9c56e194b76aa409c91

James Zern [Wed, 22 Mar 2023 02:14:12 +0000 (02:14 +0000)]
Merge changes I8abac3c9,If678fc19 into main

* changes:
  vp9_bitstream.c: clear -Wshadow warnings
  vp9_setup_mask: clear -Wshadow warnings

James Zern [Tue, 21 Mar 2023 20:20:51 +0000 (20:20 +0000)]
Merge changes I650b305c,If3e4cf37,I4c791e3a into main

* changes:
  sixtappredict_neon.c: remove redundant returns
  sixtappredict_neon.c,cosmetics: fix a typo
  vp8_sixtap_predict16x16_neon: fix overread

Jerome Jiang [Tue, 21 Mar 2023 18:34:34 +0000 (18:34 +0000)]
Merge "Add codec control to get tpl stats" into main

James Zern [Tue, 21 Mar 2023 00:33:00 +0000 (00:33 +0000)]
Merge "Reland "quantize: use scan_order instead of passing scan/iscan"" into main

James Zern [Tue, 21 Mar 2023 00:28:11 +0000 (17:28 -0700)]
test.mk: use CONFIG_VP(8|9)_ENCODER for vp8/vp9-only tests

fixes some uninstantiated test failures when configured with
--disable-vp8 or --disable-vp9

Change-Id: If9a6705bd070edee02306e89da103ed474688ec8

James Zern [Tue, 21 Mar 2023 00:09:42 +0000 (17:09 -0700)]
svc_encodeframe.c: fix -Wstringop-truncation

use sizeof(buf) - 1 with strncpy.

fixes:
examples/svc_encodeframe.c:282:3: warning: ‘strncpy’ specified bound
1024 equals destination size [-Wstringop-truncation]
  282 |   strncpy(si->options, options, sizeof(si->options));
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Change-Id: I46980872f9865ae1dc2b56330c3a65d8bc6cf1f7
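
A minimal sketch of the sizeof(buf) - 1 pattern, using a hypothetical struct with a small buffer rather than the real one (the warning above reports a 1024-byte destination). The explicit terminator on the last byte is an assumption of this sketch; it is needed unless the buffer is already known to be zero-filled.

```c
#include <assert.h>
#include <string.h>

typedef struct {
  char options[16]; /* hypothetical size, kept small for the example */
} OptionsHolder;

void set_options(OptionsHolder *h, const char *options) {
  /* Copy at most sizeof - 1 bytes so the final byte stays free for '\0'.
   * With a bound equal to the full size, strncpy leaves the buffer
   * unterminated whenever the source is that long or longer. */
  strncpy(h->options, options, sizeof(h->options) - 1);
  h->options[sizeof(h->options) - 1] = '\0';
}
```

For example, passing a 19-character string leaves the first 15 characters in the buffer followed by the terminator, while shorter strings are copied whole.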

James Zern [Mon, 20 Mar 2023 23:58:28 +0000 (16:58 -0700)]
sixtappredict_neon.c: remove redundant returns

Change-Id: I650b305c2599fc32353daba030e6241d330796a7

James Zern [Mon, 20 Mar 2023 23:56:58 +0000 (16:56 -0700)]
sixtappredict_neon.c,cosmetics: fix a typo

Change-Id: If3e4cf372fc6ed076f0d42c435a72262494aab68

James Zern [Mon, 20 Mar 2023 23:43:47 +0000 (16:43 -0700)]
vp8_sixtap_predict16x16_neon: fix overread

Shift the final read from the source by 3 to avoid breaking the
assumption that the 6-tap filter needs only 5 pixels outside of the
macroblock; this matches the sse2 and ssse3 implementations.

It's possible this restriction could be removed if the source buffers
are assumed to be padded.

Bug: webm:1795
Change-Id: I4c791e3a214898a503c78f4cedca154c75cdbaef
Fixed: webm:1795

Yunqing Wang [Mon, 20 Mar 2023 16:35:44 +0000 (16:35 +0000)]
Merge "Skip trellis coeff opt based on tx block properties" into main

Yunqing Wang [Mon, 20 Mar 2023 16:27:53 +0000 (16:27 +0000)]
Merge "Refactor logic of skipping trellis coeff opt" into main

Jerome Jiang [Fri, 17 Mar 2023 18:34:42 +0000 (14:34 -0400)]
Add codec control to get tpl stats

Add command line flag to vpxenc to export tpl stats

Bug: b/273736974
Change-Id: I6980096531b0c12fbf7a307fdef4c562d0c29e32

Deepa K G [Thu, 2 Mar 2023 08:09:55 +0000 (13:39 +0530)]
Skip trellis coeff opt based on tx block properties

The trellis coefficient optimization is skipped for blocks
with larger residual mse.

                 Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      9.467        0.0921     0.1057    0.0362
 0       MIDRES2      4.328       -0.0155     0.0694    0.0178
 0        HDRES2      1.858        0.0231     0.0214   -0.0034
 0       Average      5.218        0.0332     0.0655    0.0169

STATS_CHANGED

Change-Id: I321a9b1a34ebb59b7b6a065b5b2d717c8767a4a5

Deepa K G [Thu, 2 Mar 2023 08:09:55 +0000 (13:39 +0530)]
Refactor logic of skipping trellis coeff opt

The code to enable trellis coefficient optimization
is refactored using the sf 'trellis_opt_tx_rd'. This
change facilitates adaptive skipping of trellis
optimization based on block properties.

Change-Id: Ia1ff7cbbe5acf86414410f62655d46c099387847

James Zern [Wed, 22 Feb 2023 21:29:20 +0000 (13:29 -0800)]
vp9_bitstream.c: clear -Wshadow warnings

Bug: webm:1793
Change-Id: I8abac3c901ad24b642b39ea6e6081d8ba626853d

James Zern [Wed, 22 Feb 2023 21:22:08 +0000 (13:22 -0800)]
vp9_setup_mask: clear -Wshadow warnings

Bug: webm:1793
Change-Id: If678fc195ef87cc634d31fb7b24e0c844a5cb7b0

16 months agoReland "quantize: use scan_order instead of passing scan/iscan"
Johann [Mon, 14 Nov 2022 07:47:33 +0000 (16:47 +0900)]
Reland "quantize: use scan_order instead of passing scan/iscan"

This is a reland of commit 14fc40040ff30486c45111056db44ee18590a24a

Parent change fixed in crrev.com/c/webm/libvpx/+/4305500

Original change's description:
> quantize: use scan_order instead of passing scan/iscan
>
> further reduces the arguments for the 32x32. This will be applied to the base
> version as well.
>
> Change-Id: I25a162b5248b14af53d9e20c6a7fa2a77028a6d1

Change-Id: I2a7654558eaddd68bd09336bf317b297f18559d2

16 months agoMerge changes I5d9444a2,I1f127df9 into main
James Zern [Fri, 17 Mar 2023 20:35:24 +0000 (20:35 +0000)]
Merge changes I5d9444a2,I1f127df9 into main

* changes:
  Add Neon implementation of vpx_highbd_minmax_8x8_c
  Add tests for vpx_highbd_minmax_8x8_c

16 months agoMerge "Reland "quantize: simplifly highbd 32x32_b args"" into main
James Zern [Fri, 17 Mar 2023 20:32:11 +0000 (20:32 +0000)]
Merge "Reland "quantize: simplifly highbd 32x32_b args"" into main

16 months agoAdd Neon implementation of vpx_highbd_minmax_8x8_c
Salome Thirot [Thu, 9 Mar 2023 13:58:16 +0000 (13:58 +0000)]
Add Neon implementation of vpx_highbd_minmax_8x8_c

Add Neon implementation of vpx_highbd_minmax_8x8_c as well as the
corresponding tests.

Change-Id: I5d9444a239fb1baa53634c1bdb5292b44067d90c

16 months agoAdd tests for vpx_highbd_minmax_8x8_c
Salome Thirot [Thu, 9 Mar 2023 21:04:07 +0000 (21:04 +0000)]
Add tests for vpx_highbd_minmax_8x8_c

Write tests for vpx_highbd_minmax_8x8_c, and fix initial value of min in
vpx_highbd_minmax_8x8_c.

Change-Id: I1f127df945bbb8c7d373c5430ff5f94f28575968
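
Illustration for the entry above: a minimal sketch of why the starting value of min matters in a high-bitdepth min/max routine. It assumes the problem was a seed that is too small for 16-bit samples; the commit message does not state the old or new value. HighbdMinMax8x8Sketch is a hypothetical helper, not the libvpx function, and its signature is an assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not the libvpx implementation): finds the smallest and
// largest absolute difference between two 8x8 blocks of 16-bit samples.
void HighbdMinMax8x8Sketch(const uint16_t *src, int src_stride,
                           const uint16_t *ref, int ref_stride,
                           int *min, int *max) {
  // Seed min with the largest difference two 16-bit samples can produce, so
  // the first real difference always replaces it. A seed sized for 8-bit
  // data (255) would be reported as the minimum whenever every difference in
  // the block exceeds 255.
  *min = 65535;
  *max = 0;
  for (int r = 0; r < 8; ++r) {
    for (int c = 0; c < 8; ++c) {
      // uint16_t operands are promoted to int, so the subtraction is signed.
      const int diff =
          std::abs(src[r * src_stride + c] - ref[r * ref_stride + c]);
      *min = std::min(*min, diff);
      *max = std::max(*max, diff);
    }
  }
}
```

A test that fills one block with large values and the other with zeros is enough to catch a seed that is too small, because the reported minimum would then not match any actual difference.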

16 months agoReland "quantize: simplifly highbd 32x32_b args"
Johann [Fri, 11 Nov 2022 23:23:17 +0000 (08:23 +0900)]
Reland "quantize: simplifly highbd 32x32_b args"

This is a reland of commit 573f5e662b544dbc553d73fa2b61055c30dfe8cc

Alignment issue with tests fixed in crrev.com/c/webm/libvpx/+/4305500

Original change's description:
> quantize: simplify highbd 32x32_b args
>
> Change-Id: I431a41279c4c4193bc70cfe819da6ea7e1d2fba1

Change-Id: Ic868b6f987c99d88672858fedd092fa49c125e19

16 months agoChange UpdateRateControl() to return bool
Wan-Teh Chang [Thu, 16 Mar 2023 20:30:01 +0000 (13:30 -0700)]
Change UpdateRateControl() to return bool

Change the VP9RateControlRtcConfig constructor to initialize
ss_number_layers (to 1).

Change UpdateRateControl() to return bool so that it can report failure
(due to invalid configuration).

Also change InitRateControl() to return bool to propagate the return
value of UpdateRateControl().

Note: This is a port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/172042.

Change-Id: I90b60353b5f15692dba5d89e7b1a9c81bb2fdd89
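
Illustration for the entry above: a minimal sketch of the pattern the message describes, where UpdateRateControl() reports an invalid configuration through a bool and InitRateControl() propagates that result. RateControlConfigSketch, RateControllerSketch, kMaxSpatialLayersSketch, and the specific validity check are hypothetical; only the method names and ss_number_layers come from the commit message.

```cpp
// Hypothetical upper bound used only for this sketch.
constexpr int kMaxSpatialLayersSketch = 3;

// Hypothetical stand-in for the config type. The default of 1 mirrors the
// commit's note that the constructor now initializes ss_number_layers.
struct RateControlConfigSketch {
  int ss_number_layers = 1;
};

class RateControllerSketch {
 public:
  // Propagates the result of UpdateRateControl() instead of discarding it.
  bool InitRateControl(const RateControlConfigSketch &cfg) {
    return UpdateRateControl(cfg);
  }

  // Returns false and leaves the stored config untouched when the new config
  // is invalid; the check shown here is an assumed example of "invalid".
  bool UpdateRateControl(const RateControlConfigSketch &cfg) {
    if (cfg.ss_number_layers < 1 ||
        cfg.ss_number_layers > kMaxSpatialLayersSketch) {
      return false;
    }
    cfg_ = cfg;
    return true;
  }

  const RateControlConfigSketch &config() const { return cfg_; }

 private:
  RateControlConfigSketch cfg_;
};
```

With a bool return, a caller can refuse to start encoding when the config is rejected, rather than continuing with state that was never updated.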

16 months agoMerge "Set oxcf->ts_rate_decimator[tl] only once" into main
Wan-Teh Chang [Fri, 17 Mar 2023 02:54:21 +0000 (02:54 +0000)]
Merge "Set oxcf->ts_rate_decimator[tl] only once" into main

16 months agoSet oxcf->ts_rate_decimator[tl] only once
Wan-Teh Chang [Fri, 17 Mar 2023 01:36:13 +0000 (18:36 -0700)]
Set oxcf->ts_rate_decimator[tl] only once

The code that sets oxcf->ts_rate_decimator[tl] does not need to be
inside a loop that iterates over sl. Move the code out of the sl loop so
that oxcf->ts_rate_decimator[tl] is set only once.

Change-Id: I22f6c117d200ec38a757b749a8700660d15436c1
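
Illustration for the entry above: a minimal sketch of moving a per-temporal-layer assignment out of the spatial-layer loop. EncoderConfigSketch, ApplyLayerSettingsSketch, and layer_target_bitrate are hypothetical names for this example; only ts_rate_decimator and the sl/tl loop variables come from the commit message.

```cpp
#include <vector>

// Hypothetical stand-in for the encoder config touched by the commit.
struct EncoderConfigSketch {
  std::vector<int> layer_target_bitrate;  // indexed by sl * ts_layers + tl
  std::vector<int> ts_rate_decimator;     // indexed by tl only
};

// Copies layer settings into oxcf and returns how many times a
// ts_rate_decimator element was written.
int ApplyLayerSettingsSketch(int ss_layers, int ts_layers,
                             const std::vector<int> &bitrates,
                             const std::vector<int> &decimators,
                             EncoderConfigSketch *oxcf) {
  oxcf->layer_target_bitrate.assign(ss_layers * ts_layers, 0);
  oxcf->ts_rate_decimator.assign(ts_layers, 0);

  // Values that depend on both sl and tl stay inside the nested loop.
  for (int sl = 0; sl < ss_layers; ++sl) {
    for (int tl = 0; tl < ts_layers; ++tl) {
      const int layer = sl * ts_layers + tl;
      oxcf->layer_target_bitrate[layer] = bitrates[layer];
    }
  }

  // Values that depend only on tl are set in their own loop, so each element
  // is written once instead of once per spatial layer.
  int writes = 0;
  for (int tl = 0; tl < ts_layers; ++tl) {
    oxcf->ts_rate_decimator[tl] = decimators[tl];
    ++writes;
  }
  return writes;
}
```

The result is the same as assigning inside the sl loop; the change removes redundant writes and makes it clear that the decimator does not vary by spatial layer.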

16 months agoRemove repeated field from VP9RateControlRtcConfig
Wan-Teh Chang [Thu, 16 Mar 2023 22:21:49 +0000 (15:21 -0700)]
Remove repeated field from VP9RateControlRtcConfig

Remove the `ts_number_layers` field from VP9RateControlRtcConfig because
the base class VpxRateControlRtcConfig already has that field.

Note: In commit 65a1751e5b98bf7f1d21bcbfdef352af34fb205d,
`ts_number_layers` was moved to the newly created base class
VpxRateControlRtcConfig but was inadvertently left in
VP9RateControlRtcConfig:
https://chromium-review.googlesource.com/c/webm/libvpx/+/3140048

Change-Id: I98d48e152683ec2e5e62efffb56b7f010c5d0695
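
Illustration for the entry above: a minimal sketch of why a field repeated in a derived class is a hazard in C++. The struct and function names are hypothetical; only ts_number_layers comes from the commit message. A member declared in the derived class hides the base member of the same name, so code that reads through the base type does not see writes made through the derived type.

```cpp
// Hypothetical base config holding the shared field.
struct BaseConfigSketch {
  int ts_number_layers = 1;
};

// Derived config that repeats the field: this declaration hides
// BaseConfigSketch::ts_number_layers.
struct DerivedWithDuplicateSketch : BaseConfigSketch {
  int ts_number_layers = 1;
};

// Derived config after removing the repeated field: there is one copy.
struct DerivedFixedSketch : BaseConfigSketch {};

// Shared code that only knows about the base type.
int ReadThroughBaseSketch(const BaseConfigSketch &cfg) {
  return cfg.ts_number_layers;
}
```

Setting ts_number_layers on a DerivedWithDuplicateSketch object updates only the derived copy, so ReadThroughBaseSketch() still returns the base default; with DerivedFixedSketch the same assignment is visible through the base type.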

16 months agoMerge "Update the sample code for VP9RateControlRTC" into main
Wan-Teh Chang [Thu, 16 Mar 2023 21:40:14 +0000 (21:40 +0000)]
Merge "Update the sample code for VP9RateControlRTC" into main