platform/upstream/libvpx.git
Jerome Jiang [Thu, 3 Aug 2023 19:28:44 +0000 (19:28 +0000)]
Merge "vp9_quantize_fp_neon: Same params name as in decl" into main

Jerome Jiang [Thu, 3 Aug 2023 18:33:32 +0000 (18:33 +0000)]
Merge "vp9 ext rc: Add callback for tpl stats" into main

Jerome Jiang [Thu, 3 Aug 2023 18:07:55 +0000 (14:07 -0400)]
vp9_quantize_fp_neon: Same params name as in decl

Clear some clang-tidy warnings

Change-Id: Iea4c4e77b3d515ec6384bd34875a0002ab13c14c

Jerome Jiang [Tue, 1 Aug 2023 15:00:20 +0000 (11:00 -0400)]
vp9 ext rc: Add callback for tpl stats

Added test

Bug: b/294049605
Change-Id: I3967a0f915e1a6e7a0d34d04732c33e1ca6f35e7

Anupam Pandey [Tue, 13 Jun 2023 10:32:58 +0000 (16:02 +0530)]
Add test to check bit exactness of C and SIMD in VP9 encoder

This CL adds a shell script to test the bit exactness of the C and SIMD
VP9 encoders on x86 platforms.

Because C vs NEON encoding outputs are not bit-exact (BUG=webm:1809),
ARM tests are currently disabled.

BUG=webm:1800

Change-Id: Iffcc70863e8cf83ccb5bc5be73e8866165697358

Yunqing Wang [Wed, 2 Aug 2023 03:58:18 +0000 (20:58 -0700)]
Add a 10-bit test file

Added a 10-bit test file for the VP9 end-to-end C vs SIMD
bit-exactness test.

BUG=webm:1800

Change-Id: I4a864f1a740abee27049d68231adf2ec308f9a96

Johann [Fri, 28 Jul 2023 20:44:56 +0000 (05:44 +0900)]
normalize *const in rtcd

Change-Id: Iece50143b43263c0c8f90299bedd7d2a5b9aa56b

Johann [Fri, 28 Jul 2023 11:21:31 +0000 (20:21 +0900)]
remove incorrect (void)

n_coeffs is used in this function

Change-Id: I5f5d2933304bb636a33e0fa294b4526edb65a08d

Johann [Fri, 28 Jul 2023 10:37:48 +0000 (19:37 +0900)]
quantize_fp: reduce parameters

Apply the same steps as for the other quantize functions to switch to
macroblock_plane and ScanOrder.

Change-Id: I486d653326aaf52ffd3beafd2e891ba6a5d57ef3

Johann [Mon, 14 Nov 2022 07:47:33 +0000 (16:47 +0900)]
quantize: reduce parameters

Pass macroblock_plane and ScanOrder instead of looking up the values
beforehand. Avoids pushing arguments to the stack.
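
The idea of passing the owning structures rather than each looked-up table can be sketched as below. This is a toy illustration only: ExamplePlane, ExampleScanOrder and both functions are hypothetical stand-ins, and the real macroblock_plane and ScanOrder layouts and the real quantizer differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the libvpx definitions. */
typedef struct {
  const int16_t *round;
  const int16_t *quant;
} ExamplePlane;

typedef struct {
  const int16_t *scan;
} ExampleScanOrder;

/* "Before" shape: the caller looks up every table and passes each one. */
int example_quantize_args(const int16_t *coeff, int n,
                          const int16_t *round_ptr, const int16_t *quant_ptr,
                          const int16_t *scan) {
  int eob = 0;
  int i;
  assert(n >= 0);
  for (i = 0; i < n; ++i) {
    const int rc = scan[i];
    const int abs_c = coeff[rc] < 0 ? -coeff[rc] : coeff[rc];
    /* Toy quantizer: round, scale by a Q16 factor, track last nonzero. */
    if (((abs_c + round_ptr[0]) * quant_ptr[0]) >> 16) eob = i + 1;
  }
  return eob;
}

/* "After" shape: two struct pointers replace the individual tables. */
int example_quantize_structs(const int16_t *coeff, int n,
                             const ExamplePlane *plane,
                             const ExampleScanOrder *so) {
  return example_quantize_args(coeff, n, plane->round, plane->quant,
                               so->scan);
}
```

With fewer parameters, more of them are likely to fit in registers under common calling conventions, which is the effect the commit message describes.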

Change-Id: I22df6f645eb1a1d89ba5a4d9bc58acb77af51aa9

James Zern [Thu, 27 Jul 2023 02:03:04 +0000 (19:03 -0700)]
resize_test: prefer 'override' to 'virtual'

Update functions in WRITE_COMPRESSED_STREAM blocks, which are disabled
by default. This caused them to be missed in:
84e6b7ab0 test/*.cc: prefer 'override' to 'virtual'

Change-Id: I0e462263f19c15eb0a30d0c0f4e145062f789489

James Zern [Wed, 26 Jul 2023 22:29:40 +0000 (15:29 -0700)]
test/*.h: prefer 'override' to 'virtual'

created with clang-tidy --fix --checks=-*,modernize-use-override

Change-Id: I53412f35590799574edb573ae417a4a004cccd1e

James Zern [Wed, 26 Jul 2023 22:38:36 +0000 (15:38 -0700)]
encode_test_driver.h: use bool literal

Change-Id: If47be9ca0daa18d92cb849484f9e139e65e3560e

James Zern [Tue, 25 Jul 2023 19:18:03 +0000 (12:18 -0700)]
test/**.cc: use bool literals

created with clang-tidy --fix --checks=-*,modernize-use-bool-literals

Change-Id: Ifaed8ca824676555acaf1053b2a5a52c51a70638

James Zern [Tue, 25 Jul 2023 19:10:21 +0000 (12:10 -0700)]
test/decode_perf_test.cc: use nullptr

created with clang-tidy --fix --checks=-*,modernize-use-nullptr

Change-Id: Ibf4a80fa00e9b59d471c92788ec4c7c72e4662e5

James Zern [Tue, 25 Jul 2023 19:04:57 +0000 (12:04 -0700)]
test/*.cc: use '= default'

created with clang-tidy --fix --checks=-*,modernize-use-equals-default

Change-Id: Ie373fb5501491fce53479d20f3a6d908c4b7c535

James Zern [Tue, 25 Jul 2023 18:27:34 +0000 (18:27 +0000)]
Merge changes I71e1b442,Ibbfb949b into main

* changes:
  test/*.cc: prefer 'override' to 'virtual'
  test,AbstractBench: fix -Wnon-virtual-dtor

James Zern [Tue, 25 Jul 2023 00:23:23 +0000 (17:23 -0700)]
test/*.cc: prefer 'override' to 'virtual'

created with clang-tidy --fix --checks=-*,modernize-use-override

Change-Id: I71e1b4423c143b3e47fe90929ee110b307cdb565

James Zern [Sat, 8 Jul 2023 02:14:59 +0000 (19:14 -0700)]
test,AbstractBench: fix -Wnon-virtual-dtor

In file included from ../test/bench.cc:14:
../test/bench.h:17:7: warning: 'AbstractBench' has virtual functions but
non-virtual destructor [-Wnon-virtual-dtor]
class AbstractBench {

Change-Id: Ibbfb949b63c8dff936c7ed4f2d056dea0343377b

Jerome Jiang [Mon, 24 Jul 2023 22:04:58 +0000 (18:04 -0400)]
Add new_mv_count to ext rate control interface

Bug: b/290385227
Change-Id: Ia87c4bf1e9315bf1134c998f88e9d5548c497777

Jerome Jiang [Mon, 24 Jul 2023 17:08:05 +0000 (13:08 -0400)]
cleanup: _pt -> _ptr in vp9 external RC interface

Change-Id: Ic483488f8f6273e8977cfc324466bda41f1e47a7

James Zern [Thu, 13 Jul 2023 16:49:30 +0000 (09:49 -0700)]
vp9_rdopt,handle_inter_mode: fix -Wmaybe-uninitialized warning

With gcc 13.1.1

In function ‘handle_inter_mode’,
inlined from ‘vp9_rd_pick_inter_mode_sb’ at
    ../vp9/encoder/vp9_rdopt.c:3872:17:
../vp9/encoder/vp9_rdopt.c:3142:8: warning: ‘tmp_rd’ may be used
    uninitialized [-Wmaybe-uninitialized]
 3142 |     rd = tmp_rd + RDCOST(x->rdmult, x->rddiv, rs, 0);
../vp9/encoder/vp9_rdopt.c: In function ‘vp9_rd_pick_inter_mode_sb’:
../vp9/encoder/vp9_rdopt.c:2846:15: note: ‘tmp_rd’ was declared here
 2846 |   int64_t rd, tmp_rd, best_rd = INT64_MAX;

Change-Id: I8608957cc8bbeb1ae525f3c3dad6fe9785b2a9b4

James Zern [Tue, 11 Jul 2023 00:55:30 +0000 (00:55 +0000)]
Merge "vp8: remove missing prototypes from the rtcd header" into main

L. E. Segovia [Sat, 8 Jul 2023 23:30:49 +0000 (20:30 -0300)]
vp8: remove missing prototypes from the rtcd header

These were removed in If7a49e920e12f7fca0541190b87e6dae510df05c but
the leftovers can cause a build to fail if the code isn't optimized out.
I just found this out in the Meson port of libvpx for GStreamer.

BUG=webm:1584

Change-Id: I1c953720a2cbec3796200d4ec4020dca0b672bfb

James Zern [Mon, 10 Jul 2023 17:06:13 +0000 (10:06 -0700)]
vpx_free_tpl_gop_stats: normalize param name

this fixes a clang-tidy warning

Change-Id: I13f4750c15b7d6a395494c8dbcb896bde125b3c4

James Zern [Thu, 6 Jul 2023 17:10:37 +0000 (17:10 +0000)]
Merge "delete some dead code" into main

James Zern [Thu, 29 Jun 2023 16:52:26 +0000 (09:52 -0700)]
mfqe_partition: fix -Wunreachable-code

vp9/common/vp9_mfqe.c|240 col 16| warning: code will never be executed
[-Wunreachable-code]
 BLOCK_SIZE mfqe_bs, bs_tmp;
            ^~~~~~~

Change-Id: I566b20d8c294e19bc4b90b57b730f933048e71a5

Wan-Teh Chang [Wed, 28 Jun 2023 23:09:36 +0000 (16:09 -0700)]
Fix a bug in vpx_highbd_hadamard_32x32_neon().

This CL is the highbd version of
https://chromium-review.googlesource.com/c/webm/libvpx/+/4646573.

The bug is caused by the incorrect assumption that
(a / 2) + (b / 2) == (a + b) / 2 and (a / 2) - (b / 2) == (a - b) / 2.
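
A small scalar illustration of why that assumption fails, using C integer division, which truncates. The helper names are hypothetical and are not taken from libvpx; the actual fix lives in the Neon intrinsics.

```c
#include <assert.h>

/* Halve each operand first, then combine (the incorrect assumption). */
int halve_then_add(int a, int b) { return (a / 2) + (b / 2); }
int halve_then_sub(int a, int b) { return (a / 2) - (b / 2); }

/* Combine first, then halve once. */
int add_then_halve(int a, int b) { return (a + b) / 2; }
int sub_then_halve(int a, int b) { return (a - b) / 2; }

/* Self-check: odd inputs lose their low bit when halved separately. */
void halving_self_check(void) {
  assert(halve_then_add(1, 1) == 0);
  assert(add_then_halve(1, 1) == 1);
  assert(halve_then_sub(2, 1) == 1);
  assert(sub_then_halve(2, 1) == 0);
  /* Even inputs happen to agree, which can hide the bug in tests. */
  assert(halve_then_add(2, 4) == add_then_halve(2, 4));
}
```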

Also fix the Rand() inputs to Hadamard functions in unit tests.

This CL ports the following libaom CLs to libvpx:
https://aomedia-review.googlesource.com/c/aom/+/177101
https://aomedia-review.googlesource.com/c/aom/+/177241

Change-Id: Ic20e7684eab5d6507417fa2b75e572064d37ad2c

James Zern [Wed, 28 Jun 2023 19:26:32 +0000 (12:26 -0700)]
delete some dead code

follow-up to:
3ecba3980 Fix Clang -Wunreachable-code-aggressive warnings

Change-Id: I364312987bc838c69c010cce024bd3d62a918417

James Zern [Wed, 28 Jun 2023 19:21:40 +0000 (19:21 +0000)]
Merge "Fix Clang -Wunreachable-code-aggressive warnings" into main

James Zern [Sat, 3 Jun 2023 01:49:00 +0000 (18:49 -0700)]
Fix Clang -Wunreachable-code-aggressive warnings

Based on the change in libaom:
fe36011455 Fix Clang -Wunreachable-code-aggressive warnings

Clang's -Wunreachable-code-aggressive flag enables several warning flags
such as -Wunreachable-code-break and -Wunreachable-code-return. Chrome's
build system enables -Wunreachable-code-aggressive (in
build/config/compiler/BUILD.gn), so it would be good if libvpx could be
compiled without -Wunreachable-code-aggressive warnings.

This requires the VPX_NO_RETURN macro be defined correctly for all the
compilers we support, otherwise some compilers may warn about missing
return statements after a die() or fatal() call (which does not return).
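
A minimal sketch of how such a macro is commonly defined. EXAMPLE_NO_RETURN, example_die() and example_checked_div() are hypothetical names used for illustration; the actual VPX_NO_RETURN definition may cover different compilers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical macro playing the role described for VPX_NO_RETURN. */
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_NO_RETURN __attribute__((noreturn))
#elif defined(_MSC_VER)
#define EXAMPLE_NO_RETURN __declspec(noreturn)
#else
#define EXAMPLE_NO_RETURN /* unknown compiler: no annotation */
#endif

EXAMPLE_NO_RETURN void example_die(const char *msg);

void example_die(const char *msg) {
  assert(msg != NULL);
  fprintf(stderr, "%s\n", msg);
  exit(EXIT_FAILURE);
}

/* No return statement follows example_die(): with the annotation the
 * compiler knows control never comes back, so a trailing return would be
 * flagged as unreachable, and omitting it is not flagged as missing. */
int example_checked_div(int num, int den) {
  if (den != 0) return num / den;
  example_die("division by zero");
}
```

If the macro expands to nothing on some compiler, that compiler may warn about the missing return in example_checked_div(), which is the situation the commit message cautions against.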

Change-Id: I0c069133af45a7a61759538b6d74c681ea087dcd

Jerome Jiang [Wed, 28 Jun 2023 14:04:21 +0000 (10:04 -0400)]
vp9 firstpass stats in a separate header

Change-Id: If91c5c74c71affc48eb858beb314a6c194b14023

James Zern [Wed, 28 Jun 2023 00:02:47 +0000 (00:02 +0000)]
Merge changes I1c17302f,Ic084894b,I9867f5fc,Ie3faf7b3,If5dc96b7, ... into main

* changes:
  vp8_decode: fix keyframe resync after decode error
  vp8_decode: only remove threads on thread create failure
  vp8_decode: clear stream info on decoder create failure
  vp9_decodeframe,init_mt: free tile_workers on alloc failure
  vp9_alloccommon: clear allocation sizes on free
  vp9_dx_iface: fix leaks on init_decoder() failure

James Zern [Tue, 27 Jun 2023 02:25:56 +0000 (19:25 -0700)]
vp8_decode: fix keyframe resync after decode error

This fixes a crash if the application continues to call
vpx_codec_decode(). Previously a non-keyframe could cause a crash if the
decoder failed before fully initializing due to an allocation failure.
The stream info and frame resolution would be 0, skipping an allocation.

Found with vpx_dec_fuzzer_vp8 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: I1c17302f4d3a488ba3b4eefe0bf53853dc558bc1

James Zern [Tue, 27 Jun 2023 02:22:00 +0000 (19:22 -0700)]
vp8_decode: only remove threads on thread create failure

This fixes a crash if the application continues to call
vpx_codec_decode(). Previously the decoder instance would be freed,
causing a crash when attempting to access it with restart_threads=1.

Found with vpx_dec_fuzzer_vp8 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: Ic084894b776729bb1572f747082cef002f0832a8

James Zern [Tue, 27 Jun 2023 02:18:55 +0000 (19:18 -0700)]
vp8_decode: clear stream info on decoder create failure

This fixes a crash if the application continues to call
vpx_codec_decode().

Found with vpx_dec_fuzzer_vp8 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: I9867f5fc3d1163026f521a9609d3cbbc00568d1d

James Zern [Tue, 27 Jun 2023 02:09:24 +0000 (19:09 -0700)]
vp9_decodeframe,init_mt: free tile_workers on alloc failure

This avoids a crash if any of the thread allocations fail and the
application continues to call vpx_codec_decode(). Previously
num_tile_workers would be non-zero, but not equal to num_threads, which
would cause a crash during later thread management.

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: Ie3faf7b36764aebedac0924acb6e4cb7545aec7d

James Zern [Tue, 27 Jun 2023 02:06:51 +0000 (19:06 -0700)]
vp9_alloccommon: clear allocation sizes on free

This fixes reallocations (and avoids potential crashes) if any
allocation fails and the application continues to call
vpx_codec_decode().
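
The general pattern can be sketched as follows: when a buffer is freed, its recorded size is reset too, so a later size check does not skip the reallocation. ExampleBuf and both functions are hypothetical and much simpler than the vp9_alloccommon code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer record; not a libvpx type. */
typedef struct {
  unsigned char *data;
  size_t alloc_size; /* bytes currently allocated, 0 if none */
} ExampleBuf;

void example_buf_free(ExampleBuf *buf) {
  assert(buf != NULL);
  free(buf->data);
  buf->data = NULL;
  /* Without this reset, example_buf_ensure() would trust a stale size
   * and hand back a NULL buffer as if it were still allocated. */
  buf->alloc_size = 0;
}

/* Returns 0 on success, -1 on allocation failure. */
int example_buf_ensure(ExampleBuf *buf, size_t size) {
  assert(buf != NULL);
  if (buf->alloc_size >= size) return 0; /* relies on the recorded size */
  example_buf_free(buf);
  buf->data = (unsigned char *)malloc(size);
  if (buf->data == NULL) return -1;
  buf->alloc_size = size;
  return 0;
}
```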

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: If5dc96b73c02efc94ec84c25eb50d10ad6b645a6

James Zern [Sat, 24 Jun 2023 02:27:26 +0000 (19:27 -0700)]
vp9_dx_iface: fix leaks on init_decoder() failure

If any allocations fail in init_decoder() and the application continues
to call vpx_codec_decode() some of the allocations would be orphaned or
the decoder would be left in a partially initialized state.

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: I44f662526d715ecaeac6180070af40672cd42611

Wan-Teh Chang [Mon, 26 Jun 2023 21:57:53 +0000 (14:57 -0700)]
Fix a bug in vpx_hadamard_32x32_neon()

A right shift by 2 is equivalent to two halving operations if there is
no addition or subtraction between the two halving operations.

Note: Since vhaddq_s16() and vhsubq_s16() have 17-bit intermediate
precision, the Neon code doesn't need to go to int32_t as was done in
https://chromium-review.googlesource.com/c/webm/libvpx/+/4604169.
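
A scalar model of both points, under the assumption that right shifts of negative values are arithmetic (true for the usual compilers, although C leaves it implementation-defined). The model_* functions are hypothetical one-lane stand-ins, not the intrinsics themselves.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of a halving add: the sum is formed at wider precision and
 * then shifted, so the result always fits back into int16_t. */
int16_t model_hadd_s16(int16_t a, int16_t b) {
  return (int16_t)(((int32_t)a + (int32_t)b) >> 1);
}

/* One lane of a halving subtract, modelled the same way. */
int16_t model_hsub_s16(int16_t a, int16_t b) {
  return (int16_t)(((int32_t)a - (int32_t)b) >> 1);
}

/* Two back-to-back halvings, with nothing in between, equal one >> 2. */
int two_halvings_equal_shift2(int16_t x) {
  return ((x >> 1) >> 1) == (x >> 2);
}

/* Self-check at the extremes of the 16-bit range. */
void halving_model_self_check(void) {
  assert(model_hadd_s16(32767, 32767) == 32767);
  assert(model_hsub_s16(-32768, 32767) == -32768);
  assert(two_halvings_equal_shift2(-7));
}
```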

Change-Id: Ibe0691cde0fd3b94ee7c497845ba459d30d503b0

James Zern [Tue, 20 Jun 2023 20:06:32 +0000 (20:06 +0000)]
Merge "configure.sh: Improve a comment." into main

Yunqing Wang [Tue, 20 Jun 2023 16:34:58 +0000 (16:34 +0000)]
Merge "Remove vp9_diamond_search_sad_avx function" into main

Anupam Pandey [Wed, 14 Jun 2023 04:57:49 +0000 (10:27 +0530)]
Remove vp9_diamond_search_sad_avx function

This CL removes the AVX version of the vp9_diamond_search_sad function,
as no speedup over C was observed.

Change-Id: Ife6005d8e444ea2c8d07ac0f686c840344b9e0ea

Chen Wang [Fri, 16 Jun 2023 08:19:02 +0000 (16:19 +0800)]
configure.sh: Improve a comment.

The corresponding case block is not only for ARM.
The original comment text was confusing to readers.

Test: N/A, just comment text changes.

Change-Id: I3154d18d3b3d237c1eecfe07dc7ec237c98194cf
Signed-off-by: Chen Wang <wangchen20@iscas.ac.cn>

Jerome Jiang [Wed, 14 Jun 2023 20:20:30 +0000 (16:20 -0400)]
Add new_mv_count to firstpass stats

Mostly follows the logic of how it's calculated in libaom.

Bug: b/287283080
Change-Id: I9ee67d844ef9db7cca63339b5304459eaa28d324

Yunqing Wang [Mon, 12 Jun 2023 16:44:21 +0000 (16:44 +0000)]
Merge "Fix c vs intrinsic mismatch of vpx_hadamard_32x32() function" into main

Jerome Jiang [Fri, 9 Jun 2023 19:33:39 +0000 (15:33 -0400)]
RTC RC: clean up unnecessary headers

Change-Id: I77c407be59f4eb0c70a89a5fffd88c648e634123

Anupam Pandey [Tue, 6 Jun 2023 06:57:34 +0000 (12:27 +0530)]
Fix c vs intrinsic mismatch of vpx_hadamard_32x32() function

This CL resolves the mismatch between C and intrinsic implementation
of vpx_hadamard_32x32 function. The mismatch was due to integer
overflow during the addition operation in the intrinsic functions.
Specifically, the addition in the intrinsic function was performed
at the 16-bit level, while the calculation of a0 + a1 resulted in
a 17-bit value.

This code change addresses the problem by performing
the addition at the 32-bit level (with sign extension) in both SSE2
and AVX2, and then converting the results back to the 16-bit level
after a right shift.
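
The overflow and the widening fix can be illustrated in scalar C. The two functions are hypothetical one-lane models, and the 16-bit version assumes the usual two's-complement wraparound when converting back to int16_t.

```c
#include <assert.h>
#include <stdint.h>

/* Sum kept at 16 bits: wraps when a0 + a1 needs 17 bits. */
int16_t add_shift_16bit(int16_t a0, int16_t a1) {
  const uint16_t wrapped = (uint16_t)((uint16_t)a0 + (uint16_t)a1);
  const int16_t sum = (int16_t)wrapped; /* wraps on common compilers */
  return (int16_t)(sum >> 1);
}

/* Sign-extend to 32 bits, add, shift, then narrow back to 16 bits. */
int16_t add_shift_32bit(int16_t a0, int16_t a1) {
  const int32_t sum = (int32_t)a0 + (int32_t)a1;
  return (int16_t)(sum >> 1);
}

/* Self-check: small inputs agree, large inputs expose the overflow. */
void add_shift_self_check(void) {
  assert(add_shift_16bit(100, 50) == add_shift_32bit(100, 50));
  assert(add_shift_32bit(20000, 20000) == 20000);
  assert(add_shift_16bit(20000, 20000) != 20000);
}
```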

STATS_CHANGED

Change-Id: I576ca64e3b9ebb31d143fcd2da64322790bc5853

Jerome Jiang [Thu, 8 Jun 2023 14:52:45 +0000 (10:52 -0400)]
Replace NONE with NO_REF_FRAME

NONE is a common name and it has conflicts with symbols defined in
Chromium.

Bug: b/286163500
Change-Id: I3d935a786f771a4d90b258fabc6fd6c2ecbf1c59

Jerome Jiang [Thu, 8 Jun 2023 14:11:24 +0000 (14:11 +0000)]
Merge "Fix more typos (n/n)" into main

Jerome Jiang [Wed, 7 Jun 2023 21:10:04 +0000 (21:10 +0000)]
Merge "Fix more typos (3/n)" into main

Jerome Jiang [Wed, 7 Jun 2023 20:35:19 +0000 (16:35 -0400)]
Fix more typos (n/n)

impace -> impact
taget -> target
prediciton -> prediction
addtion -> addition
the the -> the

Bug: webm:1803
Change-Id: I759c9d930a037ca69662164fcd6be160ed707d77

Jerome Jiang [Wed, 7 Jun 2023 16:36:31 +0000 (12:36 -0400)]
Fix more typos (3/n)

Propogation -> Propagation
propogate -> propagate
cant -> can't
upto -> up to
canddiates -> candidates
refernce -> reference
USEAGE -> USAGE

Change-Id: Iadaf2dffd86b54e04411910f667e8c2dfc6c4c77

Jerome Jiang [Wed, 7 Jun 2023 19:10:43 +0000 (19:10 +0000)]
Merge "Fix more typos (2/n)" into main

Jerome Jiang [Wed, 7 Jun 2023 19:10:36 +0000 (19:10 +0000)]
Merge "Fix more typos (1/n)" into main

Jerome Jiang [Wed, 7 Jun 2023 18:19:08 +0000 (18:19 +0000)]
Merge "Fix a few typos" into main

Jerome Jiang [Wed, 7 Jun 2023 16:31:38 +0000 (12:31 -0400)]
Fix more typos (2/n)

kernal -> kernel
e.g -> e.g.
paritioning -> partitioning
partioning -> partitioning
coefficents -> coefficients
i.e, -> i.e.,
equivalend -> equivalent
recive -> receive
resoultions -> resolutions

Bug: webm:1803
Change-Id: I1d6176202ee5daee7a64bf59114e8b304aeb4db7

Jerome Jiang [Wed, 7 Jun 2023 16:26:55 +0000 (12:26 -0400)]
Fix more typos (1/n)

Dont -> Don't
setings -> settings
thresold -> thresh
thresold -> threshold
becasue -> because
itterations -> iterations
its a -> it's a
an constant -> a constant

Bug: webm:1803
Change-Id: I1e019393939ed25c59c898c88d4941ec360b026d

Jerome Jiang [Wed, 7 Jun 2023 16:21:38 +0000 (12:21 -0400)]
Fix a few typos

segement -> segment
dont -> don't
useage -> usage
devide -> divide

Bug: webm:1803
Change-Id: I0153380b0003825c4b62cf323d4f2bc837c8a264

Deepa K G [Tue, 6 Jun 2023 06:08:09 +0000 (11:38 +0530)]
Add comments in vp9_diamond_search_sad_avx()

Added comments related to re-arranging the
elements of the SAD vector to find the
minimum.

Change-Id: I58b702d304a6cdd32f04775fba603e39c19a8947

Deepa K G [Mon, 24 Apr 2023 10:26:18 +0000 (15:56 +0530)]
Fix c vs avx mismatch of diamond_search_sad()

In the function vp9_diamond_search_sad_avx(), the cost vector is now
arranged in a specific order. This ensures that the motion vector
with the lowest index is selected when more than one candidate
motion vector has the minimum cost, thus resolving the
C vs AVX mismatch.
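
The tie-breaking rule a vectorized search has to reproduce can be written as a scalar reference. first_min_index() is a hypothetical helper, not the libvpx C implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the lowest index holding the minimum cost; n must be > 0. */
size_t first_min_index(const unsigned int *cost, size_t n) {
  size_t best = 0;
  size_t i;
  assert(n > 0);
  for (i = 1; i < n; ++i) {
    /* Strict '<' keeps the earliest candidate when costs tie. */
    if (cost[i] < cost[best]) best = i;
  }
  return best;
}
```

A SIMD reduction that compares lanes in a different order can return a later index for the same minimum, which changes the chosen motion vector even though the cost is identical.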

STATS_CHANGED

Change-Id: I4f8864f464f9ea2aae6250db3d8ad91cb08b26e2

Jerome Jiang [Wed, 31 May 2023 19:31:04 +0000 (19:31 +0000)]
Merge "Trim tpl stats by 2 extra frames" into main

Jerome Jiang [Fri, 26 May 2023 16:02:36 +0000 (12:02 -0400)]
Trim tpl stats by 2 extra frames

Not applicable to the last GOP.

Bug: b/284162396
Change-Id: I55b7e04e9fc4b68a08ce3e00b10743823c828954

James Zern [Wed, 31 May 2023 17:44:00 +0000 (17:44 +0000)]
Merge changes I6a906803,I0307a3b6 into main

* changes:
  Optimize Neon implementation of vpx_int_pro_row
  Optimize Neon implementation of vpx_int_pro_col

Jonathan Wright [Tue, 30 May 2023 16:31:18 +0000 (17:31 +0100)]
Optimize Neon implementation of vpx_int_pro_row

Double the number of accumulator registers to remove the bottleneck.
Also peel the first loop iteration.

Change-Id: I6a90680369f9c33cdfe14ea547ac1569ec3f50de

Jonathan Wright [Tue, 30 May 2023 14:22:04 +0000 (15:22 +0100)]
Optimize Neon implementation of vpx_int_pro_col

Use widening pairwise addition instructions to halve the number of
additions required.
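
What a widening pairwise addition does can be modelled in scalar C: adjacent 8-bit elements are summed into 16-bit results, so one step both widens and halves the element count. pairwise_widen_add_u8() is a hypothetical model and does not reproduce the Neon code in this change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds adjacent pairs of 8-bit values into 16-bit results.
 * n is the input length and must be even; out holds n / 2 values. */
void pairwise_widen_add_u8(const uint8_t *in, size_t n, uint16_t *out) {
  size_t i;
  assert(n % 2 == 0);
  for (i = 0; i < n / 2; ++i) {
    /* 255 + 255 = 510 fits in 16 bits, so no overflow is possible. */
    out[i] = (uint16_t)(in[2 * i] + in[2 * i + 1]);
  }
}
```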

Change-Id: I0307a3b65e50d2b1ae582938bc5df9c2b21df734

James Zern [Thu, 25 May 2023 04:54:09 +0000 (04:54 +0000)]
Merge changes Ia3647698,I55caf34e,Id2c60f39 into main

* changes:
  vpx_dsp_common.h,clip_pixel: work around VS2022 Arm64 issue
  fdct_partial_neon.c: work around VS2022 Arm64 issue
  fdct8x8_test.cc: work around VS2022 Arm64 issue

James Zern [Wed, 24 May 2023 17:43:20 +0000 (17:43 +0000)]
Merge "examples.mk,vpxdec: rm libwebm muxer dependency" into main

Jerome Jiang [Wed, 24 May 2023 16:27:20 +0000 (16:27 +0000)]
Merge "Add IO for TPL stats" into main

James Zern [Tue, 23 May 2023 22:50:10 +0000 (15:50 -0700)]
vpx_dsp_common.h,clip_pixel: work around VS2022 Arm64 issue

cl.exe targeting AArch64 with optimizations enabled
produces invalid code for clip_pixel() when the return type is uint8_t.
See:
https://developercommunity.visualstudio.com/t/Misoptimization-for-ARM64-in-VS-2022-17/10363361

Bug: b/277255076
Bug: webm:1788
Change-Id: Ia3647698effd34f1cf196cd33fa4a8cab9fa53d6

James Zern [Tue, 23 May 2023 22:49:29 +0000 (15:49 -0700)]
fdct_partial_neon.c: work around VS2022 Arm64 issue

cl.exe targeting AArch64 with optimizations enabled
will fail with an internal compiler error.
See:
https://developercommunity.visualstudio.com/t/Compiler-crash-C1001-when-building-a-for/10346110

Bug: b/277255076
Bug: webm:1788
Change-Id: I55caf34e910dab47a7775f07280677cdfe606f5b

James Zern [Tue, 23 May 2023 22:48:10 +0000 (15:48 -0700)]
fdct8x8_test.cc: work around VS2022 Arm64 issue

cl.exe targeting AArch64 with optimizations enabled
produces invalid code in RunExtremalCheck() and RunInvAccuracyCheck().
See:
https://developercommunity.visualstudio.com/t/1770-preview-1:-Misoptimization-for-AR/10369786

Bug: b/277255076
Bug: webm:1788
Change-Id: Id2c60f3948d8f788c78602aea1b5232133415dea

Jerome Jiang [Thu, 4 May 2023 14:48:25 +0000 (10:48 -0400)]
Add IO for TPL stats

Overload TempOutFile constructor to allow IO mode.

Bug: b/281563704

Change-Id: I1f4f5b29db0e331941b6795e478eeeab51f625ad

Jerome Jiang [Thu, 18 May 2023 17:20:03 +0000 (17:20 +0000)]
Merge "Add new vpx_tpl.h API file" into main

Yunqing Wang [Thu, 18 May 2023 15:48:49 +0000 (15:48 +0000)]
Merge "Improve convolve AVX2 intrinsic for speed" into main

Jerome Jiang [Tue, 16 May 2023 18:57:05 +0000 (14:57 -0400)]
Add new vpx_tpl.h API file

New file (vpx_tpl.c) in the following CLs will add new APIs dealing with
TPL stats from VP9 encoder.

Change-Id: I5102ef64214cba1ca6ecea9582a19049666c6ca4

Anupam Pandey [Fri, 12 May 2023 05:26:45 +0000 (10:56 +0530)]
Improve convolve AVX2 intrinsic for speed

This CL refactors the code related to the convolve function.
It also improves the AVX2 intrinsics that compute the vertical
convolve for the w = 4 case and the horizontal convolve for the
w = 16 case.

Module-level scaling w.r.t. the C function (timer based) for the
existing and new AVX2 intrinsics:

Block    AVX2         AVX2
size     (existing)   (new)
4x4       5.34x        5.91x
4x8       7.10x        7.79x
16x8     23.52x       25.63x
16x16    29.47x       30.22x
16x32    33.42x       33.44x

This is a bit-exact change.

Change-Id: If130183bc12faab9ca2bcec0ceeaa8d0af05e413

James Zern [Tue, 16 May 2023 00:05:05 +0000 (00:05 +0000)]
Merge changes Ie77ad184,Idfcac43c into main

* changes:
  Add 2D-specific Neon horizontal convolution functions
  Refactor standard bitdepth Neon convolution functions

Jonathan Wright [Thu, 4 May 2023 15:33:38 +0000 (16:33 +0100)]
Add 2D-specific Neon horizontal convolution functions

2D 8-tap convolution filtering is performed in two passes -
horizontal and vertical. The horizontal pass must produce enough
input data for the subsequent vertical pass - 3 rows above and 4 rows
below, in addition to the actual block height.

At present, all Neon horizontal convolution algorithms process 4 rows
at a time, but this means we end up doing at least 1 row too much
work in the 2D first pass case where we need h + 7, not h + 8 rows of
output.
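
The row arithmetic described above can be sketched as follows. The macro and function names are hypothetical; only the h + 7 and groups-of-4 figures come from the message.

```c
#include <assert.h>

#define EXAMPLE_TAPS 8 /* 8-tap filter: 3 rows above + 4 rows below */

/* Rows the horizontal pass must produce for a block of height h. */
int rows_needed(int h) {
  assert(h > 0);
  return h + EXAMPLE_TAPS - 1;
}

/* Rows actually computed when the kernel works in groups of 4 rows. */
int rows_in_groups_of_4(int h) { return (rows_needed(h) + 3) & ~3; }

/* Redundant rows: 1 for every block height that is a multiple of 4. */
int redundant_rows(int h) { return rows_in_groups_of_4(h) - rows_needed(h); }
```

For h = 4 this gives 11 rows needed against 12 computed, and for h = 64 it gives 71 against 72, so the share of redundant work shrinks as the block height grows.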

This patch adds additional dot-product (SDOT and USDOT) Neon paths
that process h + 7 rows of data exactly, saving the work of the
unnecessary extra row. It is impractical to take a similar approach
for the Armv8.0 MLA paths since we have to transpose the data block
both before and after calling the convolution helper functions.

vpx_convolve_neon performance impact: we observe a speedup of ~9% for
smaller (and wider) blocks, and a speedup of 0-3% for larger blocks.
This is to be expected since the proportion of redundant work
decreases as the block height increases.

Change-Id: Ie77ad1848707d2d48bb8851345a469aae9d097e1

James Zern [Fri, 12 May 2023 19:23:47 +0000 (19:23 +0000)]
Merge "Don't use -Wl,-z,defs with Clang's sanitizers" into main

James Zern [Mon, 8 May 2023 23:58:59 +0000 (16:58 -0700)]
Don't use -Wl,-z,defs with Clang's sanitizers

This avoids link errors related to the sanitizers:
https://clang.llvm.org/docs/AddressSanitizer.html#usage
"When linking shared libraries, the AddressSanitizer run-time is not
linked, so -Wl,-z,defs may cause link errors ..."

See also:
https://crbug.com/aomedia/3438

Bug: webm:1801
Fixed: webm:1801
Change-Id: Ie212318005a5f7222e5486775175534025306367

Jonathan Wright [Mon, 8 May 2023 16:41:26 +0000 (17:41 +0100)]
Refactor standard bitdepth Neon convolution functions

1) Use #define constant instead of magic numbers for right shifts.
2) Move saturating narrow into helper functions that return 4-element
   result vectors.
3) Use mem_neon.h helpers for load/store sequences in Armv8.0 paths.
4) Tidy up: assert conditions and some longer variable names.
5) Prefer != 0 to > 0 where possible for loop termination conditions.

Change-Id: Idfcac43ca38faf729dca07b8cc8f7f45ad264d24

James Zern [Mon, 17 Apr 2023 20:42:11 +0000 (13:42 -0700)]
configure: add -Wshadow

libraries under third_party/ are out of scope for this change.

Bug: webm:1793
Change-Id: I562065a3c0ea9fdfc9615d1a6b1ae47da79b8ce0

James Zern [Tue, 9 May 2023 21:03:31 +0000 (21:03 +0000)]
Merge "vp8_macros_msa.h: clear -Wshadow warnings" into main

James Zern [Tue, 9 May 2023 20:55:55 +0000 (20:55 +0000)]
Merge changes Iac020280,I8ca8660a into main

* changes:
  gen_msvs_vcxproj: add ARM64EC w/VS >= 2022
  configure: add clang-cl vs1[67] arm64 targets

Yunqing Wang [Tue, 9 May 2023 15:57:09 +0000 (15:57 +0000)]
Merge "Add AVX2 intrinsic for vpx_comp_avg_pred() function" into main

Anupam Pandey [Mon, 8 May 2023 06:40:09 +0000 (12:10 +0530)]
Add AVX2 intrinsic for vpx_comp_avg_pred() function

The module level scaling w.r.t C function (timer based) for
existing (SSE2) and new AVX2 intrinsics:

If ref_padding = 0
Block     Scaling
size    SSE2    AVX2
8x4     3.24x   3.24x
8x8     4.22x   4.90x
8x16    5.91x   5.93x
16x8    1.63x   3.52x
16x16   1.53x   4.19x
16x32   1.38x   4.82x
32x16   1.28x   3.08x
32x32   1.45x   3.13x
32x64   1.38x   3.04x
64x32   1.39x   2.12x
64x64   1.46x   2.24x

If ref_padding = 8
Block     Scaling
size    SSE2    AVX2
8x4     3.20x   3.21x
8x8     4.61x   4.83x
8x16    5.50x   6.45x
16x8    1.56x   3.35x
16x16   1.53x   4.19x
16x32   1.37x   4.83x
32x16   1.28x   3.07x
32x32   1.46x   3.29x
32x64   1.38x   3.22x
64x32   1.38x   2.14x
64x64   1.38x   2.12x

This is a bit-exact change.

Change-Id: I72c5d155f64d0c630bc8c3aef21dc8bbd045d9e6

James Zern [Mon, 8 May 2023 18:48:15 +0000 (11:48 -0700)]
vp8_macros_msa.h: clear -Wshadow warnings

Bug: webm:1793
Change-Id: Ia940b06bd23a915a050432e03bb630567e891d8d

James Zern [Mon, 8 May 2023 20:52:52 +0000 (20:52 +0000)]
Merge "README: update target list" into main

James Zern [Mon, 8 May 2023 20:52:31 +0000 (20:52 +0000)]
Merge changes Ie165d410,I6d9bb8da,I6858e574 into main

* changes:
  vp8_[cd]x_iface: clear setjmp flag on function exit
  vp9_decodeframe,tile_worker_hook: relocate setjmp=1
  vp9,encoder_set_config: set setjmp flag after setjmp()

Jerome Jiang [Mon, 8 May 2023 19:47:30 +0000 (19:47 +0000)]
Merge "Add VpxTplGopStats" into main

Jerome Jiang [Mon, 8 May 2023 19:47:21 +0000 (19:47 +0000)]
Merge "Unify implementation of CHECK_MEM_ERROR" into main

Jerome Jiang [Mon, 8 May 2023 19:46:44 +0000 (19:46 +0000)]
Merge "CHECK_MEM_ERROR to return in vp9_set_roi_map" into main

James Zern [Sat, 6 May 2023 02:00:08 +0000 (19:00 -0700)]
gen_msvs_vcxproj: add ARM64EC w/VS >= 2022

rather than define new targets, add a platform to the arm64 list as they
share the same configuration.

Bug: webm:1788
Change-Id: Iac020280b1103fb12b559f21439aeff26568fba4

James Zern [Fri, 5 May 2023 23:56:51 +0000 (16:56 -0700)]
configure: add clang-cl vs1[67] arm64 targets

x86 and armv7 are skipped for now as the intrinsics will need different
flags than cl.exe (/arch:... -> -m...).

Bug: webm:1788
Change-Id: I8ca8660a8644cdd84c51cb1f75005e371ba8207d

Jerome Jiang [Thu, 4 May 2023 18:28:29 +0000 (14:28 -0400)]
Add VpxTplGopStats

Contains the size of GOP - also the size of the list of TPL stats for
each frame in this GOP.

VpxTplGopStats will be the unit for VP9E_GET_TPL_STATS control to return
TPL stats from the encoder.

Bug: b/273736974
Change-Id: I1682242fc6db4aafcd6314af023aa0d704976585

Jerome Jiang [Fri, 5 May 2023 02:03:27 +0000 (22:03 -0400)]
Unify implementation of CHECK_MEM_ERROR

There were multiple implementations of CHECK_MEM_ERROR across the
library that took different arguments and were used in different places.

This CL will unify them and have only one implementation that takes
vpx_internal_error_info.

Change-Id: I2c568639473815bc00b1fc2b72be56e5ccba1a35

Jerome Jiang [Mon, 8 May 2023 14:37:54 +0000 (10:37 -0400)]
CHECK_MEM_ERROR to return in vp9_set_roi_map

Also change the return type of vp9_set_roi_map to vpx_codec_err_t

Change-Id: I60d9ff45f2d3dfc44cd6e2aab2cb1ba389ff15f3

James Zern [Sat, 6 May 2023 22:48:58 +0000 (15:48 -0700)]
examples.mk,vpxdec: rm libwebm muxer dependency

vpxdec only requires the parser.

Change-Id: I54ead453d4af400ca5c3412a3211d6d0b1383046

James Zern [Sat, 6 May 2023 02:26:55 +0000 (02:26 +0000)]
Merge "vp9_encoder: clear -Wshadow warning" into main