review.tizen.org Git - platform/upstream/libvpx.git/log

Shiyou Yin [Fri, 25 Aug 2017 06:44:02 +0000 (06:44 +0000)]

Merge "vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi."

Marco Paniconi [Thu, 24 Aug 2017 22:26:43 +0000 (22:26 +0000)]

Merge "vp9: Adjust 16x16 split threshold for variance partition"

Tom Finegan [Thu, 24 Aug 2017 19:11:48 +0000 (12:11 -0700)]

Make sure diff is present at configure time.

This avoids an endless build loop at vpx_version.h
creation time when diff is not present.

Change-Id: I16ae386dbdaf14f9a2b85e4c5d1aaa6c08f52a45

Johann Koenig [Thu, 24 Aug 2017 18:55:03 +0000 (18:55 +0000)]

Merge "quantize avx: copy 32x32 implementation"

Shiyou Yin [Thu, 24 Aug 2017 15:11:58 +0000 (23:11 +0800)]

vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi.

Change-Id: Ia576a721df6312329b599c31cfe1fb1267a9f174

Marco [Thu, 24 Aug 2017 17:36:27 +0000 (10:36 -0700)]

vp9: Adjust 16x16 split threshold for variance partition

For speeds < 7, increase threshold that controls the split
of 16x16->8x8 blocks, for resolutions 720p and higher.

Minor change for speed 5 (since it uses reference partition scheme
which only uses variance partition as first step).
For speed 6: ~0.5% increase in avgPSNR/SSIM metrics on ytlive set.
No change in speed.

Change-Id: I5126580973201538d8ca26a9256b93c4d11d685b

Johann Koenig [Thu, 24 Aug 2017 17:43:10 +0000 (17:43 +0000)]

Merge "quantize test: skip block was removed"

Johann [Wed, 23 Aug 2017 20:59:33 +0000 (13:59 -0700)]

quantize avx: copy 32x32 implementation

Ensure avx and ssse3 stay in sync by testing them against each other.

Change-Id: I699f3b48785c83260825402d7826231f475f697c

Johann [Wed, 16 Aug 2017 20:10:59 +0000 (13:10 -0700)]

quantize ssse3: copy implementation to intrinsics

Still does not pass tests. Does match the previous assembly, although
saving the sign before multiplying is dubious.

Change-Id: Ia163f18c755aba542d6e93f7bf7343184660df5a

Johann [Thu, 24 Aug 2017 14:21:42 +0000 (07:21 -0700)]

quantize test: skip block was removed

Change-Id: I1d93698bc27529b0544d79dd7b9fe37afa51ef87

Johann Koenig [Thu, 24 Aug 2017 14:04:29 +0000 (14:04 +0000)]

Merge "quantize test: set threshold for 32x32"

Shiyou Yin [Thu, 24 Aug 2017 00:55:11 +0000 (00:55 +0000)]

Merge "vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi."

Marco Paniconi [Wed, 23 Aug 2017 23:09:33 +0000 (23:09 +0000)]

Merge "vp9: SVC: Skip NEWMV for small blocks for (0, 0) base_mv."

Johann [Wed, 23 Aug 2017 22:59:11 +0000 (15:59 -0700)]

quantize test: set threshold for 32x32

Change-Id: I77be617c7d7c64929dd51c6077322f4f8ad23897

Johann Koenig [Wed, 23 Aug 2017 21:14:13 +0000 (21:14 +0000)]

Merge "quantize avx: copy implementation to intrinsics"

Marco [Wed, 23 Aug 2017 20:01:57 +0000 (13:01 -0700)]

vp9: SVC: Skip NEWMV for small blocks for (0, 0) base_mv.

For SVC encoding:
average speedup ~1.5%, with small ~0.57 loss in avgPSNR metrics.

Change-Id: Icebce6f6ef4e819d7dfcf8db898c583167351de4

Scott LaVarnway [Wed, 23 Aug 2017 19:59:25 +0000 (19:59 +0000)]

Merge "vpx_dsp: get32x32var_avx2() cleanup"

Johann Koenig [Wed, 23 Aug 2017 19:20:53 +0000 (19:20 +0000)]

Merge "quantize neon: round dqcoeff towards zero"

Johann [Tue, 22 Aug 2017 22:43:35 +0000 (15:43 -0700)]

quantize avx: copy implementation to intrinsics

Adds an early exit based on ptest. Slightly slower than ssse3 in the
full case because of the extra check, but potentially faster if lots of
rows can be skipped.

Very close in speed to the assembly.

Can run in 32 bit, unlike the assembly. Allows reworking the function
prototype to use structs.

Change-Id: If80e2b9ba059370a4cad3c973196e82a97b4330e

Johann [Mon, 21 Aug 2017 18:23:49 +0000 (11:23 -0700)]

quantize neon: round dqcoeff towards zero

Add 1 if negative to get dqcoeff to round towards zero.

10-15% faster than converting to positive before shifting.
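
The rounding trick described above can be sketched in scalar C. This is only an illustration under assumptions the commit message does not state: that the division in question is a divide by two done as a right shift, and that `>>` on a negative value is an arithmetic shift. The function names are hypothetical and are not taken from libvpx.

```c
#include <stdint.h>

/* Halve x, rounding toward zero, using only an add and a shift.
 * Assumes >> on a negative int32_t is an arithmetic shift (true on
 * mainstream compilers, but implementation-defined in ISO C). */
static int32_t half_toward_zero(int32_t x) {
  /* A plain arithmetic shift rounds toward negative infinity
   * (-3 >> 1 == -2). Adding 1 to negative inputs first corrects that. */
  const int32_t is_negative = (int32_t)((uint32_t)x >> 31); /* 1 if x < 0 */
  return (x + is_negative) >> 1;
}

/* Reference: C integer division truncates toward zero. */
static int32_t half_reference(int32_t x) { return x / 2; }
```

For example, `half_toward_zero(-3)` and `half_reference(-3)` both give -1, whereas a bare `-3 >> 1` would give -2 on an arithmetic-shift platform.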

Change-Id: I01a62fd0c9bca786b6885b318bd447bb9229903d

Johann [Thu, 10 Aug 2017 22:02:22 +0000 (15:02 -0700)]

quantize fp: neon implementation

About 4x faster when values are below the dequant threshold and 10x
faster if everything needs to be calculated.

Both numbers would improve if the division for dqcoeff could be
simplified.

BUG=webm:1426

Change-Id: I8da67c1f3fcb4abed8751990c1afe00bc841f4b2

Shiyou Yin [Tue, 22 Aug 2017 00:44:36 +0000 (08:44 +0800)]

vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi.

Change-Id: I2c782d18d9004414ba61b77238e0caf3e022d8f2

Marco Paniconi [Tue, 22 Aug 2017 22:52:05 +0000 (22:52 +0000)]

Merge "vp9: Condition lighting change detection on CBR mode."

Johann Koenig [Tue, 22 Aug 2017 22:27:56 +0000 (22:27 +0000)]

Merge changes I53f8a160,I48f282bf

* changes:
  quantize ssse3: copy style from sse2
  quantize sse2: copy opts from ssse3

Marco [Tue, 22 Aug 2017 21:46:39 +0000 (14:46 -0700)]

vp9: Condition lighting change detection on CBR mode.

This feature is used for the CBR RTC encoding mode
at speed >= 6. This change will exclude it for VBR mode.

For speed 6 live encoding (VBR):
avgPSNR/SSIM metrics on ytlive set up by ~1% (few clips up by 2/3%).
No change in speed.

Change-Id: I1a0dd94c334f7df309ab5a48d477d7e25355b798

Johann [Tue, 22 Aug 2017 21:25:27 +0000 (14:25 -0700)]

quantize ssse3: copy style from sse2

Change-Id: I53f8a160e640c674ea035fc112e207b6dca42598

Johann Koenig [Tue, 22 Aug 2017 20:03:02 +0000 (20:03 +0000)]

Merge "quantize: capture skip block early"

Johann [Tue, 22 Aug 2017 20:01:44 +0000 (13:01 -0700)]

quantize sse2: copy opts from ssse3

Simplify eob calculations based on ssse3 implementation.

General clean up and re-scoping.

Change-Id: I48f282bf9bd28ee9bc2c7a6779be9d45b5a3a3ee

Johann Koenig [Tue, 22 Aug 2017 19:19:14 +0000 (19:19 +0000)]

Merge changes Icfb70687,I9a963e99,Ie8ac00ef,I1272917c

* changes:
  quantize: ignore skip_block in arm
  quantize: ignore skip_block in x86
  quantize fp: ignore skip_block in arm
  quantize fp: ignore skip_block in x86

Johann [Tue, 22 Aug 2017 18:24:33 +0000 (11:24 -0700)]

quantize: capture skip block early

This should probably be handled before vp9_regular_quantize_b_4x4 even
gets called.

Fixes an assert resulting from removing skip_block from the quantize
functions.

BUG=webm:1459

Change-Id: I7f52b53f959b4654b3d4517ebda31a678f4d0fde

James Zern [Tue, 22 Aug 2017 00:48:39 +0000 (00:48 +0000)]

Merge "ppc: Add vpx_idct16x16_256_add_vsx"

Shiyou Yin [Tue, 22 Aug 2017 00:37:23 +0000 (00:37 +0000)]

Merge "vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi."

Johann [Mon, 21 Aug 2017 18:15:39 +0000 (11:15 -0700)]

quantize: ignore skip_block in arm

Change-Id: Icfb70687476b2edb25d255793ba325b261d40584

Johann [Mon, 21 Aug 2017 18:15:23 +0000 (11:15 -0700)]

quantize: ignore skip_block in x86

Change-Id: I9a963e99f08761f0c8d6a305619270b2f1c4edf8

Johann [Mon, 21 Aug 2017 18:14:51 +0000 (11:14 -0700)]

quantize fp: ignore skip_block in arm

Change-Id: Ie8ac00efa826eead2a227726a1add816e04ff147

Johann [Mon, 21 Aug 2017 18:14:39 +0000 (11:14 -0700)]

quantize fp: ignore skip_block in x86

Change-Id: I1272917c49cf6e6710e52c36535b2fc8c8dced78

Johann [Wed, 2 Aug 2017 21:28:05 +0000 (14:28 -0700)]

quantize test: test _fp_ version of quantize

None of the x86 optimizations pass the tests.

Change-Id: Ic67f2ba1977b657e68f2a13b0711fc5fcbafd909

Johann [Wed, 16 Aug 2017 20:34:14 +0000 (13:34 -0700)]

Remove skip_block from quantize

This condition is handled before this code is reached. The ssse3 version
of the function has always crashed when attempting to handle the
skip_block condition.

Add assert() and comments regarding the usage of skip_block.

Removing the parameter is a fairly involved process so leave it be for
the moment.

Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a

Scott LaVarnway [Fri, 18 Aug 2017 20:44:09 +0000 (13:44 -0700)]

vpx_dsp: get32x32var_avx2() cleanup

renamed to get32x16var_avx2()

BUG=webm:1404

Change-Id: Icb8f3986c9c9c646e13a69430db7235fc7e1a036

Scott LaVarnway [Fri, 18 Aug 2017 20:30:59 +0000 (20:30 +0000)]

Merge "vpx_dsp: vpx_get16x16var_avx2() cleanup"

Scott LaVarnway [Mon, 7 Aug 2017 18:56:42 +0000 (11:56 -0700)]

vpx_dsp: vpx_get16x16var_avx2() cleanup

BUG=webm:1404

Change-Id: I88aceb07f4db4870a06eee21d87296974ce3221a

Johann Koenig [Fri, 18 Aug 2017 16:00:28 +0000 (16:00 +0000)]

Merge "quantize: normalize intermediate types"

Shiyou Yin [Wed, 2 Aug 2017 06:17:09 +0000 (14:17 +0800)]

vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi.

Change-Id: Ia120ad1064d0b6106d9685cf075bdab373eef19e

James Zern [Thu, 17 Aug 2017 22:37:38 +0000 (15:37 -0700)]

highbd_idct32x32*,idct32_34_4x32_quarter_1_2: fix typo

135 -> 34

fixes unused function warnings for highbd_idct32_34_4x32_quarter_[12]

Change-Id: I4f50ff6ea514200af93dd59ff94c7f9717409682

Johann [Wed, 16 Aug 2017 17:22:48 +0000 (10:22 -0700)]

quantize: normalize intermediate types

Despite abs_coeff being a positive value, all the other implementations
treat it as signed which simplifies restoring the sign.

HBD builds cast qcoeff to avoid a visual studio warning. Match
vp9_quantize.c style of casting the entire expression.

Change-Id: I62b539b8df05364df3d7644311e325288da7c5b5

James Zern [Thu, 17 Aug 2017 06:06:09 +0000 (23:06 -0700)]

inv_txfm_sse2.h: correct idct*/iadst* prototypes

fixes mismatch between prototypes and definitions

Change-Id: Ib5e7dfcce244dbb8401815be2cdd183d96792652

Paul Wilkins [Wed, 16 Aug 2017 18:25:57 +0000 (18:25 +0000)]

Merge "Prevent parameters that can cause invalid ARF groups."

Paul Wilkins [Wed, 16 Aug 2017 18:25:29 +0000 (18:25 +0000)]

Merge "Fix corrupt arf groups due to low "lag_in_frames""

Linfeng Zhang [Wed, 16 Aug 2017 16:36:37 +0000 (16:36 +0000)]

Merge changes I08b562b6,Ia275940a,I51106e90

* changes:
  Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}
  Update highbd idct x86 optimizations.
  Update 32x32 idct sse2 and ssse3 optimizations.

paulwilkins [Wed, 16 Aug 2017 12:34:49 +0000 (13:34 +0100)]

Prevent parameters that can cause invalid ARF groups.

Having a very low "lag_in_frames" value could cause the encoder to create
incorrect / corrupt ARF groups including displayed frames that update the
ARF buffer and false overlay frames that are coded at low rate but are not
actually overlays of a real ARF frame.

This is linked to a reported unit test "slow down" where the chosen parameters
(lag of 3 frames) gave rise to such "broken" ARF group(s).

See also BUG=webm:1454

Change-Id: If52d0236243ed5552537d1ea9ed3fed8c867232c

paulwilkins [Wed, 16 Aug 2017 13:07:24 +0000 (14:07 +0100)]

Fix corrupt arf groups due to low "lag_in_frames"

Having a very small value for "lag_in_frames" can result in
corrupt arf groups including displayed frames that update
the arf buffer and fake overlay frames that are not in fact
overlays of real arfs but are nevertheless starved of bits.

Leaving lag_in_frames at the default of 25 for these 5-frame two-pass
VBR tests should now give rise to a valid ARF coding pattern
as follows: K(ey), A(rf), N(ormal), N, N, O(verlay).

This change is part of a response to BUG=webm:1454 where broken
arf groups interacted badly with a change that corrects for large rate
misses. However, it may still in some cases increase encode time by
virtue of the fact that the unit test now codes a correct coding pattern
with "hidden" ARF frames.

Change-Id: Ifd0246a4c1d0be247247c754024d7a4ed5f66a6b

Paul Wilkins [Wed, 16 Aug 2017 13:01:38 +0000 (13:01 +0000)]

Merge "Fix for encoder slowdown (for speeds >= 3)"

paulwilkins [Mon, 14 Aug 2017 15:11:34 +0000 (16:11 +0100)]

Fix for encoder slowdown (for speeds >= 3)

Some clips in the nightly unit test exhibit a significant encoder slowdown, which
appears to bisect to Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a.

The above change allowed for emergency iterations of the recode loop and
adjustment of the Q range if there is a large rate miss.

This patch disables the above adaptation for cases of cpu_speed >= 3 or, more
specifically, where cpi->sf.recode_loop >= ALLOW_RECODE_KFARFGF.

For speeds >= 3 the code does not currently run a dummy bit pack operation
inside the recode loop. Without this dummy pack operation there is no up-to-date
estimate of the current frame's size to use as a basis for assessing the
requirement for a recode. In practice it was using the previous frame's size (or 0
for the first frame), which could cause odd behavior.

If we require the emergency rate correction added in Change-Id: I6923.. for
the higher speed settings it will be necessary to enable the dummy pack
which will in turn hurt encode speed.

BUG=webm:1454

Change-Id: I4fb3c6062ca9508325a6f31582f8e80f1a9b126f

Jerome Jiang [Tue, 15 Aug 2017 18:28:54 +0000 (18:28 +0000)]

Merge "Clean up writing YUV files for debug purpose."

Marco Paniconi [Tue, 15 Aug 2017 17:53:08 +0000 (17:53 +0000)]

Merge "vp9: Denoiser fix: use correct bsize for skin detection."

Jerome Jiang [Mon, 14 Aug 2017 20:57:51 +0000 (13:57 -0700)]

Clean up writing YUV files for debug purpose.

Change legacy vp8/9_write_yuv_frame to vpx_write_yuv_files.
Delete some flags that can be enabled during build.

To enable writing denoised YUV, use the following command line:
CFLAGS='-DOUTPUT_YUV_DENOISED' ./configure --enable-vp9-temporal-denoising

For skinmap, use CFLAGS='-DOUTPUT_YUV_SKINMAP'

Change-Id: I236974ac8b3cf279d20c4dc7f6162d8b480b6528

Johann Koenig [Tue, 15 Aug 2017 17:37:59 +0000 (17:37 +0000)]

Merge changes I1f1edeaa,I89313cac

* changes:
  quantize: silence unsigned overflow warning
  quantize test: quiet overflow warning

Marco [Tue, 15 Aug 2017 17:01:09 +0000 (10:01 -0700)]

vp9: Denoiser fix: use correct bsize for skin detection.

Change-Id: I9d201fa3a4b00ebd147b57ed519fab8d59b0a802

Johann [Tue, 15 Aug 2017 16:48:24 +0000 (09:48 -0700)]

quantize: silence unsigned overflow warning

The result of the xor operation is unsigned. If coeff was negative,
this results in an unsigned value - INT_MIN.
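
As a rough illustration of the idiom involved, the sketch below computes an absolute value with a sign mask while keeping every operand signed, so the final subtraction is not performed on an unsigned intermediate. It is not necessarily the fix this commit applied. The function name is hypothetical, and the code assumes `>>` on a negative value is an arithmetic shift.

```c
#include <stdint.h>

/* Branch-free absolute value via a sign mask.
 * sign is 0 for non-negative input and -1 (all ones) for negative input.
 * Assumes arithmetic right shift; the result overflows for INT32_MIN. */
static int32_t abs_via_sign_mask(int32_t coeff) {
  const int32_t sign = coeff >> 31;
  /* For negative coeff: (coeff ^ -1) is ~coeff, and ~coeff - (-1) is
   * -coeff. For non-negative coeff both steps are no-ops. */
  return (coeff ^ sign) - sign;
}
```

With `coeff = -5` the mask is -1, the xor yields 4, and subtracting -1 gives 5.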

Change-Id: I1f1edeaa6de1f4c68b848e8a82a666d390b749f0

Scott LaVarnway [Tue, 15 Aug 2017 15:35:33 +0000 (15:35 +0000)]

Merge "vp9: strip temporal filter code"

Johann [Tue, 15 Aug 2017 15:28:09 +0000 (08:28 -0700)]

quantize test: quiet overflow warning

Promote the result of RandRange to signed

Change-Id: I89313cace3bcbe9af96946bef00b6857fc48b128
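
A minimal sketch of why the promotion matters, assuming the test subtracts an offset from an unsigned random value to produce signed inputs. `rand_range_stub` is a hypothetical, deterministic stand-in for the test's RandRange helper and `centered_value` is an illustrative name. Neither comes from the libvpx test code.

```c
#include <stdint.h>

/* Hypothetical stand-in for an RNG helper returning an unsigned value in
 * [0, range). It returns a fixed value so the sketch is deterministic.
 * range must be nonzero. */
static uint32_t rand_range_stub(uint32_t range) { return 3u % range; }

/* Value in [-half, half): convert to signed before subtracting, so the
 * subtraction is done in int and cannot wrap around as unsigned. */
static int centered_value(uint32_t half) {
  return (int)rand_range_stub(2u * half) - (int)half;
}
```

With `half = 10` the stub returns 3, so the signed form gives -7. The same subtraction done in `uint32_t` would wrap to a large positive value instead.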

Paul Wilkins [Tue, 15 Aug 2017 14:57:56 +0000 (14:57 +0000)]

Merge "Patch relating to Issue 1456."

Paul Wilkins [Tue, 15 Aug 2017 14:57:22 +0000 (14:57 +0000)]

Merge "Enable emergency fast Q adaptation for VBR test case."

Linfeng Zhang [Tue, 15 Aug 2017 00:05:22 +0000 (17:05 -0700)]

Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}

BUG=webm:1412

Change-Id: I08b562b60fa85fbc2fec1c15c323a3444b44618f

Linfeng Zhang [Mon, 14 Aug 2017 23:47:24 +0000 (16:47 -0700)]

Update highbd idct x86 optimizations.

BUG=webm:1412

Change-Id: Ia275940af7d7d8637e9a851a9e39d655bfbe4069

Linfeng Zhang [Thu, 10 Aug 2017 22:17:48 +0000 (15:17 -0700)]

Update 32x32 idct sse2 and ssse3 optimizations.

Change-Id: I51106e90344035452621c49a6e1be7d5276b6c70

Scott LaVarnway [Thu, 10 Aug 2017 23:19:18 +0000 (16:19 -0700)]

vp9: strip temporal filter code

when CONFIG_REALTIME_ONLY is enabled.

BUG=webm:1446

Change-Id: Id547783ec75383966c40ab5cf6abb4a0f7984f52

Johann Koenig [Mon, 14 Aug 2017 20:52:52 +0000 (20:52 +0000)]

Merge changes I4b4beab1,I02f74dec

* changes:
  quantize test: check skip_block
  quantize test: use negative input

Johann Koenig [Mon, 14 Aug 2017 20:46:22 +0000 (20:46 +0000)]

Merge "temporal filter test: adjust inputs and runtime"

Jerome Jiang [Mon, 14 Aug 2017 18:55:42 +0000 (11:55 -0700)]

vp9 svc: Fix the stats output when sl = 1.

Actual frame size and bitrate are all 0 when using the SVC sample encoder
with sl = 1 because the stats are set in parse_superframe_index, which
will not calculate them properly when sl = 1 since there is no superframe.

Use pkt->data.frame.sz instead when sl = 1.

Change-Id: I93f5e98a4c779e32b007e1564ba5396af9e34ad6

Scott LaVarnway [Mon, 14 Aug 2017 18:01:44 +0000 (18:01 +0000)]

Merge "vp9: strip mb graph code"

Johann [Tue, 28 Mar 2017 22:19:55 +0000 (15:19 -0700)]

temporal filter test: adjust inputs and runtime

Use input with a narrow range because the filter only applies when the
frames are similar.

Run CompareReferenceRandom more times. Especially before narrowing the
input range, the filter frequently did not apply.

Change-Id: Ie249bedf6d0d33dfa5884611cb1835788e418b38

James Zern [Mon, 14 Aug 2017 16:31:14 +0000 (09:31 -0700)]

disable SSSE3/VP9QuantizeTest* in hbd builds

this test fails with the configuration similar to the assembly prior to:
d52cb5972 quantize: copy ssse3 optimizations to intrinsics

BUG=webm:1458

Change-Id: Idc5c0b84c0598259fc49609a9f0756de531d3baf

Scott LaVarnway [Fri, 11 Aug 2017 19:24:33 +0000 (12:24 -0700)]

vp9: strip mb graph code

when CONFIG_REALTIME_ONLY is enabled.

BUG=webm:1446

Change-Id: I4b1b8e9a456830ba1b1bd3a8882e038d37ee7903

Johann [Fri, 11 Aug 2017 17:44:36 +0000 (10:44 -0700)]

Rename vp8 quantize file

BUG=webm:1457

Change-Id: Ie8fae018ad8417724fde087055b90228850d631d

Jerome Jiang [Fri, 11 Aug 2017 00:54:35 +0000 (00:54 +0000)]

Merge "vp9 SVC: Fix the denoiser frame buffer management."

Jerome Jiang [Mon, 7 Aug 2017 23:32:26 +0000 (16:32 -0700)]

vp9 SVC: Fix the denoiser frame buffer management.

Change the denoiser frame buffer management for SVC to more generally
handle the layer patterns in SVC (where last is not always refreshed).

This change is only for SVC with denoising and is bitexact.

Change-Id: Ic2b146a924cdf6e7114609158afa3d4880fe3fae

Linfeng Zhang [Thu, 10 Aug 2017 20:25:18 +0000 (20:25 +0000)]

Merge "Clean highbd idct x86 code with inline functions"

Johann Koenig [Thu, 10 Aug 2017 15:42:49 +0000 (15:42 +0000)]

Merge "neon: vpx_quantize_b_32x32"

Johann Koenig [Thu, 10 Aug 2017 15:42:20 +0000 (15:42 +0000)]

Merge "quantize: copy ssse3 optimizations to intrinsics"

paulwilkins [Tue, 8 Aug 2017 11:01:46 +0000 (12:01 +0100)]

Patch relating to Issue 1456.

Testing of 4k videos encoded with a fixed arbitrary chunking interval
uncovered a bug whereby, if a chunk ends 1 frame before a real scene cut,
the next chunk may be encoded with two consecutive key frames at the start,
with the first being assigned 0 bits.

This fix ensures that where there is a key frame group of length 1, it is
assigned at least 1 frame's worth of bits, not 0.

See also patch Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a
which by virtue of allowing fast adaptation of Q made this bug more visible.

BUG=webm:1456

Change-Id: Ic9e016cb66d489b829412052273238975dc6f6ab

Linfeng Zhang [Wed, 9 Aug 2017 00:39:04 +0000 (17:39 -0700)]

Clean highbd idct x86 code with inline functions

Created inline functions highbd_butterfly_cospi16_sse2()
and highbd_butterfly_cospi16_sse4_1()

BUG=webm:1412

Change-Id: Icbc53a73712b6207379872a5e88d0a4d09e2322a

Marco Paniconi [Tue, 8 Aug 2017 23:08:10 +0000 (23:08 +0000)]

Merge "vp9: Partition logic adjustment for speed 6 feature."

Johann [Tue, 8 Aug 2017 21:21:58 +0000 (14:21 -0700)]

quantize test: check skip_block

Not all sizes were tested previously. Only 4x4 and 32x32

Change-Id: I4b4beab1b92a810a097a7306de04cc9e0e260315

Johann [Tue, 8 Aug 2017 21:19:56 +0000 (14:19 -0700)]

quantize test: use negative input

coeff contains signed values.

Change-Id: I02f74decf30379a28122169ab3e844d0f3bd7d23

Johann [Tue, 8 Aug 2017 21:05:16 +0000 (14:05 -0700)]

neon: vpx_quantize_b_32x32

With skip block the neon is about twice as fast as C.

The neon has no shortcut for coeff < zbin so it always takes the
same amount of time. Even if the C can take the shortcut, it is over
twice as fast in neon. If it can't, that gap increases to over 10x.
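
The `coeff < zbin` shortcut that the C path can take might look roughly like the scalar sketch below. This is a simplified, hypothetical quantizer for a single coefficient, not libvpx's actual routine: the function name, the parameter set, and the 16-bit fixed-point scaling are all assumptions made for illustration, and arithmetic right shift is assumed.

```c
#include <stdint.h>

/* Simplified, hypothetical scalar quantizer for one coefficient.
 * quant is treated as a fixed-point multiplier scaled by 1 << 16. */
static int16_t quantize_one(int16_t coeff, int16_t zbin, int16_t round,
                            int16_t quant) {
  const int sign = coeff >> 15;                /* 0, or -1 if negative */
  const int abs_coeff = (coeff ^ sign) - sign; /* branch-free abs */
  int q;

  /* Shortcut: values inside the zero bin quantize to 0 with no multiply. */
  if (abs_coeff < zbin) return 0;

  q = ((abs_coeff + round) * quant) >> 16;
  return (int16_t)((q ^ sign) - sign);         /* restore the sign */
}
```

A vector implementation without this branch processes every lane the same way, which is consistent with the constant run time described above.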

BUG=webm:1426

Change-Id: I400722146c1b5a5f6289f67d85fd642463d2bfc6

Johann [Thu, 3 Aug 2017 17:22:07 +0000 (10:22 -0700)]

quantize: copy ssse3 optimizations to intrinsics

Fairly minor differences from sse2. pabsw and psignw are the big gains.
Also re-uses some values in eob calculation to avoid an extra pcmp.

Fixes test failures in HBD and OS X builds.

Allows using it in 32bit builds, where it is about 40% faster than sse2.

Substantially faster than the assembly for skip_block. 10-20% faster the
rest of the time.

Change-Id: If783bb3567e561e47667e10133b9c84414a334e2

Marco [Tue, 8 Aug 2017 17:34:47 +0000 (10:34 -0700)]

vp9: Partition logic adjustment for speed 6 feature.

When adapt_partition_source_sad is enabled (currently only at
speed 6 for resolutions <= 360p): use lower subsize (8x8 instead of 16x16)
for nonrd_select_partition on 32x32 blocks.

And force avoiding rectangular partition checks in
nonrd_pick_partition for speed >= 6.

Small increase ~0.5 in metrics for speed 6 on rtc_derf,
no change in speed.

Change-Id: Id751bc8f7573634571b2d6f5e29627cd5cebccae

Linfeng Zhang [Tue, 8 Aug 2017 00:37:02 +0000 (17:37 -0700)]

Update 32x32 idct sse2 funcs, add partial case 135

Change-Id: I2b9add83f6fd8f9138fed3bec04a59877a237a6a

Linfeng Zhang [Fri, 4 Aug 2017 00:50:03 +0000 (17:50 -0700)]

Rename highbd_multiplication_and_add_xx() to highbd_butterfly_xx()

in idct x86 code

Change-Id: I5159499a73a5c1b680516f6ca9c3d84f00c35083

Linfeng Zhang [Fri, 4 Aug 2017 00:46:21 +0000 (17:46 -0700)]

Replace multiplication_and_add() with butterfly() in idct x86 code

Change-Id: I266e45a3d75a5357c7d6e6f20ab5c6fdbfe4982e

Linfeng Zhang [Fri, 4 Aug 2017 00:42:54 +0000 (17:42 -0700)]

Update butterfly() in idct x86 optimizations.

Change-Id: Ic73e03bab9fdc085146f52094014db4af36ad701

Linfeng Zhang [Thu, 3 Aug 2017 00:48:40 +0000 (17:48 -0700)]

Add vpx_highbd_idct16x16_{10, 38, 256}_add_sse4_1

BUG=webm:1412

Change-Id: I8877c986b4042f7b8e33f5674c86700675a0e4ca

Linfeng Zhang [Fri, 4 Aug 2017 22:29:19 +0000 (15:29 -0700)]

Update for loop increment of idct x86 functions

Change-Id: Ided7895eaf41d5bc9d64fe536a17f5a078da68d4

Linfeng Zhang [Fri, 4 Aug 2017 22:10:12 +0000 (15:10 -0700)]

Update high bitdepth 16x16 idct x86 code

Prepare for high bitdepth 16x16 idct sse4.1 code.
Only moves and renames functions.

BUG=webm:1412

Change-Id: Ie056fe4494b1f299491968beadcef990e2ab714a

Johann Koenig [Fri, 4 Aug 2017 20:34:50 +0000 (20:34 +0000)]

Merge "quantize test: consolidate sizes"

Johann [Wed, 2 Aug 2017 17:24:19 +0000 (10:24 -0700)]

quantize test: consolidate sizes

Pass a max txfm size parameter and combine the base quantize
test with the 32x32 test.

Change-Id: I72ddf020fe6888e864ea9f3642ee2d9a8e48a04b

Scott LaVarnway [Fri, 4 Aug 2017 14:48:46 +0000 (07:48 -0700)]

vpx_dsp: merge avx2 variance files

BUG=webm:1404

Change-Id: Ieb8f85c3811b05df78722cb41eeb1166966ceec4

Kaustubh Raste [Fri, 4 Aug 2017 05:26:56 +0000 (10:56 +0530)]

Fix mips dspr2 6 tap filter clobber list

Change-Id: Ib7c07e6ce00a5c7e59113b16e6661a8369f9e646

Linfeng Zhang [Fri, 4 Aug 2017 01:16:35 +0000 (01:16 +0000)]

Merge "Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function"

Domain: Multimedia / Codec;
