platform/upstream/libvpx.git
Johann [Wed, 15 Apr 2015 19:12:58 +0000 (12:12 -0700)]
Merge "Allow specifying a different prefix in x86inc.asm"

Johann [Tue, 14 Apr 2015 19:25:14 +0000 (15:25 -0400)]
Allow specifying a different prefix in x86inc.asm

Currently the prefix is forced to vp9 for any function using
x86inc.asm.

Change-Id: Icbca57ce68a52e743bdd7e9be86cfe8353f274c1

Johann [Wed, 15 Apr 2015 13:36:45 +0000 (06:36 -0700)]
Merge "Remove unused scaleopt.cpp"

Yunqing Wang [Tue, 14 Apr 2015 22:12:40 +0000 (15:12 -0700)]
Revert "Force_split on 16x16 blocks in variance partition."

This reverts commit eb8c667570aa83134c7db0690de9dbdde4d90291.
The patch caused a mismatch when using multiple threads.

Change-Id: Icd646340af25b5d91e32f03ed3ea212e00e3e0be

Johann [Tue, 14 Apr 2015 20:59:30 +0000 (16:59 -0400)]
Remove unused scaleopt.cpp

Change-Id: Ibaeede61c128c73809332b9a853cd62b8d6d5325

Marco [Tue, 14 Apr 2015 16:44:58 +0000 (09:44 -0700)]
Merge "Force_split on 16x16 blocks in variance partition."

hkuang [Mon, 13 Apr 2015 21:33:43 +0000 (14:33 -0700)]
Merge "Remove unnecessary set postproc flags."

Marco [Wed, 18 Mar 2015 20:55:19 +0000 (13:55 -0700)]
Force_split on 16x16 blocks in variance partition.

Force split on 16x16 block (to 8x8) based on the minmax over the 8x8 sub-blocks.

Also increase the variance threshold for 32x32, and add an exit condition in choose_partition
(with a very safe threshold) based on the SAD used to select the reference frame.

Some visual improvement near moving boundaries.
Average gain in psnr/ssim: ~0.6%, some clips go up ~1 or 2%.
Encoding time increases (due to more 8x8 blocks) by ~1-4%, depending on the clip.

Change-Id: I4759bb181251ac41517cd45e326ce2997dadb577
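
The force-split test described in this commit can be sketched roughly as follows. This is a minimal illustration, not the libvpx code: the function names, the use of the source/prediction absolute difference, and the rule "split when the spread of the four per-8x8 ranges exceeds a threshold" are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: range (max - min) of the absolute source/prediction
 * difference inside one 8x8 block. */
static int minmax_8x8(const uint8_t *src, int src_stride,
                      const uint8_t *pred, int pred_stride) {
  int lo = 255, hi = 0;
  for (int r = 0; r < 8; ++r) {
    for (int c = 0; c < 8; ++c) {
      const int d = abs(src[r * src_stride + c] - pred[r * pred_stride + c]);
      if (d < lo) lo = d;
      if (d > hi) hi = d;
    }
  }
  return hi - lo;
}

/* Hypothetical decision: returns 1 when the 16x16 block should be forced to
 * split into 8x8 blocks, i.e. when the four sub-block ranges differ from each
 * other by more than `threshold` (an assumed tuning parameter). */
static int force_split_16x16(const uint8_t *src, int src_stride,
                             const uint8_t *pred, int pred_stride,
                             int threshold) {
  assert(src != NULL && pred != NULL && threshold >= 0);
  int lo = 255, hi = 0;
  for (int k = 0; k < 4; ++k) {
    const int r0 = (k >> 1) * 8; /* sub-block row offset: 0 or 8 */
    const int c0 = (k & 1) * 8;  /* sub-block column offset: 0 or 8 */
    const int m = minmax_8x8(src + r0 * src_stride + c0, src_stride,
                             pred + r0 * pred_stride + c0, pred_stride);
    if (m < lo) lo = m;
    if (m > hi) hi = m;
  }
  return (hi - lo) > threshold;
}
```

A flat residual gives the same range in every sub-block and no split, while a single outlier pixel in one sub-block is enough to trigger the split in this sketch.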

Parag Salasakar [Fri, 10 Apr 2015 04:50:15 +0000 (21:50 -0700)]
Merge "mips msa vp9 common headers added"

Jingning Han [Thu, 9 Apr 2015 21:45:19 +0000 (14:45 -0700)]
Merge "Remove get_nonrd_var_based_fixed_partition function"

Jingning Han [Thu, 9 Apr 2015 21:45:11 +0000 (14:45 -0700)]
Merge "Compute prediction filter type cost only when needed"

Jingning Han [Thu, 9 Apr 2015 18:16:11 +0000 (11:16 -0700)]
Merge "SSSE3 assembly implementation of 8x8 Hadamard transform"

hkuang [Thu, 9 Apr 2015 17:08:06 +0000 (10:08 -0700)]
Merge "Remove unnecessary mv clamp with on demand border extension."

Jingning Han [Thu, 9 Apr 2015 16:49:09 +0000 (09:49 -0700)]
Remove get_nonrd_var_based_fixed_partition function

This function has been replaced by other approaches and is not
in use now.

Change-Id: I387f45b5607d202539e482468ccc70e6c0f9341f

Parag Salasakar [Wed, 25 Mar 2015 09:34:48 +0000 (15:04 +0530)]
mips msa vp9 common headers added

Change-Id: Ia31ada59172eb1818e1eb91009f83cbb1f581223

James Zern [Thu, 9 Apr 2015 04:01:34 +0000 (21:01 -0700)]
Merge "vpxdec.sh: fix vp9_webm_less_than_50_frames w/valgrind"

hkuang [Mon, 6 Apr 2015 17:17:14 +0000 (10:17 -0700)]
Remove unnecessary mv clamp with on demand border extension.

Change-Id: Ia2956f06f409b9b0ca8320ca4c1ea5680e938402

Frank Galligan [Sun, 22 Mar 2015 20:41:13 +0000 (13:41 -0700)]
Refactor dec_build_inter_predictors

Refactor the loops in dec_build_inter_predictors to try to decrease
the number of instructions. Limited testing showed about a 1% performance
increase on x86 and about a 0.67% performance increase on ARM.

Change-Id: I69cfe6335bb562fbaaebf43fb3f5c5a2a28882a2

James Zern [Wed, 8 Apr 2015 18:45:04 +0000 (11:45 -0700)]
vpxdec.sh: fix vp9_webm_less_than_50_frames w/valgrind

add a check for the status line to awk and better report failure given
the program output will be lost in this case

Change-Id: I1348a80108c81099d609f2e2227dd2c31bd8cd54

Debargha Mukherjee [Wed, 8 Apr 2015 17:48:17 +0000 (10:48 -0700)]
Merge "Improve accuracy of rate control in CQ mode"

James Zern [Wed, 8 Apr 2015 03:57:21 +0000 (20:57 -0700)]
Merge "vp9_full_search_sadx[38]: align sad arrays"

Yaowu Xu [Tue, 7 Apr 2015 23:29:51 +0000 (16:29 -0700)]
Merge "Optimize the checking for transform skipping"

Yaowu Xu [Tue, 7 Apr 2015 23:29:45 +0000 (16:29 -0700)]
Merge "move ref_frame_cost computations into a function"

Debargha Mukherjee [Tue, 7 Apr 2015 23:15:11 +0000 (16:15 -0700)]
Improve accuracy of rate control in CQ mode

Modifies a special handling that improves rate control accuracy in
the constrained quality mode, when the undershoot and overshoot
limits are set tighter.

Change-Id: If62103f0ef3ed1cac92807400678c93da50cf046

Yaowu Xu [Tue, 7 Apr 2015 23:08:25 +0000 (16:08 -0700)]
Merge "Test loopfilters with count=2"

James Zern [Tue, 7 Apr 2015 21:30:17 +0000 (14:30 -0700)]
vp9_full_search_sadx[38]: align sad arrays

the sse4 code expects 16-byte aligned arrays; vp8 already had a similar
change applied:
b2aa401 Align SAD output array to be 16-byte aligned

Change-Id: I5e902035e5a87e23309e151113f3c0d4a8372226
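
The alignment requirement mentioned in this commit can be illustrated with a small standalone sketch. It uses the C11 `alignas` specifier rather than any libvpx macro, and the names `sum_sads8` and `demo_aligned_sad_sum` are hypothetical; the consumer only stands in for SIMD code that would use aligned 16-byte loads and stores.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Returns non-zero when p sits on a 16-byte boundary. */
static int is_aligned_16(const void *p) {
  return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical consumer standing in for SIMD code that requires an aligned
 * array; the assert makes the precondition explicit. */
static unsigned int sum_sads8(const unsigned int *sads) {
  assert(is_aligned_16(sads));
  unsigned int total = 0;
  for (int i = 0; i < 8; ++i) total += sads[i];
  return total;
}

/* The caller requests the alignment at the point of declaration. */
static unsigned int demo_aligned_sad_sum(void) {
  alignas(16) unsigned int sad_array8[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  return sum_sads8(sad_array8);
}
```

Without the `alignas(16)` request, a plain `unsigned int` array is only guaranteed 4-byte alignment on typical targets, so an aligned SIMD access to it may fault depending on where the compiler places it.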

Jingning Han [Tue, 7 Apr 2015 19:51:27 +0000 (12:51 -0700)]
Merge "Enable Hadamard transform based cost estimate for all block sizes"

Jingning Han [Tue, 7 Apr 2015 19:50:30 +0000 (12:50 -0700)]
Merge "Account for eob cost in the RTC mode decision process"

Jingning Han [Tue, 7 Apr 2015 19:39:54 +0000 (12:39 -0700)]
Compute prediction filter type cost only when needed

Skip redundant prediction filter type cost in filter search loop,
if the rate value will be reset in Hadamard transform based rate
distortion estimate.

Change-Id: Ie5221f4bc8da9461c449df367251aeeac52c6e5d

Vignesh Venkatasubramanian [Tue, 7 Apr 2015 18:53:43 +0000 (11:53 -0700)]
Merge "webmdec: Fix for reaching eof in webm_guess_framerate"

Vignesh Venkatasubramanian [Fri, 3 Apr 2015 22:45:14 +0000 (15:45 -0700)]
webmdec: Fix for reaching eof in webm_guess_framerate

Reset the reached_eos flag in webm_guess_framerate in case it ends
up consuming the entire file. Also add a vpxdec shell test to
verify this behavior.

Change-Id: I371eebd2105231dc0f60e65da1f71b233ad14be5

Yaowu Xu [Tue, 7 Apr 2015 00:53:55 +0000 (17:53 -0700)]
Optimize the checking for transform skipping

If U is not skippable, then do not perform the check on V.

Change-Id: Iba5e8362bd42390197f373c44388a426a4404549
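
The early exit described in this commit amounts to short-circuiting the chroma check. The sketch below is only an illustration: the variance-versus-threshold test and every name in it are assumptions, and the `checks_done` counter exists solely to make the saved work visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-plane test: a plane counts as skippable when its
 * residual variance is below a threshold. */
static bool plane_is_skippable(uint32_t variance, uint32_t threshold) {
  return variance < threshold;
}

/* Returns true only if both chroma planes are skippable. V is examined
 * only after U has passed, so a failing U costs a single check.
 * var_uv[0] is the U variance and var_uv[1] is the V variance. */
static bool uv_is_skippable(const uint32_t var_uv[2], uint32_t threshold,
                            int *checks_done) {
  assert(var_uv != NULL && checks_done != NULL);
  *checks_done = 1;
  if (!plane_is_skippable(var_uv[0], threshold)) return false;
  *checks_done = 2;
  return plane_is_skippable(var_uv[1], threshold);
}
```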

Jim Bankoski [Tue, 7 Apr 2015 00:05:35 +0000 (17:05 -0700)]
Merge changes Ide5eefad,I28026b86,Ie9a6fac0,Ia8a20c67,I8c7f5b97,I33ca9cdd,I438cbf49

* changes:
  vp8_regular_quantize_b_sse2: remove dead init
  vp8cx_pick_filter_level*: remove dead inits
  vp8_decode_frame: remove dead increment
  rdopt: remove dead stores
  find_next_key_frame: remove dead init & store
  multiframe_quality_enhance_block: remove dead stores
  vp8_print_modes_and_motion_vectors: remove dead stores

Jingning Han [Sat, 4 Apr 2015 16:48:18 +0000 (09:48 -0700)]
SSSE3 assembly implementation of 8x8 Hadamard transform

It uses about 10% fewer CPU cycles than the SSE2 intrinsic
implementation.

Change-Id: I91017c0c068679a214b98cdd4cff3a6facfb7499

Jingning Han [Fri, 3 Apr 2015 18:33:24 +0000 (11:33 -0700)]
Enable Hadamard transform based cost estimate for all block sizes

This commit turns on the Hadamard transform based rate distortion
estimate for all block sizes in RTC coding mode. It conditionally
skips the rate distortion estimation if the all-zero block flag is
set. No significant encoding speed change is observed. The
compression performance of speed -6 is improved by 1.7% over using
it only for block sizes of 32x32 and below.

Change-Id: I768145e6f05c737b05b5b5f1ee674e929532cafb

Yunqing Wang [Sat, 4 Apr 2015 00:09:59 +0000 (17:09 -0700)]
Merge "Fix the scaling factor in UV skipping test"

James Zern [Fri, 3 Apr 2015 23:39:17 +0000 (16:39 -0700)]
vp8_regular_quantize_b_sse2: remove dead init

Change-Id: Ide5eefadbb3cab38743a69f744a003abb37a6506

James Zern [Fri, 3 Apr 2015 23:37:53 +0000 (16:37 -0700)]
vp8cx_pick_filter_level*: remove dead inits

Change-Id: I28026b86d03264b9f4e2fc8ac1d3c74aa3954208

James Zern [Fri, 3 Apr 2015 23:36:14 +0000 (16:36 -0700)]
vp8_decode_frame: remove dead increment

Change-Id: Ie9a6fac02796d24e6f4a15416d0b4c19010547df

James Zern [Fri, 3 Apr 2015 23:21:43 +0000 (16:21 -0700)]
rdopt: remove dead stores

Change-Id: Ia8a20c6751cc6d63c60bb00b99c78faca1e61051

James Zern [Fri, 3 Apr 2015 23:18:23 +0000 (16:18 -0700)]
find_next_key_frame: remove dead init & store

Change-Id: I8c7f5b9718ef14e4397a263aa9f52a9edcf7d1cd

James Zern [Fri, 3 Apr 2015 23:15:51 +0000 (16:15 -0700)]
multiframe_quality_enhance_block: remove dead stores

Change-Id: I33ca9cddfdd54c3d8a23c1cb978986a537a20bf2

James Zern [Fri, 3 Apr 2015 23:08:37 +0000 (16:08 -0700)]
vp8_print_modes_and_motion_vectors: remove dead stores

Change-Id: I438cbf4970fa2220fb73b0b41a29e654836d4e3b

Yunqing Wang [Fri, 3 Apr 2015 22:35:31 +0000 (15:35 -0700)]
Fix the scaling factor in UV skipping test

The threshold scaling factor was calculated incorrectly using the
partition size "bsize". Thanks to Yaowu for pointing it out. This is
now fixed, and no speed change was seen.

Change-Id: If7a5564456f0f68d6957df3bd2d1876bbb8dfd27

Ed Baker [Thu, 2 Apr 2015 20:57:13 +0000 (13:57 -0700)]
Test loopfilters with count=2

The following functions use the count parameter to either loop or select
dedicated paths:
vp9_lpf_horizontal_16_c
vp9_lpf_horizontal_16_sse2
vp9_lpf_horizontal_16_avx2
vp9_lpf_horizontal_16_neon
vp9_highbd_lpf_horizontal_16_c
vp9_highbd_lpf_horizontal_16_sse2

Change-Id: I7abfd2cb30baa292b4ebe11c847968481103c037

James Zern [Fri, 3 Apr 2015 21:57:52 +0000 (14:57 -0700)]
Merge "vp9: enable sse4 sad functions"

Johann [Fri, 3 Apr 2015 20:43:20 +0000 (13:43 -0700)]
Merge "Merge branch 'indianrunnerduck'"

Johann [Fri, 3 Apr 2015 20:42:49 +0000 (13:42 -0700)]
Merge "Remove AltiVec flag"

Johann [Fri, 3 Apr 2015 19:53:16 +0000 (12:53 -0700)]
Merge branch 'indianrunnerduck'

* indianrunnerduck:
  Update CHANGELOG for v1.4.0 (Indian Runner Duck) release
  vp9: fix high-bitdepth NEON build
  Fix use of scaling in joint motion search
  Prepare Release Candidate for libvpx v1.4.0
  vp8cx.h: vpx/vpx_encoder.h -> ./vpx_encoder.h

Change-Id: Ib2eee50f02e12623aae478871cb9150604bb2ac2

tag: v1.4.0
Johann [Thu, 2 Apr 2015 22:44:01 +0000 (15:44 -0700)]
Update CHANGELOG for v1.4.0 (Indian Runner Duck) release

Change-Id: Id31b4da40c484aefc1236f5cc568171a9fd12af2

Jingning Han [Fri, 3 Apr 2015 18:24:28 +0000 (11:24 -0700)]
Merge "Tune SSSE3 assembly implementation to improve quantization speed"

Johann [Fri, 3 Apr 2015 17:33:20 +0000 (10:33 -0700)]
Remove AltiVec flag

Change-Id: I560b1a954a5089a8af69952b8084408c6a420b96

Jingning Han [Fri, 3 Apr 2015 16:20:25 +0000 (09:20 -0700)]
Account for eob cost in the RTC mode decision process

This commit accounts for the transform block end of coefficient flag
cost in the RTC mode decision process. This allows a more precise
rate estimate. It also enables the model for block sizes up to 32x32.
The test sequences show a speed penalty of about 3% - 5% for speed -6.
The average compression performance improvement for speed -6 is
1.58% in PSNR. The compression gains for hard clips like jimredvga,
mmmoving, and tacomascmv at low bit-rate range are 1.8%, 2.1%, and
3.2%, respectively.

Change-Id: Ic2ae211888e25a93979eac56b274c6e5ebcc21fb

hkuang [Fri, 3 Apr 2015 04:35:12 +0000 (21:35 -0700)]
Merge "Fix error of "Left shift of negative value -1"."

Yunqing Wang [Fri, 3 Apr 2015 01:22:08 +0000 (18:22 -0700)]
Merge "Set vbp thresholds for aq3 boosted blocks"

Yaowu Xu [Thu, 2 Apr 2015 22:31:06 +0000 (15:31 -0700)]
move ref_frame_cost computations into a function

Change-Id: Iebf2ad2b1db7e2874788fda8d55e67f4cb1149f1

hkuang [Thu, 2 Apr 2015 19:08:35 +0000 (12:08 -0700)]
Fix error of "Left shift of negative value -1".

Change-Id: Ia4f3feb20df0e89cc51b02def858e12e927312cc

Marco [Fri, 3 Apr 2015 00:33:01 +0000 (17:33 -0700)]
Merge "Code cleanup: put (8x8/4x4)fill_variance into separate function."

Johann [Thu, 2 Apr 2015 23:26:48 +0000 (16:26 -0700)]
Merge "Remove PPC build support"

Yunqing Wang [Thu, 2 Apr 2015 20:57:22 +0000 (13:57 -0700)]
Set vbp thresholds for aq3 boosted blocks

The vbp thresholds are set separately for boosted/non-boosted
superblocks according to their segment_id. This way we don't
have to force the boosted blocks to split to 32x32.

Speed 6 RTC set borg test result showed some quality gains.
Overall PSNR: +0.199%; Avg PSNR: +0.245%; SSIM: +0.802%.
No speed change was observed.

Change-Id: I37c6643a3e2da59c4b7dc10ebe05abc8abf4026a

James Zern [Wed, 1 Apr 2015 00:45:25 +0000 (17:45 -0700)]
vp9: fix high-bitdepth NEON build

remove incorrect specializations in rtcd and update a configuration
check in partial_idct_test.cc

(cherry picked from commit 8845334097d1cb03fc8d7a91c86f02235afc8da6)

Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0

Adrian Grange [Tue, 24 Mar 2015 15:55:35 +0000 (08:55 -0700)]
Fix use of scaling in joint motion search

To enable us to use the scale-invariant motion estimation
code during mode selection, each of the reference
buffers is scaled to match the size of the frame
being encoded.

This fix ensures that a unit scaling factor is used in
this case rather than the one calculated assuming that
the reference frame is not scaled.

(cherry picked from commit 8d8d7bfde5d311bb7d4ff4e921a9dbaa8f389af5)

Change-Id: Id9a5c85dad402f3a7cc7ea9f30f204edad080ebf

Marco [Thu, 2 Apr 2015 19:17:51 +0000 (12:17 -0700)]
Code cleanup: put (8x8/4x4)fill_variance into separate function.

Code cleanup, no change in behavior.

Change-Id: I043b889f8f0b3afb49de0da00873bc3499ebda24

Marco [Thu, 2 Apr 2015 16:50:44 +0000 (09:50 -0700)]
Small fix to segment check in pickmode.

Change-Id: Id5fd82a504def2523292466fbaad5dade9424c72

Johann [Thu, 2 Apr 2015 16:13:57 +0000 (09:13 -0700)]
Remove PPC build support

There are no functional optimizations for AltiVec/PPC

Change-Id: I6877a7a9739017fe36fc769be22679c65ea99976

James Zern [Thu, 2 Apr 2015 06:36:56 +0000 (23:36 -0700)]
Merge "vp9/neon: skip some files in high-bitdepth build"

James Zern [Thu, 2 Apr 2015 06:36:17 +0000 (23:36 -0700)]
Merge "vp9: fix high-bitdepth NEON build"

Yaowu Xu [Thu, 2 Apr 2015 01:24:39 +0000 (18:24 -0700)]
Merge "use MAX_MB_PLANE consistently"

hkuang [Thu, 2 Apr 2015 00:11:35 +0000 (17:11 -0700)]
Remove unnecessary set postproc flags.

Change-Id: Iaf136969bc368a890f9671647576ee9d54eef03b

hkuang [Thu, 2 Apr 2015 00:07:58 +0000 (17:07 -0700)]
Merge "Fix 10-bit video decode failure with --frame-parallel mode."

Jingning Han [Wed, 1 Apr 2015 22:46:22 +0000 (15:46 -0700)]
Merge "Reduce required xmm number by one in block_error_fp"

Jingning Han [Wed, 1 Apr 2015 22:22:39 +0000 (15:22 -0700)]
Tune SSSE3 assembly implementation to improve quantization speed

Change-Id: If0ca8b25b4800d4336e6cbc97194cd9b01c5b5a3

Yaowu Xu [Wed, 1 Apr 2015 21:50:15 +0000 (14:50 -0700)]
use MAX_MB_PLANE consistently

Change-Id: Ic416a7f145001a88f5a7f70dde9b1edbc1b69381

Yaowu Xu [Wed, 1 Apr 2015 22:06:55 +0000 (15:06 -0700)]
Merge "Simplify bsize calculation"

Jingning Han [Wed, 1 Apr 2015 21:55:18 +0000 (14:55 -0700)]
Merge "Optimize quantization simd implementation"

Jingning Han [Wed, 1 Apr 2015 21:55:03 +0000 (14:55 -0700)]
Merge "Simplify effective src_diff address computation"

Jingning Han [Wed, 1 Apr 2015 21:54:24 +0000 (14:54 -0700)]
Merge "Refactor block_yrd function for RTC coding mode"

Yaowu Xu [Wed, 1 Apr 2015 19:15:06 +0000 (12:15 -0700)]
Simplify bsize calculation

Change-Id: Ibc514684def9914c66f04cb7931f773e2b79c168

Jingning Han [Wed, 1 Apr 2015 01:04:45 +0000 (18:04 -0700)]
Simplify effective src_diff address computation

Remove redundant offset calculation for effective src_diff address.

Change-Id: I4aab241a36abcef7fd8adf74aed5e12b8b88e0ef

Jingning Han [Wed, 1 Apr 2015 16:19:13 +0000 (09:19 -0700)]
Reduce required xmm number by one in block_error_fp

Use 6 xmms instead of 8.

Change-Id: If976ad85d09191d2fb0565399d690f2869dbbcc7

Jingning Han [Wed, 1 Apr 2015 00:46:41 +0000 (17:46 -0700)]
Refactor block_yrd function for RTC coding mode

This commit separates Hadamard transform/quantization operations
from rate and distortion computation in block_yrd. This allows one
to skip SATD computation when all transform blocks are quantized
to zero. It also uses a new block error function that skips
repeated computation of sum of squared residuals. It reduces the
CPU cycles spent on block error calculation in block_yrd by 40%.

Change-Id: I726acb2454b44af1c3bd95385abecac209959b10

Jingning Han [Wed, 1 Apr 2015 18:39:36 +0000 (11:39 -0700)]
Optimize quantization simd implementation

This commit allows the quantizer to compare the AC coefficients to
the quantization step size to determine if further multiplication
operations are needed. It makes the quantization process 20% faster
without coding statistics change.

Change-Id: I735aaf6a9c0874c82175bb565b20e131464db64a
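
The idea in this commit, comparing a coefficient with the step size before doing any multiplication, can be sketched in scalar C. This is an assumption-laden illustration: the real change is in SIMD code and works on vectors of coefficients, and the fixed-point reciprocal quantizer, the absence of rounding, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Reference path: always multiply. inv_step is a hypothetical 16-bit
 * fixed-point reciprocal of the step size, floor(65536 / step). */
static int16_t quantize_one(int16_t coeff, int32_t inv_step) {
  const int32_t mag = abs(coeff);
  const int32_t q = (int32_t)(((int64_t)mag * inv_step) >> 16);
  return (int16_t)(coeff < 0 ? -q : q);
}

/* Quantizes n coefficients and returns how many multiplications were done.
 * A coefficient whose magnitude is below the step size would quantize to
 * zero anyway, so the multiplication is skipped for it. */
static int quantize_ac(const int16_t *coeff, int16_t *qcoeff, int n,
                       int16_t step) {
  assert(coeff != NULL && qcoeff != NULL && n > 0 && step > 0);
  const int32_t inv_step = 65536 / step;
  int multiplies = 0;
  for (int i = 0; i < n; ++i) {
    if (abs(coeff[i]) < step) {
      qcoeff[i] = 0; /* early out: same result as the multiply path */
    } else {
      qcoeff[i] = quantize_one(coeff[i], inv_step);
      ++multiplies;
    }
  }
  return multiplies;
}
```

Because the skipped coefficients produce the same zero output as the multiply path, the shortcut changes only the amount of work, which matches the commit's note that coding statistics are unchanged.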

Yunqing Wang [Wed, 25 Mar 2015 21:19:29 +0000 (14:19 -0700)]
Enhance the transform skipping decision-making in non-rd mode

For large partition blocks (block_size > 32x32), the variance
calculation is modified so that every 8x8 block's variance
is stored during the calculation, which is used in the
following transform skipping test. Also, the variance for
every tx block is calculated. The skipping test checks all tx
blocks in the partition, and sets the skip flag only if all tx
blocks are skippable. If the skip flag of Y plane is 1, a
quick evaluation is done on UV planes. If the current partition
block is skippable in YUV planes, the mode search checks fewer
inter modes and doesn't check intra modes.

The rtc set borg test (at speed 6) showed that:
Overall psnr: -0.527%; Avg psnr: -0.510%; ssim: -0.573%.
Average single-thread speedup on rtc set was 3.5%.
For 720p clips, more speedups were seen.
gipsrecmotion: 13%
gipsrestat: 12%
vidyo: 5 - 9%
dark: 15%
niklas: 6%

Change-Id: I8d8ebec0cb305f1de016516400bf007c3042666e

hkuang [Fri, 27 Mar 2015 00:37:17 +0000 (17:37 -0700)]
Fix 10-bit video decode failure with --frame-parallel mode.

Also add unit test to avoid same error in the future.

Issue:981

Change-Id: Iaf9889d8d5514cfdff1ea098e6ae133be56d501f

James Zern [Wed, 1 Apr 2015 03:57:25 +0000 (20:57 -0700)]
vp9: enable sse4 sad functions

sse4 isn't set by configure or used in rtcd; correct the sad entries to
use sse4_1 without changing the signatures for now.
this was done in vp8 post-vp9 branch.

Change-Id: Ia9f1fff9f2476fdfa53ed022778dd2f708caa271

James Zern [Wed, 1 Apr 2015 01:06:21 +0000 (18:06 -0700)]
vp9/neon: skip some files in high-bitdepth build

exclude files that only contain functions for non-high-bitdepth builds.
this removes some warnings related to missing prototypes

Change-Id: Ic6642998c46a7b808c6c53b2f9c34bcd4d037abe

James Zern [Wed, 1 Apr 2015 00:45:25 +0000 (17:45 -0700)]
vp9: fix high-bitdepth NEON build

remove incorrect specializations in rtcd and update a configuration
check in partial_idct_test.cc

Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0

Yunqing Wang [Tue, 31 Mar 2015 23:33:30 +0000 (16:33 -0700)]
Merge "Rename vbp thresholds"

Vignesh Venkatasubramanian [Tue, 31 Mar 2015 23:11:56 +0000 (16:11 -0700)]
Merge "webmdec: Fix read_frame return value for calls after EOS"

Marco [Tue, 31 Mar 2015 22:22:14 +0000 (15:22 -0700)]
Merge "Set postproc flags in decoder_get_frame."

Yunqing Wang [Tue, 31 Mar 2015 22:14:44 +0000 (15:14 -0700)]
Rename vbp thresholds

Code refactoring

Change-Id: I410fcce1bc6d95c62c474445f4c97ea8469f1e79

Jingning Han [Tue, 31 Mar 2015 21:24:26 +0000 (14:24 -0700)]
Merge "Tuning SATD rate calculation for speed"

Jingning Han [Tue, 31 Mar 2015 19:16:47 +0000 (12:16 -0700)]
Merge "Use aligned copy in 8x8 Hadamard transform SSE2"

Jingning Han [Tue, 31 Mar 2015 19:16:36 +0000 (12:16 -0700)]
Merge "Allow block skip coding option in RTC mode"

Jingning Han [Tue, 31 Mar 2015 19:16:27 +0000 (12:16 -0700)]
Merge "Fix 8x8 Hadamard SSE2 implementation"

Alex Converse [Tue, 31 Mar 2015 18:52:56 +0000 (11:52 -0700)]
Merge "VP9E_GET_ACTIVE_MAP API function."

Jingning Han [Tue, 31 Mar 2015 17:57:41 +0000 (10:57 -0700)]
Tuning SATD rate calculation for speed

This commit allows the encoder to check the eob per transform
block to decide how to compute the SATD rate cost. If the entire
block is quantized to zero, there is no need to add anything; if
only the DC coefficient is non-zero, add its absolute value;
otherwise, sum over the block. This reduces the CPU cycles spent
on vp9_satd_sse2 to one third.

Change-Id: I0d56044b793b286efc0875fafc0b8bf2d2047e32
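
The three cases described in this commit can be written out as a small dispatch on the end-of-block position. The sketch assumes that `eob` counts coefficients up to and including the last non-zero one in scan order, with the DC coefficient first; the function names are hypothetical and the full sum stands in for the SIMD SATD routine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Full path: sum of absolute values over all n coefficients. */
static int satd_full(const int16_t *coeff, int n) {
  int sum = 0;
  for (int i = 0; i < n; ++i) sum += abs(coeff[i]);
  return sum;
}

/* Picks the cheapest way to get the SATD rate term for one transform block:
 *   eob == 0: everything quantized to zero, nothing to add;
 *   eob == 1: only the DC coefficient is non-zero, add its magnitude;
 *   otherwise: sum over the whole block. */
static int satd_rate_by_eob(const int16_t *coeff, int n, int eob) {
  assert(coeff != NULL && n > 0 && eob >= 0 && eob <= n);
  if (eob == 0) return 0;
  if (eob == 1) return abs(coeff[0]);
  return satd_full(coeff, n);
}
```

In all three cases the result equals what the full sum would return, so the dispatch only avoids work when the block is empty or DC-only.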

hui su [Tue, 31 Mar 2015 17:51:10 +0000 (10:51 -0700)]
Merge "Move vp9_coef_con_tree to common/"

Jingning Han [Tue, 31 Mar 2015 17:08:29 +0000 (10:08 -0700)]
Use aligned copy in 8x8 Hadamard transform SSE2

This reduces the 8x8 Hadamard transform cycles by 20%.

Change-Id: If34c5e02f3afa42244c6efabe121f7cf5d2df41b

Jingning Han [Tue, 31 Mar 2015 16:55:41 +0000 (09:55 -0700)]
Merge "Enable 16x16 Hadamard transform in SATD based mode decision"