Johann [Wed, 15 Apr 2015 19:12:58 +0000 (12:12 -0700)]
Merge "Allow specifying a different prefix in x86inc.asm"
Johann [Tue, 14 Apr 2015 19:25:14 +0000 (15:25 -0400)]
Allow specifying a different prefix in x86inc.asm
Currently the prefix is forced to vp9 for any function using
x86inc.asm.
Change-Id: Icbca57ce68a52e743bdd7e9be86cfe8353f274c1
Johann [Wed, 15 Apr 2015 13:36:45 +0000 (06:36 -0700)]
Merge "Remove unused scaleopt.cpp"
Yunqing Wang [Tue, 14 Apr 2015 22:12:40 +0000 (15:12 -0700)]
Revert "Force_split on 16x16 blocks in variance partition."
This reverts commit
eb8c667570aa83134c7db0690de9dbdde4d90291.
The patch caused a mismatch when using multiple threads.
Change-Id: Icd646340af25b5d91e32f03ed3ea212e00e3e0be
Johann [Tue, 14 Apr 2015 20:59:30 +0000 (16:59 -0400)]
Remove unused scaleopt.cpp
Change-Id: Ibaeede61c128c73809332b9a853cd62b8d6d5325
Marco [Tue, 14 Apr 2015 16:44:58 +0000 (09:44 -0700)]
Merge "Force_split on 16x16 blocks in variance partition."
hkuang [Mon, 13 Apr 2015 21:33:43 +0000 (14:33 -0700)]
Merge "Remove unnecessary set postproc flags."
Marco [Wed, 18 Mar 2015 20:55:19 +0000 (13:55 -0700)]
Force_split on 16x16 blocks in variance partition.
Force split on 16x16 block (to 8x8) based on the minmax over the 8x8 sub-blocks.
Also increase variance threshold for 32x32, and add exit condition in choose_partition
(with very safe threshold) based on sad used to select reference frame.
Some visual improvement near moving boundaries.
Average gain in psnr/ssim: ~0.6%, some clips go up ~1 or 2%.
Encoding time increase (due to more 8x8 blocks) from ~1-4%, depending on clip.
Change-Id: I4759bb181251ac41517cd45e326ce2997dadb577
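The force-split test described above can be sketched roughly as follows. This is only one plausible reading of "minmax over the 8x8 sub-blocks", not the code from the patch: the function names, the use of absolute source/reference differences, and the threshold parameter are all assumptions made for illustration.

```python
def minmax_spread(src16, ref16):
    """Spread of per-quadrant difference ranges for a 16x16 block.

    Assumption: for each 8x8 quadrant we take (max - min) of the absolute
    source/reference differences, then return largest minus smallest range.
    """
    ranges = []
    for by in (0, 8):
        for bx in (0, 8):
            diffs = [abs(src16[y][x] - ref16[y][x])
                     for y in range(by, by + 8)
                     for x in range(bx, bx + 8)]
            ranges.append(max(diffs) - min(diffs))
    return max(ranges) - min(ranges)


def should_force_split(src16, ref16, threshold):
    """Hypothetical decision: split 16x16 into 8x8 when the spread is large."""
    return minmax_spread(src16, ref16) > threshold
```

A block whose four quadrants behave alike yields a small spread and is left unsplit; a block with one quadrant that differs strongly from the others (for example, on a moving boundary) yields a large spread.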
Parag Salasakar [Fri, 10 Apr 2015 04:50:15 +0000 (21:50 -0700)]
Merge "mips msa vp9 common headers added"
Jingning Han [Thu, 9 Apr 2015 21:45:19 +0000 (14:45 -0700)]
Merge "Remove get_nonrd_var_based_fixed_partition function"
Jingning Han [Thu, 9 Apr 2015 21:45:11 +0000 (14:45 -0700)]
Merge "Compute prediction filter type cost only when needed"
Jingning Han [Thu, 9 Apr 2015 18:16:11 +0000 (11:16 -0700)]
Merge "SSSE3 assembly implementation of 8x8 Hadamard transform"
hkuang [Thu, 9 Apr 2015 17:08:06 +0000 (10:08 -0700)]
Merge "Remove unnecessary mv clamp with on demand border extension."
Jingning Han [Thu, 9 Apr 2015 16:49:09 +0000 (09:49 -0700)]
Remove get_nonrd_var_based_fixed_partition function
This function has been replaced by other approaches and is not
in use now.
Change-Id: I387f45b5607d202539e482468ccc70e6c0f9341f
Parag Salasakar [Wed, 25 Mar 2015 09:34:48 +0000 (15:04 +0530)]
mips msa vp9 common headers added
Change-Id: Ia31ada59172eb1818e1eb91009f83cbb1f581223
James Zern [Thu, 9 Apr 2015 04:01:34 +0000 (21:01 -0700)]
Merge "vpxdec.sh: fix vp9_webm_less_than_50_frames w/valgrind"
hkuang [Mon, 6 Apr 2015 17:17:14 +0000 (10:17 -0700)]
Remove unnecessary mv clamp with on demand border extension.
Change-Id: Ia2956f06f409b9b0ca8320ca4c1ea5680e938402
Frank Galligan [Sun, 22 Mar 2015 20:41:13 +0000 (13:41 -0700)]
Refactor dec_build_inter_predictors
Refactor the loops in dec_build_inter_predictors to try and decrease
the number of instructions. Limited testing saw about 1% perf
increase on x86 and about 0.67% perf increase on Arm.
Change-Id: I69cfe6335bb562fbaaebf43fb3f5c5a2a28882a2
James Zern [Wed, 8 Apr 2015 18:45:04 +0000 (11:45 -0700)]
vpxdec.sh: fix vp9_webm_less_than_50_frames w/valgrind
add a check for the status line to awk and better report failure given
the program output will be lost in this case
Change-Id: I1348a80108c81099d609f2e2227dd2c31bd8cd54
Debargha Mukherjee [Wed, 8 Apr 2015 17:48:17 +0000 (10:48 -0700)]
Merge "Improve accuracy of rate control in CQ mode"
James Zern [Wed, 8 Apr 2015 03:57:21 +0000 (20:57 -0700)]
Merge "vp9_full_search_sadx[38]: align sad arrays"
Yaowu Xu [Tue, 7 Apr 2015 23:29:51 +0000 (16:29 -0700)]
Merge "Optimize the checking for transform skipping"
Yaowu Xu [Tue, 7 Apr 2015 23:29:45 +0000 (16:29 -0700)]
Merge "move ref_frame_cost computations into a function"
Debargha Mukherjee [Tue, 7 Apr 2015 23:15:11 +0000 (16:15 -0700)]
Improve accuracy of rate control in CQ mode
Modifies a special handling that improves rate control accuracy in
the constrained quality mode, when the undershoot and overshoot
limits are set tighter.
Change-Id: If62103f0ef3ed1cac92807400678c93da50cf046
Yaowu Xu [Tue, 7 Apr 2015 23:08:25 +0000 (16:08 -0700)]
Merge "Test loopfilters with count=2"
James Zern [Tue, 7 Apr 2015 21:30:17 +0000 (14:30 -0700)]
vp9_full_search_sadx[38]: align sad arrays
the sse4 code expects 16-byte aligned arrays; vp8 already had a similar
change applied:
b2aa401 Align SAD output array to be 16-byte aligned
Change-Id: I5e902035e5a87e23309e151113f3c0d4a8372226
Jingning Han [Tue, 7 Apr 2015 19:51:27 +0000 (12:51 -0700)]
Merge "Enable Hadamard transform based cost estimate for all block sizes"
Jingning Han [Tue, 7 Apr 2015 19:50:30 +0000 (12:50 -0700)]
Merge "Account for eob cost in the RTC mode decision process"
Jingning Han [Tue, 7 Apr 2015 19:39:54 +0000 (12:39 -0700)]
Compute prediction filter type cost only when needed
Skip redundant prediction filter type cost in filter search loop,
if the rate value will be reset in Hadamard transform based rate
distortion estimate.
Change-Id: Ie5221f4bc8da9461c449df367251aeeac52c6e5d
Vignesh Venkatasubramanian [Tue, 7 Apr 2015 18:53:43 +0000 (11:53 -0700)]
Merge "webmdec: Fix for reaching eof in webm_guess_framerate"
Vignesh Venkatasubramanian [Fri, 3 Apr 2015 22:45:14 +0000 (15:45 -0700)]
webmdec: Fix for reaching eof in webm_guess_framerate
Reset the reached_eos flag in webm_guess_framerate in case it ends
up consuming the entire file. Also adding a vpxdec shell test to
verify this behavior.
Change-Id: I371eebd2105231dc0f60e65da1f71b233ad14be5
Yaowu Xu [Tue, 7 Apr 2015 00:53:55 +0000 (17:53 -0700)]
Optimize the checking for transform skipping
If U is not skippable, then do not perform the check on V.
Change-Id: Iba5e8362bd42390197f373c44388a426a4404549
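The short-circuit described above amounts to the following minimal sketch. The function name and the callable argument are hypothetical and stand in for whatever per-plane skip test the encoder actually runs.

```python
def uv_planes_skippable(is_plane_skippable):
    """Return True only if both chroma planes are skippable.

    is_plane_skippable is a hypothetical callable taking "u" or "v".
    The V check is never run when U already fails.
    """
    if not is_plane_skippable("u"):
        return False
    return is_plane_skippable("v")
```

The result is identical to checking both planes unconditionally; the only difference is that the V test is not evaluated when it cannot change the outcome.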
Jim Bankoski [Tue, 7 Apr 2015 00:05:35 +0000 (17:05 -0700)]
Merge changes Ide5eefad,I28026b86,Ie9a6fac0,Ia8a20c67,I8c7f5b97,I33ca9cdd,I438cbf49
* changes:
vp8_regular_quantize_b_sse2: remove dead init
vp8cx_pick_filter_level*: remove dead inits
vp8_decode_frame: remove dead increment
rdopt: remove dead stores
find_next_key_frame: remove dead init & store
multiframe_quality_enhance_block: remove dead stores
vp8_print_modes_and_motion_vectors: remove dead stores
Jingning Han [Sat, 4 Apr 2015 16:48:18 +0000 (09:48 -0700)]
SSSE3 assembly implementation of 8x8 Hadamard transform
It uses about 10% fewer CPU cycles than the SSE2 intrinsic
implementation.
Change-Id: I91017c0c068679a214b98cdd4cff3a6facfb7499
Jingning Han [Fri, 3 Apr 2015 18:33:24 +0000 (11:33 -0700)]
Enable Hadamard transform based cost estimate for all block sizes
This commit turns on the Hadamard transform based rate distortion
estimate for all block sizes in RTC coding mode. It conditionally
skips the rate distortion estimation if the all-zero block flag is
set. No significant encoding speed change is observed. The
compression performance of speed -6 is improved by 1.7% over using
it only for block sizes of 32x32 and below.
Change-Id: I768145e6f05c737b05b5b5f1ee674e929532cafb
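For readers unfamiliar with the transform behind this estimate, here is a minimal reference sketch of an unnormalized 2-D Hadamard transform and the sum of absolute transformed differences (SATD) computed from it. It uses natural (Hadamard) ordering and no scaling, so it is not expected to match the coefficient order or normalization of the libvpx routines; all function names are illustrative.

```python
def hadamard_1d(v):
    """Unnormalized fast Walsh-Hadamard transform of a length-2^k list."""
    out = list(v)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def hadamard_2d(block):
    """Apply the 1-D transform to every row, then to every column."""
    rows = [hadamard_1d(r) for r in block]
    cols = [hadamard_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def satd(block):
    """Sum of absolute values of the transformed residual block."""
    return sum(abs(c) for row in hadamard_2d(block) for c in row)
```

For an 8x8 residual block of all ones, the energy collapses into the single DC coefficient, so satd returns 64.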
Yunqing Wang [Sat, 4 Apr 2015 00:09:59 +0000 (17:09 -0700)]
Merge "Fix the scaling factor in UV skipping test"
James Zern [Fri, 3 Apr 2015 23:39:17 +0000 (16:39 -0700)]
vp8_regular_quantize_b_sse2: remove dead init
Change-Id: Ide5eefadbb3cab38743a69f744a003abb37a6506
James Zern [Fri, 3 Apr 2015 23:37:53 +0000 (16:37 -0700)]
vp8cx_pick_filter_level*: remove dead inits
Change-Id: I28026b86d03264b9f4e2fc8ac1d3c74aa3954208
James Zern [Fri, 3 Apr 2015 23:36:14 +0000 (16:36 -0700)]
vp8_decode_frame: remove dead increment
Change-Id: Ie9a6fac02796d24e6f4a15416d0b4c19010547df
James Zern [Fri, 3 Apr 2015 23:21:43 +0000 (16:21 -0700)]
rdopt: remove dead stores
Change-Id: Ia8a20c6751cc6d63c60bb00b99c78faca1e61051
James Zern [Fri, 3 Apr 2015 23:18:23 +0000 (16:18 -0700)]
find_next_key_frame: remove dead init & store
Change-Id: I8c7f5b9718ef14e4397a263aa9f52a9edcf7d1cd
James Zern [Fri, 3 Apr 2015 23:15:51 +0000 (16:15 -0700)]
multiframe_quality_enhance_block: remove dead stores
Change-Id: I33ca9cddfdd54c3d8a23c1cb978986a537a20bf2
James Zern [Fri, 3 Apr 2015 23:08:37 +0000 (16:08 -0700)]
vp8_print_modes_and_motion_vectors: remove dead stores
Change-Id: I438cbf4970fa2220fb73b0b41a29e654836d4e3b
Yunqing Wang [Fri, 3 Apr 2015 22:35:31 +0000 (15:35 -0700)]
Fix the scaling factor in UV skipping test
The threshold scaling factor was calculated incorrectly using the
partition size "bsize". Thanks to Yaowu for pointing it out. This is
now fixed, and no speed change was seen.
Change-Id: If7a5564456f0f68d6957df3bd2d1876bbb8dfd27
Ed Baker [Thu, 2 Apr 2015 20:57:13 +0000 (13:57 -0700)]
Test loopfilters with count=2
The following functions use the count parameter to either loop or select
dedicated paths:
vp9_lpf_horizontal_16_c
vp9_lpf_horizontal_16_sse2
vp9_lpf_horizontal_16_avx2
vp9_lpf_horizontal_16_neon
vp9_highbd_lpf_horizontal_16_c
vp9_highbd_lpf_horizontal_16_sse2
Change-Id: I7abfd2cb30baa292b4ebe11c847968481103c037
James Zern [Fri, 3 Apr 2015 21:57:52 +0000 (14:57 -0700)]
Merge "vp9: enable sse4 sad functions"
Johann [Fri, 3 Apr 2015 20:43:20 +0000 (13:43 -0700)]
Merge "Merge branch 'indianrunnerduck'"
Johann [Fri, 3 Apr 2015 20:42:49 +0000 (13:42 -0700)]
Merge "Remove AltiVec flag"
Johann [Fri, 3 Apr 2015 19:53:16 +0000 (12:53 -0700)]
Merge branch 'indianrunnerduck'
* indianrunnerduck:
Update CHANGELOG for v1.4.0 (Indian Runner Duck) release
vp9: fix high-bitdepth NEON build
Fix use of scaling in joint motion search
Prepare Release Candidate for libvpx v1.4.0
vp8cx.h: vpx/vpx_encoder.h -> ./vpx_encoder.h
Change-Id: Ib2eee50f02e12623aae478871cb9150604bb2ac2
Johann [Thu, 2 Apr 2015 22:44:01 +0000 (15:44 -0700)]
Update CHANGELOG for v1.4.0 (Indian Runner Duck) release
Change-Id: Id31b4da40c484aefc1236f5cc568171a9fd12af2
Jingning Han [Fri, 3 Apr 2015 18:24:28 +0000 (11:24 -0700)]
Merge "Tune SSSE3 assembly implementation to improve quantization speed"
Johann [Fri, 3 Apr 2015 17:33:20 +0000 (10:33 -0700)]
Remove AltiVec flag
Change-Id: I560b1a954a5089a8af69952b8084408c6a420b96
Jingning Han [Fri, 3 Apr 2015 16:20:25 +0000 (09:20 -0700)]
Account for eob cost in the RTC mode decision process
This commit accounts for the transform block end of coefficient flag
cost in the RTC mode decision process. This allows a more precise
rate estimate. It also turns on the model for block sizes up to 32x32.
The test sequences show about a 3% - 5% speed penalty for speed -6.
The average compression performance improvement for speed -6 is
1.58% in PSNR. The compression gains for hard clips like jimredvga,
mmmoving, and tacomascmv at low bit-rate range are 1.8%, 2.1%, and
3.2%, respectively.
Change-Id: Ic2ae211888e25a93979eac56b274c6e5ebcc21fb
hkuang [Fri, 3 Apr 2015 04:35:12 +0000 (21:35 -0700)]
Merge "Fix error of "Left shift of negative value -1"."
Yunqing Wang [Fri, 3 Apr 2015 01:22:08 +0000 (18:22 -0700)]
Merge "Set vbp thresholds for aq3 boosted blocks"
Yaowu Xu [Thu, 2 Apr 2015 22:31:06 +0000 (15:31 -0700)]
move ref_frame_cost computations into a function
Change-Id: Iebf2ad2b1db7e2874788fda8d55e67f4cb1149f1
hkuang [Thu, 2 Apr 2015 19:08:35 +0000 (12:08 -0700)]
Fix error of "Left shift of negative value -1".
Change-Id: Ia4f3feb20df0e89cc51b02def858e12e927312cc
Marco [Fri, 3 Apr 2015 00:33:01 +0000 (17:33 -0700)]
Merge "Code cleanup: put (8x8/4x4)fill_variance into separate function."
Johann [Thu, 2 Apr 2015 23:26:48 +0000 (16:26 -0700)]
Merge "Remove PPC build support"
Yunqing Wang [Thu, 2 Apr 2015 20:57:22 +0000 (13:57 -0700)]
Set vbp thresholds for aq3 boosted blocks
The vbp thresholds are set separately for boosted/non-boosted
superblocks according to their segment_id. This way we don't
have to force the boosted blocks to split to 32x32.
Speed 6 RTC set borg test result showed some quality gains.
Overall PSNR: +0.199%; Avg PSNR: +0.245%; SSIM: +0.802%.
No speed change was observed.
Change-Id: I37c6643a3e2da59c4b7dc10ebe05abc8abf4026a
James Zern [Wed, 1 Apr 2015 00:45:25 +0000 (17:45 -0700)]
vp9: fix high-bitdepth NEON build
remove incorrect specializations in rtcd and update a configuration
check in partial_idct_test.cc
(cherry picked from commit
8845334097d1cb03fc8d7a91c86f02235afc8da6)
Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0
Adrian Grange [Tue, 24 Mar 2015 15:55:35 +0000 (08:55 -0700)]
Fix use of scaling in joint motion search
To enable us to use the scale-invariant motion estimation
code during mode selection, each of the reference
buffers is scaled to match the size of the frame
being encoded.
This fix ensures that a unit scaling factor is used in
this case rather than the one calculated assuming that
the reference frame is not scaled.
(cherry picked from commit
8d8d7bfde5d311bb7d4ff4e921a9dbaa8f389af5)
Change-Id: Id9a5c85dad402f3a7cc7ea9f30f204edad080ebf
Marco [Thu, 2 Apr 2015 19:17:51 +0000 (12:17 -0700)]
Code cleanup: put (8x8/4x4)fill_variance into separate function.
Code cleanup, no change in behavior.
Change-Id: I043b889f8f0b3afb49de0da00873bc3499ebda24
Marco [Thu, 2 Apr 2015 16:50:44 +0000 (09:50 -0700)]
Small fix to segment check in pickmode.
Change-Id: Id5fd82a504def2523292466fbaad5dade9424c72
Johann [Thu, 2 Apr 2015 16:13:57 +0000 (09:13 -0700)]
Remove PPC build support
There are no functional optimizations for AltiVec/PPC
Change-Id: I6877a7a9739017fe36fc769be22679c65ea99976
James Zern [Thu, 2 Apr 2015 06:36:56 +0000 (23:36 -0700)]
Merge "vp9/neon: skip some files in high-bitdepth build"
James Zern [Thu, 2 Apr 2015 06:36:17 +0000 (23:36 -0700)]
Merge "vp9: fix high-bitdepth NEON build"
Yaowu Xu [Thu, 2 Apr 2015 01:24:39 +0000 (18:24 -0700)]
Merge "use MAX_MB_PLANE consistently"
hkuang [Thu, 2 Apr 2015 00:11:35 +0000 (17:11 -0700)]
Remove unnecessary set postproc flags.
Change-Id: Iaf136969bc368a890f9671647576ee9d54eef03b
hkuang [Thu, 2 Apr 2015 00:07:58 +0000 (17:07 -0700)]
Merge "Fix 10-bit video decode failure with --frame-parallel mode."
Jingning Han [Wed, 1 Apr 2015 22:46:22 +0000 (15:46 -0700)]
Merge "Reduce required xmm number by one in block_error_fp"
Jingning Han [Wed, 1 Apr 2015 22:22:39 +0000 (15:22 -0700)]
Tune SSSE3 assembly implementation to improve quantization speed
Change-Id: If0ca8b25b4800d4336e6cbc97194cd9b01c5b5a3
Yaowu Xu [Wed, 1 Apr 2015 21:50:15 +0000 (14:50 -0700)]
use MAX_MB_PLANE consistently
Change-Id: Ic416a7f145001a88f5a7f70dde9b1edbc1b69381
Yaowu Xu [Wed, 1 Apr 2015 22:06:55 +0000 (15:06 -0700)]
Merge "Simplify bsize calculation"
Jingning Han [Wed, 1 Apr 2015 21:55:18 +0000 (14:55 -0700)]
Merge "Optimize quantization simd implementation"
Jingning Han [Wed, 1 Apr 2015 21:55:03 +0000 (14:55 -0700)]
Merge "Simplify effective src_diff address computation"
Jingning Han [Wed, 1 Apr 2015 21:54:24 +0000 (14:54 -0700)]
Merge "Refactor block_yrd function for RTC coding mode"
Yaowu Xu [Wed, 1 Apr 2015 19:15:06 +0000 (12:15 -0700)]
Simplify bsize calculation
Change-Id: Ibc514684def9914c66f04cb7931f773e2b79c168
Jingning Han [Wed, 1 Apr 2015 01:04:45 +0000 (18:04 -0700)]
Simplify effective src_diff address computation
Remove redundant offset calculation for effective src_diff address.
Change-Id: I4aab241a36abcef7fd8adf74aed5e12b8b88e0ef
Jingning Han [Wed, 1 Apr 2015 16:19:13 +0000 (09:19 -0700)]
Reduce required xmm number by one in block_error_fp
Use 6 xmms instead of 8.
Change-Id: If976ad85d09191d2fb0565399d690f2869dbbcc7
Jingning Han [Wed, 1 Apr 2015 00:46:41 +0000 (17:46 -0700)]
Refactor block_yrd function for RTC coding mode
This commit separates Hadamard transform/quantization operations
from rate and distortion computation in block_yrd. This allows one
to skip SATD computation when all transform blocks are quantized
to zero. It also uses a new block error function that skips
repeated computation of sum of squared residuals. It reduces the
CPU cycles spent on block error calculation in block_yrd by 40%.
Change-Id: I726acb2454b44af1c3bd95385abecac209959b10
Jingning Han [Wed, 1 Apr 2015 18:39:36 +0000 (11:39 -0700)]
Optimize quantization simd implementation
This commit allows the quantizer to compare the AC coefficients to
the quantization step size to determine if further multiplication
operations are needed. It makes the quantization process 20% faster
without coding statistics change.
Change-Id: I735aaf6a9c0874c82175bb565b20e131464db64a
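The early-out idea described above can be illustrated with a scalar sketch. The rounding rule, the zero-limit derivation, and the function names are assumptions chosen so that the fast path provably matches the full path; the real SIMD code works on vectors and uses multiplications rather than the integer division shown here.

```python
def quantize_one(c, step):
    """Round-to-nearest scalar quantizer (assumed rounding: step // 2)."""
    mag = (abs(c) + step // 2) // step
    return -mag if c < 0 else mag


def quantize_block(coeffs, dc_step, ac_step):
    """Quantize one block, skipping the AC work when it must yield zeros."""
    out = [0] * len(coeffs)
    out[0] = quantize_one(coeffs[0], dc_step)
    # Any magnitude below this limit rounds to zero under quantize_one.
    ac_zero_limit = ac_step - ac_step // 2
    if all(abs(c) < ac_zero_limit for c in coeffs[1:]):
        return out  # early out: every AC coefficient quantizes to zero
    for i in range(1, len(coeffs)):
        out[i] = quantize_one(coeffs[i], ac_step)
    return out
```

Because the comparison is derived from the same rounding rule as the quantizer, taking the early exit never changes the output, which is consistent with the "no coding statistics change" observation in the commit message.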
Yunqing Wang [Wed, 25 Mar 2015 21:19:29 +0000 (14:19 -0700)]
Enhance the transform skipping decision-making in non-rd mode
For large partition blocks (block_size > 32x32), the variance
calculation is modified so that every 8x8 block's variance
is stored during the calculation, which is used in the
following transform skipping test. Also, the variance for
every tx block is calculated. The skipping test checks all tx
blocks in the partition, and sets the skip flag only if all tx
blocks are skippable. If the skip flag of Y plane is 1, a
quick evaluation is done on UV planes. If the current partition
block is skippable in YUV planes, the mode search checks fewer
inter modes and doesn't check intra modes.
The rtc set borg test (at speed 6) showed that:
Overall psnr: -0.527%; Avg psnr: -0.510%; ssim: -0.573%.
Average single-thread speedup on rtc set was 3.5%.
For 720p clips, more speedups were seen.
gipsrecmotion: 13%
gipsrestat: 12%
vidyo: 5 - 9%
dark: 15%
niklas: 6%
Change-Id: I8d8ebec0cb305f1de016516400bf007c3042666e
hkuang [Fri, 27 Mar 2015 00:37:17 +0000 (17:37 -0700)]
Fix 10-bit video decode failure with --frame-parallel mode.
Also add unit test to avoid same error in the future.
Issue:981
Change-Id: Iaf9889d8d5514cfdff1ea098e6ae133be56d501f
James Zern [Wed, 1 Apr 2015 03:57:25 +0000 (20:57 -0700)]
vp9: enable sse4 sad functions
sse4 isn't set by configure or used in rtcd, correct the sad entries to
use sse4_1 without changing the signatures for now.
this was done in vp8 post-vp9 branch.
Change-Id: Ia9f1fff9f2476fdfa53ed022778dd2f708caa271
James Zern [Wed, 1 Apr 2015 01:06:21 +0000 (18:06 -0700)]
vp9/neon: skip some files in high-bitdepth build
exclude files that only contain functions for non-high-bitdepth builds.
this removes some warnings related to missing prototypes
Change-Id: Ic6642998c46a7b808c6c53b2f9c34bcd4d037abe
James Zern [Wed, 1 Apr 2015 00:45:25 +0000 (17:45 -0700)]
vp9: fix high-bitdepth NEON build
remove incorrect specializations in rtcd and update a configuration
check in partial_idct_test.cc
Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0
Yunqing Wang [Tue, 31 Mar 2015 23:33:30 +0000 (16:33 -0700)]
Merge "Rename vbp thresholds"
Vignesh Venkatasubramanian [Tue, 31 Mar 2015 23:11:56 +0000 (16:11 -0700)]
Merge "webmdec: Fix read_frame return value for calls after EOS"
Marco [Tue, 31 Mar 2015 22:22:14 +0000 (15:22 -0700)]
Merge "Set postproc flags in decoder_get_frame."
Yunqing Wang [Tue, 31 Mar 2015 22:14:44 +0000 (15:14 -0700)]
Rename vbp thresholds
Code refactoring
Change-Id: I410fcce1bc6d95c62c474445f4c97ea8469f1e79
Jingning Han [Tue, 31 Mar 2015 21:24:26 +0000 (14:24 -0700)]
Merge "Tuning SATD rate calculation for speed"
Jingning Han [Tue, 31 Mar 2015 19:16:47 +0000 (12:16 -0700)]
Merge "Use aligned copy in 8x8 Hadamard transform SSE2"
Jingning Han [Tue, 31 Mar 2015 19:16:36 +0000 (12:16 -0700)]
Merge "Allow block skip coding option in RTC mode"
Jingning Han [Tue, 31 Mar 2015 19:16:27 +0000 (12:16 -0700)]
Merge "Fix 8x8 Hadamard SSE2 implementation"
Alex Converse [Tue, 31 Mar 2015 18:52:56 +0000 (11:52 -0700)]
Merge "VP9E_GET_ACTIVE_MAP API function."
Jingning Han [Tue, 31 Mar 2015 17:57:41 +0000 (10:57 -0700)]
Tuning SATD rate calculation for speed
This commit allows the encoder to check the eob per transform
block to decide how to compute the SATD rate cost. If the entire
block is quantized to zero, there is no need to add anything; if
only the DC coefficient is non-zero, add its absolute value;
otherwise, sum over the block. This reduces the CPU cycles spent
on vp9_satd_sse2 to one third.
Change-Id: I0d56044b793b286efc0875fafc0b8bf2d2047e32
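The three cases described above can be written as a small sketch. The function name and the assumption that the coefficient list is laid out with the DC coefficient first are illustrative; the actual encoder operates on its own coefficient buffers.

```python
def satd_rate(coeffs, eob):
    """SATD-style rate proxy for one transform block.

    coeffs: quantized coefficients with DC at index 0 (assumed layout).
    eob: end-of-block position, i.e. the number of coefficients up to and
    including the last non-zero one in scan order.
    """
    if eob == 0:
        return 0                       # whole block quantized to zero
    if eob == 1:
        return abs(coeffs[0])          # only the DC coefficient is non-zero
    return sum(abs(c) for c in coeffs) # general case: sum over the block
```

Only the last branch touches every coefficient, so blocks that quantize to zero or to DC-only avoid the full summation.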
hui su [Tue, 31 Mar 2015 17:51:10 +0000 (10:51 -0700)]
Merge "Move vp9_coef_con_tree to common/"
Jingning Han [Tue, 31 Mar 2015 17:08:29 +0000 (10:08 -0700)]
Use aligned copy in 8x8 Hadamard transform SSE2
This reduces the 8x8 Hadamard transform cycles by 20%.
Change-Id: If34c5e02f3afa42244c6efabe121f7cf5d2df41b
Jingning Han [Tue, 31 Mar 2015 16:55:41 +0000 (09:55 -0700)]
Merge "Enable 16x16 Hadamard transform in SATD based mode decision"