Parag Salasakar [Tue, 4 Aug 2015 04:29:45 +0000 (04:29 +0000)]
Merge "mips msa vpx subtract test added"
Jingning Han [Tue, 4 Aug 2015 04:16:22 +0000 (04:16 +0000)]
Merge "Move inverse transform dspr2 functions from vp9 to vpx_dsp"
James Zern [Tue, 4 Aug 2015 02:34:32 +0000 (02:34 +0000)]
Merge "update libyuv to r1456"
James Zern [Tue, 4 Aug 2015 02:30:41 +0000 (02:30 +0000)]
Merge "add vp9_vector_var_neon"
James Zern [Mon, 3 Aug 2015 23:22:21 +0000 (16:22 -0700)]
gen_msvs_proj.sh: avoid asm object name collisions
fixes link under vs9; this is the same change as:
dbf6e3f gen_msvs_vcxproj.sh: Avoid object name collisions.
Change-Id: I2a188c9024d0605e60e5e03ddcef1a25e7e53585
Jingning Han [Mon, 3 Aug 2015 17:50:32 +0000 (10:50 -0700)]
Move inverse transform dspr2 functions from vp9 to vpx_dsp
Change-Id: Ia9cf7c31cab4ba3dd6b9bb668c4b3e84bd55cf69
Jingning Han [Mon, 3 Aug 2015 18:58:37 +0000 (18:58 +0000)]
Merge "Add common_dspr2.c file to vpx_dsp/mips"
Yaowu Xu [Mon, 3 Aug 2015 18:43:55 +0000 (18:43 +0000)]
Merge "Correct the allocation size for ssim_vars"
Jingning Han [Mon, 3 Aug 2015 17:17:45 +0000 (10:17 -0700)]
Add common_dspr2.c file to vpx_dsp/mips
Move the declaration of the commonly referenced variable to
vpx_dsp/mips/common_dspr2.c.
Change-Id: Ia51287b02e2ac5cfae0fba98c721f0810618f28e
Yaowu Xu [Mon, 3 Aug 2015 17:46:12 +0000 (10:46 -0700)]
Correct the allocation size for ssim_vars
ssim_vars is used to accumulate stats over 4x4 pixel blocks. This
commit changes the allocation size to be based on mi_rows and mi_cols
to avoid out-of-bounds memory access for larger videos. The hard-coded
720x480 only works for image sizes up to 2880x1920.
Change-Id: Id9d07f3f777385b448ac88a6034b7472e4cf3c79
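The sizing change in the ssim_vars commit above can be sketched as follows. This is a minimal illustration, not the libvpx code: the helper names are hypothetical, and it assumes each mode-info (mi) unit covers 8x8 pixels, so every mi unit holds four 4x4 blocks.

```c
#include <stddef.h>

/* Hypothetical helper: number of 4x4 blocks in a frame whose size is given
 * in mi units. Assumption: one mi unit covers 8x8 pixels (four 4x4 blocks). */
size_t ssim_block_count(int mi_rows, int mi_cols) {
  return (size_t)4 * (size_t)mi_rows * (size_t)mi_cols;
}

/* Hypothetical check: does a frame fit in the old fixed table of
 * 720x480 entries, where each entry covers one 4x4 pixel block? */
int fits_fixed_table(int width, int height) {
  const size_t blocks_w = (size_t)(width + 3) / 4;
  const size_t blocks_h = (size_t)(height + 3) / 4;
  return blocks_w * blocks_h <= (size_t)720 * 480;
}
```

Under these assumptions a 2880x1920 frame exactly fills the fixed table, while a 3840x2160 frame needs more entries than the table holds, which is the out-of-bounds case the commit describes.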
Jingning Han [Mon, 3 Aug 2015 16:54:13 +0000 (09:54 -0700)]
Remove vpx_ prefix from the dspr2 file name in vpx_dsp/mips
Make it consistent with other formats.
Change-Id: I28f0d05ff7c5bf2b815989b3f1bd6c6b25608677
Jingning Han [Mon, 3 Aug 2015 16:03:09 +0000 (16:03 +0000)]
Merge "Add vpx_dsp_rtcd.h to inv_txfm_sse2.c"
Jingning Han [Mon, 3 Aug 2015 16:03:02 +0000 (16:03 +0000)]
Merge "Remove vp9_common.h from idct16x16_neon.c"
Parag Salasakar [Mon, 3 Aug 2015 04:12:11 +0000 (09:42 +0530)]
mips msa vpx subtract test added
Change-Id: I0f0827a665c4d3039d3e5f09fa8c75c8f2bb2bab
Jingning Han [Fri, 31 Jul 2015 19:57:52 +0000 (12:57 -0700)]
Add _dspr2 to local function names
It avoids symbol conflicts between function names of various
implementation versions.
Change-Id: Iad79ebcb8e289457801812a7745c8380b5b06a46
Jingning Han [Mon, 3 Aug 2015 03:18:39 +0000 (03:18 +0000)]
Merge "Factor out mips/msa inverse transform implementations"
Jingning Han [Sun, 2 Aug 2015 21:56:09 +0000 (21:56 +0000)]
Merge "Add x86inc flag guard to inv_txfm_sse2.asm"
Jingning Han [Sun, 2 Aug 2015 15:22:06 +0000 (08:22 -0700)]
Remove vp9_common.h from idct16x16_neon.c
Change-Id: I3df35a99900ef8ce549d315866849a10db1a4c7b
Jingning Han [Sun, 2 Aug 2015 15:43:13 +0000 (08:43 -0700)]
Add x86inc flag guard to inv_txfm_sse2.asm
Fix the VS build failure.
Change-Id: I4fb9d1c83980c4b52d5a848a9cb02ec72493dccb
Jingning Han [Sun, 2 Aug 2015 15:24:56 +0000 (08:24 -0700)]
Add vpx_dsp_rtcd.h to inv_txfm_sse2.c
Change-Id: Ibab434fb4bd6da02dba087582ed74811f555c3ed
James Zern [Sat, 1 Aug 2015 18:45:49 +0000 (11:45 -0700)]
vpx_convolve_copy_sse2: fix win64
xmm6-7 need to be stored
Change-Id: I6c51559598d335946ec91be6246b49589c63b724
Jingning Han [Fri, 31 Jul 2015 18:15:55 +0000 (11:15 -0700)]
Factor out mips/msa inverse transform implementations
Move mips/msa inverse transform implementations from vp9 folder to
vpx_dsp.
Change-Id: Ic4cf3f05247c3c63db7b532a0e5000017a962391
Jingning Han [Sat, 1 Aug 2015 16:20:43 +0000 (16:20 +0000)]
Merge "Use precise header files in inverse transform msa implementations"
Jingning Han [Sat, 1 Aug 2015 16:20:24 +0000 (16:20 +0000)]
Merge "Factor inverse transform functions into vpx_dsp"
Parag Salasakar [Sat, 1 Aug 2015 02:12:20 +0000 (02:12 +0000)]
Merge "mips msa vp8 temporal filter optimization"
James Zern [Fri, 24 Jul 2015 23:54:51 +0000 (16:54 -0700)]
update libyuv to r1456
picks up build warning fixes for visual studio 2015
Change-Id: Idea85fa70d1aeb2a46ea355b87fe41ec5b2b9520
Jingning Han [Sat, 1 Aug 2015 01:01:37 +0000 (01:01 +0000)]
Merge "Add dynamic range notes to vp9_vector_var_c"
James Zern [Fri, 31 Jul 2015 02:46:55 +0000 (19:46 -0700)]
add vp9_vector_var_neon
~50-60% faster depending on the width
Change-Id: I9d007cfa10b9aaa2169c8c009d95522df6123a92
Aℓex Converse [Fri, 31 Jul 2015 23:51:01 +0000 (23:51 +0000)]
Merge "Turn off simple_model_rd_from_var at speed 4."
Jingning Han [Fri, 31 Jul 2015 23:41:51 +0000 (16:41 -0700)]
Add dynamic range notes to vp9_vector_var_c
Change-Id: If536ad31046ecd9e2ecd9c21f52f8192c8153ad7
Jingning Han [Fri, 31 Jul 2015 17:53:25 +0000 (10:53 -0700)]
Use precise header files in inverse transform msa implementations
Change-Id: Ie8a79d9e2837842c3f60776b661cd42782b108d5
James Zern [Fri, 31 Jul 2015 23:22:34 +0000 (23:22 +0000)]
Merge "VP9_COPY_CONVOLVE_SSE2 optimization"
Jingning Han [Fri, 31 Jul 2015 01:53:18 +0000 (18:53 -0700)]
Factor inverse transform functions into vpx_dsp
This commit moves the inverse transform module functions from the vp9
folder to the vpx_dsp folder. The hybrid transform wrapper functions stay
in the vp9 folder, since they involve codec-specific data structures.
Change-Id: Ib066367c953d3d024c73ba65157bbd70a95c9ef8
Alex Converse [Fri, 31 Jul 2015 22:50:17 +0000 (15:50 -0700)]
Turn off simple_model_rd_from_var at speed 4.
This got erroneously changed during the refactor. This fixes
SvcTest.TwoPassEncode2TemporalLayersWithMultipleFrameContextsAndTiles.
Change-Id: Ifa5ab0e098396c5e2d10478db87df256eadfa4c7
James Zern [Fri, 31 Jul 2015 22:22:48 +0000 (22:22 +0000)]
Merge changes Iecdbbc34,I8b4db93f
* changes:
Android.mk: fix *_rtcd.h deps for armeabi-v7a
Android.mk: add a dep on vpx_config.asm for x86_64
Scott LaVarnway [Thu, 30 Jul 2015 12:02:04 +0000 (05:02 -0700)]
VP9_COPY_CONVOLVE_SSE2 optimization
This function suffers from a couple of problems on small cores (tablets):
- The load of the next iteration is blocked by the store of the previous iteration
- 4k aliasing (between future stores and older loads)
- Current small-core machines are in-order, so the store will spin the rehabQ until the load is finished
Fixed by:
- prefetching 2 lines ahead
- unrolling the copy to 2 rows of the block
- pre-loading all xmm registers before the loop, with the final stores after the loop
The function is sped up by:
copy_convolve_sse2 64x64 - 16%
copy_convolve_sse2 32x32 - 52%
copy_convolve_sse2 16x16 - 6%
copy_convolve_sse2 8x8 - 2.5%
copy_convolve_sse2 4x4 - 2.7%
Credit goes to Tom Craver (tom.r.craver@intel.com) and Ilya Albrekht (ilya.albrekht@intel.com)
Change-Id: I63d3428799c50b2bf7b5677c8268bacb9fc29671
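The loop shape described in the VP9_COPY_CONVOLVE_SSE2 commit above (prefetch two lines ahead, handle two rows per iteration, issue both loads before the stores) can be sketched in portable C. This is only an illustration under stated assumptions: the actual change is in SSE2 assembly, the function name and PREFETCH macro are hypothetical, and a C compiler is free to reorder the copies, so this shows the structure rather than the measured gains.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Prefetch hint where the compiler supports it; a no-op elsewhere. */
#if defined(__GNUC__)
#define PREFETCH(p) __builtin_prefetch(p)
#else
#define PREFETCH(p) ((void)0)
#endif

/* Hypothetical sketch: copy a w x h block two rows at a time.
 * Assumes w <= 64 and h is even. */
void copy_block_2rows(const uint8_t *src, ptrdiff_t src_stride,
                      uint8_t *dst, ptrdiff_t dst_stride, int w, int h) {
  uint8_t row0[64], row1[64];
  int y;
  for (y = 0; y < h; y += 2) {
    /* Hint the next pair of rows, only while they are inside the block. */
    if (y + 3 < h) {
      PREFETCH(src + 2 * src_stride);
      PREFETCH(src + 3 * src_stride);
    }
    /* Both loads first ... */
    memcpy(row0, src, (size_t)w);
    memcpy(row1, src + src_stride, (size_t)w);
    /* ... then both stores, so a load never waits on the previous store. */
    memcpy(dst, row0, (size_t)w);
    memcpy(dst + dst_stride, row1, (size_t)w);
    src += 2 * src_stride;
    dst += 2 * dst_stride;
  }
}
```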
Jingning Han [Fri, 31 Jul 2015 21:29:50 +0000 (21:29 +0000)]
Merge "Fix compiler warning in mips/dspr2"
Aℓex Converse [Fri, 31 Jul 2015 21:19:11 +0000 (21:19 +0000)]
Merge "Compute skippable inside the block_rd_txfm loop."
Jingning Han [Fri, 31 Jul 2015 19:33:35 +0000 (12:33 -0700)]
Fix compiler warning in mips/dspr2
This commit fixes the mixed declaration and definition warning when
mips/dspr2 is turned on.
Change-Id: I633d6fe42368b9ac35b106786ebac6969ad53552
Aℓex Converse [Fri, 31 Jul 2015 19:05:54 +0000 (19:05 +0000)]
Merge changes Ic1ce346a,Ic0b4e92c
* changes:
Simplify model_rd_for_sb HBD ifdefs
Simplify dist_block HBD ifdefs
Alex Converse [Fri, 31 Jul 2015 00:39:23 +0000 (17:39 -0700)]
Compute skippable inside the block_rd_txfm loop.
Change-Id: Iaa43aeeb7a2074495e00cdb83bb551c3f13d3ed2
Zoe Liu [Fri, 31 Jul 2015 18:23:19 +0000 (18:23 +0000)]
Merge "Refactor mips/dspr2 on convolution."
Zoe Liu [Fri, 31 Jul 2015 18:20:14 +0000 (18:20 +0000)]
Merge "Code refactor on InterpKernel"
Alex Converse [Fri, 31 Jul 2015 17:56:11 +0000 (10:56 -0700)]
Simplify model_rd_for_sb HBD ifdefs
Change-Id: Ic1ce346a053800ae3b2d77178f46e6a388357f6d
Alex Converse [Fri, 31 Jul 2015 00:52:55 +0000 (17:52 -0700)]
Simplify dist_block HBD ifdefs
Change-Id: Ic0b4e92cbaf813bcca8a8e9052c936c2e025e114
Aℓex Converse [Fri, 31 Jul 2015 17:59:22 +0000 (17:59 +0000)]
Merge "Short circuit rate_block in block_rd_txfm."
Zoe Liu [Tue, 28 Jul 2015 17:52:24 +0000 (10:52 -0700)]
Refactor mips/dspr2 on convolution.
Change-Id: If59a39d5a92c261537342726f94bb7f7f26dfff3
Zoe Liu [Wed, 22 Jul 2015 17:40:42 +0000 (10:40 -0700)]
Code refactor on InterpKernel
In essence, it refactors the code for both the interpolation
filtering and the convolution. This change includes moving
all the files as well as renaming the code from the vp9_
prefix to the vpx_ prefix accordingly, for the underlying architectures:
(1) x86;
(2) arm/neon; and
(3) mips/msa.
The work on mips/dspr2 will be done in a separate change list.
Change-Id: Ic3ce7fb7f81210db7628b373c73553db68793c46
Alex Converse [Thu, 30 Jul 2015 18:52:28 +0000 (11:52 -0700)]
Give skip_txfm constants names.
This uses a define instead of an enum to keep byte packing.
Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792
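The define-versus-enum trade-off mentioned in the skip_txfm commit above can be illustrated with a small sketch. The names and the four-entry layout are illustrative assumptions, not taken from the source: the point is only that named constants stored in uint8_t stay one byte each, while an enum-typed array is usually wider.

```c
#include <stdint.h>

/* Illustrative named constants; the values only need to fit in one byte. */
#define SKIP_TXFM_NONE 0
#define SKIP_TXFM_AC_DC 1
#define SKIP_TXFM_AC_ONLY 2

/* An equivalent enum type is typically as wide as int on common ABIs. */
typedef enum { E_NONE, E_AC_DC, E_AC_ONLY } skip_txfm_enum;

/* Hypothetical layout: one flag per entry, stored one byte each. */
uint8_t skip_txfm_bytes[4] = { SKIP_TXFM_NONE, SKIP_TXFM_AC_DC,
                               SKIP_TXFM_AC_ONLY, SKIP_TXFM_NONE };

/* The same four flags as an enum array, for size comparison. */
skip_txfm_enum skip_txfm_enums[4];
```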
Alex Converse [Thu, 30 Jul 2015 22:33:47 +0000 (15:33 -0700)]
Short circuit rate_block in block_rd_txfm.
Don't run rate_block (cost_coeffs) if distortion alone is enough to
surpass best_rd.
This decreases 2nd pass runtime on HD at speed 2 by about 2%. There is
zero effect on output if tx_cache is removed.
Change-Id: Ia3b1cc77bfbe6ee988c395fde06c0eb92940b784
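The short circuit in the commit above can be sketched as below. This is a simplified illustration, not the encoder's code: the function names are hypothetical, and the cost model rd = dist + lambda * rate is an assumption standing in for the encoder's own RD cost computation.

```c
#include <stdint.h>

/* Hypothetical stand-in for an expensive rate estimate such as cost_coeffs. */
typedef int (*rate_fn)(void *ctx);

/* Returns the RD cost, or INT64_MAX when the block cannot beat best_rd.
 * Assumed cost model: rd = dist + lambda * rate, with rate >= 0. */
int64_t block_rd_short_circuit(int64_t dist, int64_t lambda, int64_t best_rd,
                               rate_fn rate, void *ctx) {
  /* Distortion alone already surpasses best_rd: skip the rate estimate. */
  if (dist > best_rd) return INT64_MAX;
  return dist + lambda * (int64_t)rate(ctx);
}

/* Example rate function that counts how often it is invoked. */
int counting_rate(void *ctx) {
  ++*(int *)ctx;
  return 10;
}
```

Because the rate term is non-negative, skipping it when distortion already exceeds best_rd cannot change which candidate wins; it only avoids the expensive call.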
Parag Salasakar [Fri, 31 Jul 2015 06:33:19 +0000 (12:03 +0530)]
mips msa vp8 temporal filter optimization
average improvement ~2x-3x
Change-Id: I05593bed583234dc7809aaec6cab82773a29505d
Parag Salasakar [Fri, 31 Jul 2015 03:59:10 +0000 (09:29 +0530)]
mips msa vp8 block subtract optimization
average improvement ~2x-3x
Change-Id: I30abf4c92cddcc9e87b7a40d4106076e1ec701c2
Parag Salasakar [Fri, 31 Jul 2015 03:44:03 +0000 (03:44 +0000)]
Merge "mips msa vp8 quantize optimization"
Yunqing Wang [Wed, 29 Jul 2015 20:37:41 +0000 (13:37 -0700)]
Remove tx cache and speed up tx size selection
1. The RD scores obtained during the tx size selection were stored in the
tx cache and used to help make the tx decision for the following frames.
This is no longer used in the VP9 encoder. The related decision-making
code from 1.5+ years ago was recovered, and borg tests didn't show any
quality gain. This patch removes it to lower the complexity.
2. An optimization was done after the above refactoring. If the tx_mode
is not TX_MODE_SELECT, we only need to test the chosen tx size instead
of all possible tx sizes. This gave a 1.5% average speed gain at speed 2,
and a 1% average speed gain at speed 3.
Change-Id: Id8cd650e066a8cef33829d8c15388a8138adc78c
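Point 2 of the commit above can be sketched as a choice of search range. The enum, struct, and function names here are local stand-ins for illustration and are not the encoder's actual definitions.

```c
/* Local stand-ins for the encoder's transform sizes (illustrative). */
enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32 };

/* Inclusive range of transform sizes to evaluate. */
typedef struct {
  int first;
  int last;
} tx_range;

/* Hypothetical helper: in select mode every size up to max_tx is tested;
 * otherwise only the size already chosen by the tx mode is tested. */
tx_range tx_sizes_to_test(int is_select_mode, int chosen_tx, int max_tx) {
  tx_range r;
  if (is_select_mode) {
    r.first = TX_4X4;
    r.last = max_tx;
  } else {
    r.first = chosen_tx;
    r.last = chosen_tx;
  }
  return r;
}

/* Number of candidate sizes in a range. */
int count_tx_candidates(tx_range r) { return r.last - r.first + 1; }
```

With these stand-ins, select mode with a 32x32 maximum evaluates four sizes, while any fixed mode evaluates one, which is where the reported speed gain comes from.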
Aℓex Converse [Thu, 30 Jul 2015 23:04:28 +0000 (23:04 +0000)]
Merge "Convert simple_model_rd_from_var from a speed check to a speed feature."
Hui Su [Thu, 30 Jul 2015 22:29:35 +0000 (22:29 +0000)]
Merge "Exclude vpx intra prediction functions in vp8-only build"
Alex Converse [Thu, 30 Jul 2015 20:52:02 +0000 (13:52 -0700)]
Convert simple_model_rd_from_var from a speed check to a speed feature.
Change-Id: I8877025e172fff29bc4e270790211463b676b4d7
hui su [Thu, 30 Jul 2015 02:43:29 +0000 (19:43 -0700)]
Exclude vpx intra prediction functions in vp8-only build
Currently vp8 is not using the intra prediction functions in vpx_dsp.
Change-Id: I1522b5f5cb12a81999fb126cf7c62c70259e7a52
James Zern [Wed, 29 Jul 2015 23:07:05 +0000 (16:07 -0700)]
Android.mk: fix *_rtcd.h deps for armeabi-v7a
strip '.neon' so *_rtcd.h depends on the correct file
Change-Id: Iecdbbc34c9ce5c6d0a4b466332d52f4e6a0cb128
Parag Salasakar [Thu, 30 Jul 2015 05:26:40 +0000 (10:56 +0530)]
mips msa vp8 quantize optimization
average improvement ~2x-3x
Change-Id: I6fc37191bf9cb5a67e1af9787d0d27659c17bdba
Alex Converse [Thu, 30 Jul 2015 19:36:57 +0000 (12:36 -0700)]
Cleanup rdcost_block_args
Change-Id: I9d613cbe9e76b5dd15e935878ef9fd04521690ba
Aℓex Converse [Thu, 30 Jul 2015 19:37:28 +0000 (19:37 +0000)]
Merge "Clean up some casts."
Jingning Han [Thu, 30 Jul 2015 05:37:53 +0000 (05:37 +0000)]
Merge "Cosmetics - Fix header file order in unit tests"
Jingning Han [Wed, 29 Jul 2015 21:51:36 +0000 (14:51 -0700)]
Cosmetics - Fix header file order in unit tests
Change-Id: I9582a8d74990125b71e8fe620f7f3f2585a30798
Parag Salasakar [Thu, 30 Jul 2015 02:44:42 +0000 (08:14 +0530)]
mips msa vp8 fdct optimization
average improvement ~2x-4x
Change-Id: Id0bc600440f7ef53348f585ebadb1ac6869e9a00
Parag Salasakar [Thu, 30 Jul 2015 02:34:06 +0000 (02:34 +0000)]
Merge "mips msa vp8 post proc optimization"
Aℓex Converse [Thu, 30 Jul 2015 01:06:08 +0000 (01:06 +0000)]
Merge "Comment zcoeff_blk."
Alex Converse [Wed, 29 Jul 2015 23:53:33 +0000 (16:53 -0700)]
Comment zcoeff_blk.
Change-Id: Iefc2eb78e71472ecf51802ec59ff32caef4bd0f4
Yaowu Xu [Wed, 29 Jul 2015 23:27:34 +0000 (16:27 -0700)]
Add const to a variable declaration
Change-Id: Idf572c22a87098665f5179dc3212a06d9a85a342
Yaowu Xu [Wed, 29 Jul 2015 23:23:14 +0000 (16:23 -0700)]
Fix a typo
Change-Id: Ief8eea8fe6bef139d1e94f8d6dfac5a44efe785d
James Zern [Wed, 29 Jul 2015 22:38:43 +0000 (15:38 -0700)]
Android.mk: add a dep on vpx_config.asm for x86_64
Change-Id: I8b4db93f754607aab64351745bd102ab238d9501
Alex Converse [Fri, 24 Jul 2015 21:59:03 +0000 (14:59 -0700)]
Clean up some casts.
Change-Id: I264ca534cd7d4755906e20aea47e7a2523bca611
Parag Salasakar [Wed, 29 Jul 2015 04:10:26 +0000 (09:40 +0530)]
mips msa vp8 post proc optimization
average improvement ~2x-4x
Change-Id: I93abc15389649c169bb8b69127c0b95407d34692
Parag Salasakar [Wed, 29 Jul 2015 04:00:41 +0000 (04:00 +0000)]
Merge "mips msa vp8 filter by weight optimization"
James Zern [Wed, 29 Jul 2015 00:47:09 +0000 (00:47 +0000)]
Merge "add vp9_block_error_fp_neon"
Hui Su [Wed, 29 Jul 2015 00:38:48 +0000 (00:38 +0000)]
Merge "Replace prefix vp9_ with vpx_ for intra prediction functions"
Jingning Han [Wed, 29 Jul 2015 00:07:31 +0000 (00:07 +0000)]
Merge "Replace vp9_ prefix in 2D-DCT functions with vpx_"
Jingning Han [Wed, 29 Jul 2015 00:06:56 +0000 (00:06 +0000)]
Merge "Remove vp9_dct.h file"
Jingning Han [Wed, 29 Jul 2015 00:06:37 +0000 (00:06 +0000)]
Merge "Move DC only forward 2D-DCT functions to vpx_dsp"
Jingning Han [Tue, 28 Jul 2015 22:57:40 +0000 (15:57 -0700)]
Replace vp9_ prefix in 2D-DCT functions with vpx_
Clean up the forward 2D-DCT function names in vpx_dsp.
Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25
Jingning Han [Tue, 28 Jul 2015 22:25:05 +0000 (15:25 -0700)]
Remove vp9_dct.h file
The forward 32x32 2D-DCT functions now live in the vpx_dsp folder, so
the vp9_dct.h file is effectively unused.
Change-Id: Ie7946b6fdd784b8e91496242337bc9002c75c281
Aℓex Converse [Tue, 28 Jul 2015 21:59:33 +0000 (21:59 +0000)]
Merge "Remove branch in inner loop of foreach_transformed_block_in_plane()"
Aℓex Converse [Tue, 28 Jul 2015 21:59:02 +0000 (21:59 +0000)]
Merge changes If196d9e5,Ib669d572
* changes:
Simplify is_skippable to point straight to eobs.
Don't initialize extra context tree buffers for 4x8 and 8x4.
Jingning Han [Tue, 28 Jul 2015 21:42:25 +0000 (14:42 -0700)]
Move DC only forward 2D-DCT functions to vpx_dsp
This completes the forward transform functions layout refactoring.
Change-Id: I996fb0fb795f41e2040f7b21db985774098aedbd
James Zern [Tue, 28 Jul 2015 21:50:35 +0000 (21:50 +0000)]
Merge "build/make/Android.mk: support TARGET_ARCH_ABI=x86_64"
Johann [Tue, 28 Jul 2015 21:00:32 +0000 (14:00 -0700)]
Don't use 'h' for functions using x86inc.asm
In newer versions of x86inc.asm, 'h' is used as a modifier for register
names.
Change-Id: Ie5b9dd2f91ecdc8f6f18b2701b6dc23042b604e4
Hui Su [Tue, 28 Jul 2015 20:41:01 +0000 (20:41 +0000)]
Merge "Move intra prediction functions from vp9/common/ to vpx_dsp/"
Jingning Han [Tue, 28 Jul 2015 20:36:59 +0000 (20:36 +0000)]
Merge "Factor 32x32 fwd DCT to vpx_dsp folder"
Jingning Han [Mon, 27 Jul 2015 23:05:15 +0000 (16:05 -0700)]
Factor 32x32 fwd DCT to vpx_dsp folder
Move the 32x32 2D-DCT implementations from vp9/ to vpx_dsp/.
Change-Id: Id3980696f8b69906ff7a59ff9fb2b9013d60047d
Frank Galligan [Tue, 28 Jul 2015 16:05:41 +0000 (09:05 -0700)]
Fix dspr2 build.
Change-Id: I18895c29d6db872d033b3874de9dcd9501d0c10e
James Zern [Sat, 25 Jul 2015 19:27:56 +0000 (12:27 -0700)]
add vp9_block_error_fp_neon
~60-70% faster depending on the block size
Change-Id: Icdbaa9977a91a63cbcc6ead0cf19d5a2af7f27e1
Parag Salasakar [Tue, 28 Jul 2015 02:46:34 +0000 (08:16 +0530)]
mips msa vp8 filter by weight optimization
average improvement ~3x-5x
Change-Id: Ia808ae56b118e0e1b293901447aa5a0f597b405b
Parag Salasakar [Tue, 28 Jul 2015 02:27:31 +0000 (02:27 +0000)]
Merge "mips msa vp8 recon intra optimization"
Yunqing Wang [Tue, 28 Jul 2015 01:25:14 +0000 (01:25 +0000)]
Merge "Remove tx_select_threshes"
Jingning Han [Mon, 27 Jul 2015 21:56:43 +0000 (14:56 -0700)]
Move forward dct sse2 header file to vpx_dsp
Change-Id: Iba03852ce778c956200818e3473cfb2b48cf8d8e
hui su [Tue, 21 Jul 2015 16:39:46 +0000 (09:39 -0700)]
Replace prefix vp9_ with vpx_ for intra prediction functions
Change-Id: I8ae6fb586f8d5d018ace228df11714f82b085076
hui su [Sun, 19 Jul 2015 22:02:56 +0000 (15:02 -0700)]
Move intra prediction functions from vp9/common/ to vpx_dsp/
Change-Id: I64edc26cf4aab050c83f2d393df6250628ad43b8
Jingning Han [Mon, 27 Jul 2015 19:05:33 +0000 (12:05 -0700)]
Use common coefficient definition in neon idct implementations
Replace the duplicate coefficient definition in neon implementations
of inverse transform with those from vpx_dsp/txfm_common.h
Change-Id: I4cd9bd9569ab1793dfdbb6f16d80bcb581599f0d
Yunqing Wang [Mon, 27 Jul 2015 18:58:39 +0000 (11:58 -0700)]
Remove tx_select_threshes
Removed unused tx_select_threshes and tx_select_diff.
Change-Id: I5e9e7ad170056efe14b5f071e94d0c5a36e4a34c
Jingning Han [Fri, 24 Jul 2015 17:27:23 +0000 (10:27 -0700)]
Replace vp9_idct.h for precise dependency
This commit replaces vp9_idct.h with txfm_common.h in many SIMD
implementation files for precise file dependency.
Change-Id: If73dd726bb16537e7494f28538b0a169810f9756