James Zern [Wed, 26 Jun 2013 18:09:08 +0000 (11:09 -0700)]
test/fdct*: fix some warnings
comment out some unused parameters and adjust the format to avoid:
./test/fdct4x4_test.cc|27| warning C4138: '*/' found outside of comment
Change-Id: I60f93b4c3cd7e8d61f0de80019f3404b40161f03
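A minimal sketch of the kind of formatting change described above, assuming the warning came from an unused pointer parameter whose commented-out name directly followed the `*`. The function name and signature are hypothetical and are not taken from the test files.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Before (MSVC reports C4138, because the '*' of the pointer type is
// directly followed by the '/' that opens the comment, so the character
// pair "*/" appears outside of a comment):
//
//   void fdct_stub(const int16_t *in, int16_t *out, uint8_t */*dst*/,
//                  int /*stride*/);
//
// After: the unused names stay commented out, but a space separates the
// '*' from the comment opener.
// NOTE: fdct_stub is a hypothetical name used only for illustration.
void fdct_stub(const int16_t *in, int16_t *out, uint8_t * /*dst*/,
               int /*stride*/) {
  assert(in != NULL && out != NULL);
  for (int i = 0; i < 16; ++i) out[i] = in[i];  // placeholder body
}
```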
Dmitry Kovalev [Wed, 26 Jun 2013 17:23:27 +0000 (10:23 -0700)]
Merge "Using get_plane_block_{width, height} instead of custom code."
John Koleszar [Wed, 26 Jun 2013 05:44:39 +0000 (22:44 -0700)]
Merge "vpxenc: send usage to stderr"
John Koleszar [Wed, 26 Jun 2013 05:44:26 +0000 (22:44 -0700)]
Merge ".gitignore: add gcov files"
John Koleszar [Wed, 26 Jun 2013 05:44:21 +0000 (22:44 -0700)]
Merge "Move vp9_counts_to_nmv_context to encoder"
John Koleszar [Wed, 26 Jun 2013 05:44:16 +0000 (22:44 -0700)]
Merge "Move vp9_full_to_model_counts to encoder"
John Koleszar [Wed, 26 Jun 2013 05:30:50 +0000 (22:30 -0700)]
Merge "make: add libvpx_test_srcs.txt target"
John Koleszar [Wed, 26 Jun 2013 05:29:37 +0000 (22:29 -0700)]
Merge "tests/*source: test file pointer before reading"
John Koleszar [Wed, 26 Jun 2013 05:27:39 +0000 (22:27 -0700)]
Merge "encode_test_driver: check for fatal failures"
Jingning Han [Wed, 26 Jun 2013 02:46:55 +0000 (19:46 -0700)]
Merge "Refactor intra predictor block"
James Zern [Wed, 26 Jun 2013 00:55:28 +0000 (17:55 -0700)]
tests/*source: test file pointer before reading
If the caller did not abort after an ASSERT failure in Begin(),
FillFrame() would segfault.
Change-Id: I2d3f5a0918611bbd081be6f686dea19c56695073
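A minimal sketch of the guard described above, assuming the source keeps a FILE pointer that may still be NULL when the open in Begin() failed. The struct and its fields are hypothetical stand-ins, not the test classes themselves.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical stand-in for a test video source; not the libvpx class.
struct FrameSource {
  FILE *file;             // NULL if opening the input failed
  unsigned char buf[64];  // one tiny "frame"
  bool end_of_input;
};

// Reads one frame. If the file was never opened, report end of input
// instead of handing a NULL pointer to fread().
void FillFrame(FrameSource *src) {
  assert(src != NULL);
  if (src->file == NULL) {
    src->end_of_input = true;
    return;
  }
  const size_t n = fread(src->buf, 1, sizeof(src->buf), src->file);
  src->end_of_input = (n < sizeof(src->buf));
}
```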
James Zern [Wed, 26 Jun 2013 00:53:20 +0000 (17:53 -0700)]
encode_test_driver: check for fatal failures
Make the base test be:
!(fatal || abort_) removing some redundancy in the encode tests
Change-Id: I8ffaf33fcf9a3030b38ea3e8eb94704cdc2fc920
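The base condition above can be sketched without gtest as follows. The struct is a hypothetical stand-in for the test driver, and `fatal_failure` stands in for whatever the real driver uses to detect a fatal failure.

```cpp
#include <assert.h>
#include <stddef.h>

// Hypothetical, gtest-free stand-in for an encoder test driver.
struct EncodeTestSketch {
  bool abort_;         // set by a test that wants to stop early
  bool fatal_failure;  // assumption: mirrors a fatal test failure flag

  // Base loop condition: keep going only while nothing fatal happened
  // and no abort was requested.
  bool Continue() const { return !(fatal_failure || abort_); }
};

// Runs up to max_frames iterations and returns how many completed.
// A fatal failure is simulated on iteration fail_at (use -1 for none).
int RunLoop(EncodeTestSketch *t, int max_frames, int fail_at) {
  assert(t != NULL);
  int done = 0;
  for (int i = 0; i < max_frames && t->Continue(); ++i) {
    if (i == fail_at) t->fatal_failure = true;
    ++done;
  }
  return done;
}
```

Putting both flags into one condition means individual tests only need to set `abort_` for their own reasons; they no longer have to repeat the fatal-failure check.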
Jingning Han [Tue, 25 Jun 2013 23:01:48 +0000 (16:01 -0700)]
Refactor intra predictor block
Remove vp9_intra4x4_predict(). Use the common intra prediction
function for all block sizes.
Change-Id: Ibd19d51dfa3da8bbdfb79ddeb81530b2e2089560
Dmitry Kovalev [Tue, 25 Jun 2013 22:19:18 +0000 (15:19 -0700)]
Renaming "nmv" to "mv".
Change-Id: I8299f55c3b930221e52c2237f2ddea65b94fd33b
Dmitry Kovalev [Tue, 25 Jun 2013 21:11:18 +0000 (14:11 -0700)]
Using get_plane_block_{width, height} instead of custom code.
Change-Id: I453ed11b965e857a14c18ea5c0f4a0a48e7dc0d9
Ronald S. Bultje [Tue, 25 Jun 2013 20:51:18 +0000 (13:51 -0700)]
Merge "Only do metrics on cropped (visible) area of picture."
Ronald S. Bultje [Tue, 25 Jun 2013 20:51:04 +0000 (13:51 -0700)]
Merge "Don't skip right/bottom border pixels in SSIM calculations."
Ronald S. Bultje [Tue, 25 Jun 2013 20:50:53 +0000 (13:50 -0700)]
Merge "Add averaging-SAD functions for 8-point comp-inter motion search."
James Zern [Tue, 25 Jun 2013 20:50:30 +0000 (13:50 -0700)]
make: add libvpx_test_srcs.txt target
same application as libvpx_srcs.txt
Change-Id: I1f096cc3c180d205365663c1aa5533b52561d811
Jingning Han [Tue, 25 Jun 2013 20:17:23 +0000 (13:17 -0700)]
Merge "Cosmetic changes in 4x4 fwd transform unit test"
Jingning Han [Tue, 25 Jun 2013 20:17:05 +0000 (13:17 -0700)]
Merge "Tune the rounding operations in 8x8 ADST/DCT sse2"
James Zern [Tue, 25 Jun 2013 19:57:49 +0000 (12:57 -0700)]
Merge "I420VideoSource: normalize framerate types"
Ronald S. Bultje [Mon, 10 Jun 2013 18:47:22 +0000 (11:47 -0700)]
Only do metrics on cropped (visible) area of picture.
The part where we align it by 8 or 16 is an implementation detail that
shouldn't matter to the outside world.
Change-Id: I9edd6f08b51b31c839c0ea91f767640bccb08d53
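A minimal sketch of measuring error over the visible area only, assuming the frame buffers are padded to an aligned size and addressed through a stride. The function name and parameters are illustrative assumptions.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Sum of squared errors over the visible (cropped) area only.
// `stride` covers any alignment padding, while `width` and `height`
// describe the visible picture, so padded pixels are never read.
int64_t visible_sse(const uint8_t *a, const uint8_t *b, int stride,
                    int width, int height) {
  assert(a != NULL && b != NULL && width <= stride);
  int64_t sse = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      const int d = a[y * stride + x] - b[y * stride + x];
      sse += d * d;
    }
  }
  return sse;
}
```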
Ronald S. Bultje [Mon, 10 Jun 2013 18:36:04 +0000 (11:36 -0700)]
Don't skip right/bottom border pixels in SSIM calculations.
Change-Id: I75acb55ade54bef6ad7703ed5e691581fa2f8fe1
Ronald S. Bultje [Tue, 25 Jun 2013 18:26:49 +0000 (11:26 -0700)]
Add averaging-SAD functions for 8-point comp-inter motion search.
Makes first 50 frames of bus @ 1500kbps encode from 3min22.7 to 3min18.2,
i.e. 2.3% faster. In addition, use the sub_pixel_avg functions to calc
the variance of the averaging predictor. This is slightly suboptimal
because the function is subpixel-position-aware, but it will (at least
for the SSE2 version) not actually use a bilinear filter for a full-pixel
position, thus leading to approximately the same performance compared to
if we implemented an actual average-aware full-pixel variance function.
That gains another 0.3 seconds (i.e. encode time goes to 3min17.4), thus
leading to a total gain of 2.7%.
Change-Id: I3f059d2b04243921868cfed2568d4fa65d7b5acd
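A plain reference sketch of an averaging SAD, assuming the compound predictor is the rounded mean of the reference block and a second predictor, and that the second predictor is stored contiguously. The name and signature are illustrative, not the library's.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

// SAD between the source block and the rounded average of a reference
// block and a second predictor, computed without first writing the
// averaged block to a separate buffer.
// Assumptions: average = (a + b + 1) >> 1, and second_pred has a stride
// equal to `width`.
unsigned int sad_avg(const uint8_t *src, int src_stride,
                     const uint8_t *ref, int ref_stride,
                     const uint8_t *second_pred, int width, int height) {
  assert(src != NULL && ref != NULL && second_pred != NULL);
  unsigned int sad = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      const int avg =
          (ref[y * ref_stride + x] + second_pred[y * width + x] + 1) >> 1;
      sad += abs(src[y * src_stride + x] - avg);
    }
  }
  return sad;
}
```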
James Zern [Tue, 25 Jun 2013 19:56:40 +0000 (12:56 -0700)]
Merge "intrapred_test: add virtual dtor to IntraPredBase"
Jingning Han [Fri, 21 Jun 2013 22:56:24 +0000 (15:56 -0700)]
Tune the rounding operations in 8x8 ADST/DCT sse2
Improve the round-trip precision to meet the unit test settings.
Change-Id: I303febae56b4b990ea3798b8ebed94c0510ecf79
Ronald S. Bultje [Tue, 25 Jun 2013 19:00:41 +0000 (12:00 -0700)]
Merge "Add SAD unit tests for all rectangular sizes."
Ronald S. Bultje [Tue, 25 Jun 2013 19:00:36 +0000 (12:00 -0700)]
Merge "Don't re-allocate comp_pred buffers for each call to comp motion search."
Dmitry Kovalev [Tue, 25 Jun 2013 18:50:55 +0000 (11:50 -0700)]
Merge "Removing unused code."
Jingning Han [Fri, 21 Jun 2013 23:00:44 +0000 (16:00 -0700)]
Cosmetic changes in 4x4 fwd transform unit test
Change-Id: I7a9ea03b92160f1052e56665b19a155211ee241f
Jingning Han [Tue, 25 Jun 2013 18:21:17 +0000 (11:21 -0700)]
Merge "Add 8x8 dct/adst unit tests"
Yaowu Xu [Tue, 25 Jun 2013 17:44:47 +0000 (10:44 -0700)]
Merge "Changed size of mb_mode_context to 8 bits"
Scott LaVarnway [Tue, 25 Jun 2013 17:34:19 +0000 (10:34 -0700)]
Merge "Small mode_info_context cleanup in filter_block_plane"
Dmitry Kovalev [Tue, 25 Jun 2013 00:56:06 +0000 (17:56 -0700)]
Removing unused code.
Removing block index (ib) parameter from get_tx_type_{8x8, 16x16}
functions.
Change-Id: Ia213335aae7a7cb027f97b9cc9b04519840250f1
Dmitry Kovalev [Tue, 25 Jun 2013 17:16:06 +0000 (10:16 -0700)]
Merge "Removing find_seg_id and using vp9_get_pred_mi_segid instead."
Dmitry Kovalev [Tue, 25 Jun 2013 17:15:33 +0000 (10:15 -0700)]
Merge "Transforming scale_mv_component_q4 into scale_mv_q4 function."
Jingning Han [Fri, 21 Jun 2013 18:45:47 +0000 (11:45 -0700)]
Add 8x8 dct/adst unit tests
This commit enables 8x8 DCT and hybrid transform unit tests. It
also tunes the forward hybrid transform rounding operations for
more precise round-trip performance.
Change-Id: If05c1ce59d75d641b9c6c91527d02d3a6ef498c3
Jingning Han [Tue, 25 Jun 2013 16:49:03 +0000 (09:49 -0700)]
Merge "Use aligned buffer operations in 8x8/16x16 2D-DCT"
Scott LaVarnway [Tue, 25 Jun 2013 16:28:50 +0000 (12:28 -0400)]
Small mode_info_context cleanup in filter_block_plane
Remove unnecessary updates to xd->mode_info_context.
Change-Id: I36d2d68ca48366f727548526726b1b5437f62968
John Koleszar [Tue, 25 Jun 2013 16:15:07 +0000 (09:15 -0700)]
vpxenc: send usage to stderr
Thanks to hiiragikei AT gmail.com for the fix.
Change-Id: Iab6c0822593fc5557d86efbb014ff6409ff05b35
Yaowu Xu [Tue, 25 Jun 2013 16:13:22 +0000 (09:13 -0700)]
Merge "Enable sse2 implementation of 8x8 ADST/DCT"
Yaowu Xu [Tue, 25 Jun 2013 16:07:01 +0000 (09:07 -0700)]
Merge "change to enable use_largest_txform feature"
Jingning Han [Tue, 25 Jun 2013 02:52:55 +0000 (19:52 -0700)]
Use aligned buffer operations in 8x8/16x16 2D-DCT
This reduces 16x16 2D-DCT runtime from 865 cycles to 837 cycles.
Change-Id: I137758b81cd127b936175284310e81378db64552
Jingning Han [Thu, 20 Jun 2013 16:00:23 +0000 (09:00 -0700)]
Enable sse2 implementation of 8x8 ADST/DCT
This commit makes use of the butterfly structure to enable the sse2
version implementation of 8x8 ADST/DCT hybrid transform coding.
The runtime of hybrid transform module goes down from 1170 cycles
to 245 cycles. Overall speed-up around 1.5%.
Change-Id: Ic808ffd21ece8a9d0410d8c0243d7b6c28ac3b3f
Yaowu Xu [Mon, 24 Jun 2013 23:43:26 +0000 (16:43 -0700)]
change to enable use_largest_txform feature
for all regular inter frames at speed 1
Change-Id: I0a8b301273ecf2b8730ab1f6b7a05f89f4d498e0
John Koleszar [Mon, 24 Jun 2013 22:59:32 +0000 (15:59 -0700)]
.gitignore: add gcov files
Change-Id: I0a58578e7cf27f3de839eb62a334e343eaed12c5
John Koleszar [Mon, 24 Jun 2013 22:58:18 +0000 (15:58 -0700)]
Move vp9_counts_to_nmv_context to encoder
This function is only used from within vp9_encodemv.c.
Change-Id: Ib3fc7c30b1e2d27321397ac474cbc8976bc1f4b1
John Koleszar [Mon, 24 Jun 2013 22:46:15 +0000 (15:46 -0700)]
Move vp9_full_to_model_counts to encoder
This function is not called from the decoder, so it doesn't need to be
in common/.
Change-Id: I6977dd462a25b4ff39c9c7e1b0b5b16aa58ee733
John Koleszar [Mon, 24 Jun 2013 22:08:58 +0000 (15:08 -0700)]
Merge "Remove unused vp9_build_intra_predictors_sb{y,uv}_s"
John Koleszar [Mon, 24 Jun 2013 22:08:54 +0000 (15:08 -0700)]
Merge "Remove unused vp9_model_to_full_probs_sb()"
Scott LaVarnway [Mon, 24 Jun 2013 21:11:16 +0000 (17:11 -0400)]
Changed size of mb_mode_context to 8 bits
This reduced the size of the MODE_INFO array (mip and prev_mip)
by 425,568 bytes each for 1080p resolutions.
Change-Id: Ifa513ec2d0a49e8ec0867ec90620762fb7f1261d
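A back-of-the-envelope check of the figure above. The layout is an assumption that the commit message does not state: mode info kept per 8x8 pixel unit, a border of 8 units added to each dimension, and 4 context values per entry shrinking from a 32-bit int to 8 bits.

```cpp
#include <assert.h>
#include <stdint.h>

// Bytes saved per MODE_INFO array under the assumptions listed below.
// ASSUMPTIONS (not from the commit message):
//   - one entry per 8x8 pixel unit
//   - a border of 8 units added to both the column and row counts
//   - 4 context values per entry, each shrinking from 4 bytes to 1
int64_t mode_context_saving(int width, int height) {
  assert(width > 0 && height > 0);
  const int mi_cols = (width + 7) / 8;
  const int mi_rows = (height + 7) / 8;
  const int64_t entries = static_cast<int64_t>(mi_cols + 8) * (mi_rows + 8);
  const int saved_per_entry = 4 * (4 - 1);  // 4 values, 3 bytes saved each
  return entries * saved_per_entry;
}
```

With these assumptions, `mode_context_saving(1920, 1080)` evaluates to 248 * 143 * 12 = 425,568, which matches the number quoted in the message. The match does not by itself confirm that this is the actual layout.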
Ronald S. Bultje [Mon, 24 Jun 2013 18:28:19 +0000 (11:28 -0700)]
Add SAD unit tests for all rectangular sizes.
Change-Id: I47e81b51f072abdb276bdec85423febba34b5f81
Ronald S. Bultje [Sat, 22 Jun 2013 00:19:36 +0000 (17:19 -0700)]
Don't re-allocate comp_pred buffers for each call to comp motion search.
Instead, allocate the buffer on the stack. It is 4k, which isn't
all that much.
Change-Id: I82af6ee89e6ed01faaa23ff891ee7ced76df8c16
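A minimal sketch of the stack-buffer approach, assuming the 4k figure corresponds to a 64x64 block at one byte per pixel. The function is hypothetical and only illustrates where the buffer lives.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Builds the averaged predictor in a fixed-size stack buffer rather
// than allocating and freeing one on every call, then returns the SAD
// against the source. All blocks are assumed contiguous (stride == w).
// Assumption: 64x64 is the largest block, so 64 * 64 = 4096 bytes.
unsigned int comp_search_cost(const uint8_t *src, const uint8_t *ref0,
                              const uint8_t *ref1, int w, int h) {
  assert(src != NULL && ref0 != NULL && ref1 != NULL);
  assert(w > 0 && h > 0 && w <= 64 && h <= 64);
  uint8_t comp_pred[64 * 64];  // on the stack, no malloc/free per call
  for (int i = 0; i < w * h; ++i)
    comp_pred[i] = static_cast<uint8_t>((ref0[i] + ref1[i] + 1) >> 1);
  unsigned int sad = 0;
  for (int i = 0; i < w * h; ++i) {
    const int d = src[i] - comp_pred[i];
    sad += (d < 0) ? -d : d;
  }
  return sad;
}
```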
Yaowu Xu [Mon, 24 Jun 2013 16:55:21 +0000 (09:55 -0700)]
Merge "Fix loopfilter of leftmost 4x4 edges in SB"
John Koleszar [Sat, 22 Jun 2013 00:06:43 +0000 (17:06 -0700)]
Fix loopfilter of leftmost 4x4 edges in SB
For cases where there's no transform set in bit 0 (the left edge of
the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the
left edge needs filtering), it was incorrectly being skipped before.
This situation only happens on the leftmost edge of the image, as
the edge at column 0 is intentionally skipped since there aren't
pixels to the left to read.
Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3
Ronald S. Bultje [Sat, 22 Jun 2013 04:22:55 +0000 (21:22 -0700)]
Merge "Allocate memory using appropriate expected alignment in unit tests."
James Zern [Sat, 22 Jun 2013 02:34:51 +0000 (19:34 -0700)]
I420VideoSource: normalize framerate types
ctor inputs are ints as are vpx_rational_t members
Change-Id: I62a39bf3df123727a872e40b74e3ee9e55ef2ede
James Zern [Sat, 22 Jun 2013 02:33:50 +0000 (19:33 -0700)]
intrapred_test: add virtual dtor to IntraPredBase
classes with virtual functions should have virtual destructors
Change-Id: If54e2f8384f0bfcbf812cc727eb9d0a586173674
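A minimal sketch of why the rule matters, using hypothetical classes rather than the IntraPredBase hierarchy itself: deleting a derived object through a base pointer only runs the derived destructor when the base destructor is virtual.

```cpp
#include <assert.h>
#include <stddef.h>

static int g_derived_destroyed = 0;  // counts Derived destructor calls

// Hypothetical classes used only for illustration.
class Base {
 public:
  virtual ~Base() {}  // virtual, so deleting through Base* is safe
  virtual int Predict() const = 0;
};

class Derived : public Base {
 public:
  virtual ~Derived() { ++g_derived_destroyed; }
  virtual int Predict() const { return 128; }
};

// Creates a Derived, uses it through a Base pointer, then deletes it
// through that same Base pointer.
int PredictAndDestroy() {
  Base *p = new Derived;
  assert(p != NULL);
  const int v = p->Predict();
  delete p;  // runs ~Derived() first because ~Base() is virtual
  return v;
}
```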
Ronald S. Bultje [Sat, 22 Jun 2013 00:03:57 +0000 (17:03 -0700)]
Allocate memory using appropriate expected alignment in unit tests.
Fixes crashes of test_libvpx on 32-bit Linux.
Change-Id: If94e7628a86b788ca26c004861dee2f162e47ed6
John Koleszar [Fri, 21 Jun 2013 23:31:18 +0000 (16:31 -0700)]
Merge "Add some unaligned test vectors"
John Koleszar [Fri, 21 Jun 2013 23:10:05 +0000 (16:10 -0700)]
Remove unused vp9_build_intra_predictors_sb{y,uv}_s
The functions are no longer referenced.
Change-Id: If2705dfbc607f79ec8ec2242d5e03bec27a35aaf
Ronald S. Bultje [Fri, 21 Jun 2013 22:53:25 +0000 (15:53 -0700)]
Merge "Remove emms - that shouldn't be there."
John Koleszar [Fri, 21 Jun 2013 22:38:55 +0000 (15:38 -0700)]
Remove unused vp9_model_to_full_probs_sb()
This function is never referenced.
Change-Id: I1c42cd355bfa88e17d169f7335a44be682af58cc
Dmitry Kovalev [Fri, 21 Jun 2013 22:34:29 +0000 (15:34 -0700)]
Transforming scale_mv_component_q4 into scale_mv_q4 function.
Using MV instead of int_mv for function arguments.
Change-Id: Ic25e13dccbc98fac1fa1b3255127e00cca2a57f6
Ronald S. Bultje [Fri, 21 Jun 2013 21:45:04 +0000 (14:45 -0700)]
Remove emms - that shouldn't be there.
Change-Id: I8fcab81e390f93dc17e9666bbf8f77883b5aa897
James Zern [Wed, 19 Jun 2013 02:15:56 +0000 (19:15 -0700)]
variance_test: use REGISTER_STATE_CHECK
Change-Id: Id54ad9a781634f075e990d5bade5be8490959975
Dmitry Kovalev [Thu, 20 Jun 2013 22:52:47 +0000 (15:52 -0700)]
Removing find_seg_id and using vp9_get_pred_mi_segid instead.
Change-Id: Ia40229903c08f14020e90e94cfdf494aba1be827
Ronald S. Bultje [Fri, 21 Jun 2013 19:55:46 +0000 (12:55 -0700)]
Add missing SECTION .text marker in assembly file.
Fixes a crash on Windows when building with MSVC.
Change-Id: I124ac756a1be55d190fadda5fcc46d23b1445dbf
Ronald S. Bultje [Fri, 21 Jun 2013 19:54:52 +0000 (12:54 -0700)]
Implement SSE2 block_error.
Change vp9_block_error() to return a 64bit error variable, change all
callers to expect a 64bit return value (this will prevent overflows,
which we basically don't check for at all right now). Remove duplicate
block_error() function, which fixed that through truncation. Remove
old (incompatible) mmx/sse2 block_error SIMD versions and replace with
a new one that returns a 64bit value.
Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to
3min23, i.e. a 3% overall speedup.
Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
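A plain reference sketch of a block error that accumulates in 64 bits, assuming 16-bit coefficients. The name and signature are illustrative assumptions, not the library's declaration.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Sum of squared differences between original and dequantized
// coefficients. Both the per-coefficient square and the running total
// are kept in 64 bits so that large blocks cannot overflow.
int64_t block_error_sketch(const int16_t *coeff, const int16_t *dqcoeff,
                           int n) {
  assert(coeff != NULL && dqcoeff != NULL && n >= 0);
  int64_t error = 0;
  for (int i = 0; i < n; ++i) {
    const int diff = coeff[i] - dqcoeff[i];
    error += static_cast<int64_t>(diff) * diff;
  }
  return error;
}
```

In the extreme case a single difference of 65535 already squares to more than a signed 32-bit integer can hold, which is why the cast is applied before the multiplication and not after it.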
Ronald S. Bultje [Fri, 21 Jun 2013 19:49:50 +0000 (12:49 -0700)]
Merge "Add subtract_block SSE2 version and unit test."
Ronald S. Bultje [Fri, 21 Jun 2013 19:49:43 +0000 (12:49 -0700)]
Merge "SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance()."
Ronald S. Bultje [Fri, 21 Jun 2013 16:35:37 +0000 (09:35 -0700)]
Add subtract_block SSE2 version and unit test.
3% faster overall (3min35.0 to 3min28.5).
Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
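A plain reference sketch of block subtraction, the kind of loop an SSE2 routine would replace. The signature is an illustrative assumption.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Residual = source - prediction, stored as signed 16-bit values.
// Each buffer has its own stride so the blocks can sit inside larger
// frames.
void subtract_block_sketch(int rows, int cols,
                           int16_t *diff, int diff_stride,
                           const uint8_t *src, int src_stride,
                           const uint8_t *pred, int pred_stride) {
  assert(diff != NULL && src != NULL && pred != NULL);
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      diff[r * diff_stride + c] = static_cast<int16_t>(
          src[r * src_stride + c] - pred[r * pred_stride + c]);
    }
  }
}
```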
Yaowu Xu [Fri, 21 Jun 2013 05:37:01 +0000 (22:37 -0700)]
Merge "Get some speed back for cpuused 1"
Yaowu Xu [Thu, 20 Jun 2013 22:23:37 +0000 (15:23 -0700)]
Get some speed back for cpuused 1
and remove unused code.
Change-Id: If380440c4450294b5450b7a9eeb94a376846ec01
Yaowu Xu [Fri, 21 Jun 2013 02:04:30 +0000 (19:04 -0700)]
Merge "rename variables to avoid build error in MSVC"
Yaowu Xu [Thu, 20 Jun 2013 18:48:08 +0000 (11:48 -0700)]
rename variables to avoid build error in MSVC
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
Yaowu Xu [Fri, 21 Jun 2013 00:42:50 +0000 (17:42 -0700)]
Merge "Implement sse2 and ssse3 versions for all sub_pixel_variance sizes."
Ronald S. Bultje [Thu, 20 Jun 2013 22:59:48 +0000 (15:59 -0700)]
SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance().
Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to
3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions
which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't
perfectly interleaved, and can probably be improved further in the
future. I've marked this with a few TODOs/FIXMEs in the code.
Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
Jim Bankoski [Thu, 20 Jun 2013 22:10:16 +0000 (15:10 -0700)]
Merge "clean out libvpx-srcs.txt if built"
Jim Bankoski [Thu, 20 Jun 2013 22:05:42 +0000 (15:05 -0700)]
clean out libvpx-srcs.txt if built
Change-Id: Idfd69e66e8982275eb00d8007a55efd1a4f86a98
James Zern [Thu, 20 Jun 2013 22:02:27 +0000 (15:02 -0700)]
Merge "Revert "test_libvpx: disable pthreads in gtest""
Frank Galligan [Thu, 20 Jun 2013 21:05:17 +0000 (14:05 -0700)]
Fix win64 warning.
- size_t vs int.
Change-Id: Ib47ebd932a4b69db9f52a43000bb69d0a96b9134
James Zern [Thu, 20 Jun 2013 19:49:15 +0000 (12:49 -0700)]
Revert "test_libvpx: disable pthreads in gtest"
This reverts commit
90a9900abb79fabfd44189a959d14ca677c2777a
Seems to break the Mac build:
src/include/gtest/internal/gtest-port.h:1208:: pthread_mutex_lock(&mutex_)failed with error 22
Abort trap: 6
Change-Id: Icbe31161d7c27f1b0a28d33409e7712430bbf0ae
Jingning Han [Thu, 20 Jun 2013 17:22:40 +0000 (10:22 -0700)]
Merge "Add unit tests for 4x4 ADST"
Johann [Thu, 20 Jun 2013 17:19:39 +0000 (10:19 -0700)]
Merge "Cast value to avoid size_t/int warning on win64"
Dmitry Kovalev [Thu, 20 Jun 2013 17:17:12 +0000 (10:17 -0700)]
Merge "Renaming 'nmv' to 'mv' for several functions."
Dmitry Kovalev [Thu, 20 Jun 2013 17:17:05 +0000 (10:17 -0700)]
Merge "Function decomposition inside vp9_decodemv.c file."
Deb Mukherjee [Wed, 19 Jun 2013 23:23:21 +0000 (16:23 -0700)]
Improving model rd with variance and quant step
Improves the rd modeling function and implements them using interpolation
from a table which is a little faster. Also uses sse as input to the
modeling function rather than var - since there is no dc prediction
used and as a result the sse works a little better.
derfraw300: +0.05%
Speedup: ~1%
Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff
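A minimal sketch of table-based interpolation in general, to illustrate the technique named above. The table contents, step size, and function name are placeholders chosen for the example and are NOT the values or code used by the encoder.

```cpp
#include <assert.h>

// Placeholder table: NOT the encoder's values.
static const int kTable[5] = {0, 100, 180, 240, 280};
static const int kStep = 16;  // input units between adjacent entries

// Piecewise-linear interpolation of kTable at x, for 0 <= x <= 4 * kStep.
// A table lookup plus one multiply replaces evaluating the model
// function directly.
int interp_table(int x) {
  assert(x >= 0 && x <= 4 * kStep);
  const int i = x / kStep;
  if (i == 4) return kTable[4];  // exactly at the last entry
  const int frac = x % kStep;
  return kTable[i] + (kTable[i + 1] - kTable[i]) * frac / kStep;
}
```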
Johann [Thu, 20 Jun 2013 16:52:08 +0000 (09:52 -0700)]
Cast value to avoid size_t/int warning on win64
dboolhuff.c(50) : warning C4267: 'initializing' : conversion from
'size_t' to 'int'
Change-Id: I6b85759efb2fa19f362f406623d8a7583a55c036
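A minimal sketch of an explicit narrowing cast of the kind that avoids C4267, assuming the value is known to fit in an int. The helper name is hypothetical and does not appear in dboolhuff.c.

```cpp
#include <assert.h>
#include <limits.h>
#include <stddef.h>

// Narrows a size_t to int explicitly. On win64, size_t is 64 bits wide
// and int is 32 bits, so an implicit conversion triggers the warning.
// The assert documents the assumption that the value fits.
int size_to_int(size_t n) {
  assert(n <= static_cast<size_t>(INT_MAX));
  return static_cast<int>(n);
}
```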
Jim Bankoski [Thu, 20 Jun 2013 16:24:04 +0000 (09:24 -0700)]
adds force partitioning greater than or less than block size
adds a new speed feature to force partitioning to be greater than
or less than a certain size
Change-Id: I8c048eeeef93700ae822eccf98f8751a45b2e7d0
Jim Bankoski [Thu, 20 Jun 2013 14:46:51 +0000 (07:46 -0700)]
adds a set partitioning to speed features
this feature lets you set a partitioning size to be used by the entire
frame.
Change-Id: I208a4c8c701375cbb054418266f677768b6f8f06
Jim Bankoski [Thu, 20 Jun 2013 14:17:01 +0000 (07:17 -0700)]
partition by variance using var from last frame
This uses variance to split partition. Variance is calculated using
nearest mv, always from last ref frame.
Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896
Jim Bankoski [Wed, 19 Jun 2013 22:53:47 +0000 (15:53 -0700)]
convert all speed things to speed features
Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a
Jim Bankoski [Wed, 19 Jun 2013 21:26:49 +0000 (14:26 -0700)]
new partition via variance
Change-Id: Ideee45cad8b38087c509cd404484728e85d0c427
Jim Bankoski [Wed, 19 Jun 2013 19:16:45 +0000 (12:16 -0700)]
fix to set up new speed feature
This uses the speed feature functionality for code.
Change-Id: I9cd16c0c5f98520ae27ebba81aa2c178546587f8
Jim Bankoski [Wed, 19 Jun 2013 18:05:34 +0000 (11:05 -0700)]
don't copy partitions for key frames or altrefs
force us to go through slow partitioning for keyframes, altref and
overlays.
Change-Id: I1a286361bf74083e71973575a7296be46eb98742
Ronald S. Bultje [Thu, 20 Jun 2013 16:34:25 +0000 (09:34 -0700)]
Implement sse2 and ssse3 versions for all sub_pixel_variance sizes.
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 ->
3min58). Specific changes to timings for each function compared to
original assembly-optimized versions (or just new version timings if
no previous assembly-optimized version was available):
sse2 4x4: 99 -> 82 cycles
sse2 4x8: 128 cycles
sse2 8x4: 121 cycles
sse2 8x8: 149 -> 129 cycles
sse2 8x16: 235 -> 245 cycles (?)
sse2 16x8: 269 -> 203 cycles
sse2 16x16: 441 -> 349 cycles
sse2 16x32: 641 cycles
sse2 32x16: 643 cycles
sse2 32x32: 1733 -> 1154 cycles
sse2 32x64: 2247 cycles
sse2 64x32: 2323 cycles
sse2 64x64: 6984 -> 4442 cycles
ssse3 4x4: 100 cycles (?)
ssse3 4x8: 103 cycles
ssse3 8x4: 71 cycles
ssse3 8x8: 147 cycles
ssse3 8x16: 158 cycles
ssse3 16x8: 188 -> 162 cycles
ssse3 16x16: 316 -> 273 cycles
ssse3 16x32: 535 cycles
ssse3 32x16: 564 cycles
ssse3 32x32: 973 cycles
ssse3 32x64: 1930 cycles
ssse3 64x32: 1922 cycles
ssse3 64x64: 3760 cycles
Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
Jim Bankoski [Thu, 20 Jun 2013 16:28:11 +0000 (09:28 -0700)]
disable speed > 1 speed corrections in firstpass
need to rework these
Change-Id: I17dc2c88d2faadd2f8fb117c52c25f04ea2e9856
Jim Bankoski [Wed, 19 Jun 2013 23:03:27 +0000 (16:03 -0700)]
new debug modes code
The new print out includes skips and has prefixed sections so you can
grep to find things like transforms chosen on each frame.
Change-Id: I195043424647d9514cfc3ff6720a5b20d010fa1b