Alexey Tourbin [Mon, 23 Apr 2018 05:37:44 +0000 (08:37 +0300)]
lz4.c: refactor the decoding routines
I noticed that LZ4_decompress_generic is sometimes instantiated with an
identical set of parameters, or (what's worse) with subtly different
sets of parameters. For example, LZ4_decompress_fast_withPrefix64k is
instantiated as follows:
    return LZ4_decompress_generic(source, dest, 0, originalSize, endOnOutputSize,
                                  full, 0, withPrefix64k, (BYTE*)dest - 64 KB, NULL, 64 KB);
while the equivalent withPrefix64k call in LZ4_decompress_usingDict_generic
passes 0 for the last argument instead of 64 KB. It turns out that there
is no difference in this case: if you change 64 KB to 0 KB in
LZ4_decompress_fast_withPrefix64k, you get the same binary code.
Moreover, because it's been clarified that LZ4_decompress_fast doesn't
check match offsets, it is now obvious that both of these fast/withPrefix64k
instantiations are simply redundant. Exactly because LZ4_decompress_fast
doesn't check offsets, it serves well with any prefixed dictionary.
There's a difference, though, with LZ4_decompress_safe_withPrefix64k.
It also passes 64 KB as the last argument, and if you change that to 0,
as in LZ4_decompress_usingDict_generic, you get completely different
binary code. It seems that passing 0 enables offset checking:
    const int checkOffset = ((safeDecode) && (dictSize < (int)(64 KB)));
However, the resulting code seems to run a bit faster. How come
enabling extra checks can make the code run faster? Curiouser and
curiouser! This needs extra study. Currently I take the view that
the dictSize should be set to non-zero when nothing else will do,
i.e. when passing the external dictionary via dictStart. Otherwise,
lowPrefix betrays just enough information about the dictionary.
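The effect of the last argument on checkOffset can be sketched in isolation. This is a toy reduction, not the code in lz4.c: the toy_* names and the `produced` parameter (bytes between lowPrefix and the current output position) are made up for illustration, and only the checkOffset expression is taken from the snippet above.

```c
#include <stddef.h>

#define KB *(1 << 10)   /* same spelling as above: 64 KB == 65536 */

/* Hypothetical miniature of the generic routine's offset test.
 * `produced` stands for the number of bytes already available
 * between lowPrefix and the current output position. */
static inline int toy_offset_ok(int safeDecode, int dictSize,
                                size_t offset, size_t produced)
{
    const int checkOffset = ((safeDecode) && (dictSize < (int)(64 KB)));
    if (checkOffset && (offset > produced + (size_t)dictSize))
        return 0;   /* the match would start before any known data */
    return 1;
}

/* dictSize is the literal 64 KB: checkOffset folds to 0, the test vanishes. */
int toy_offset_ok_prefix64k(size_t offset, size_t produced)
{
    return toy_offset_ok(1, 64 KB, offset, produced);
}

/* dictSize is the literal 0: checkOffset folds to 1, the test is compiled in. */
int toy_offset_ok_nodict(size_t offset, size_t produced)
{
    return toy_offset_ok(1, 0, offset, produced);
}
```

With constant arguments and inlining, the two wrappers would be expected to compile to different code, which is consistent with the observation that changing 64 KB to 0 changes the binary only in the safe variant.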
* * *
Anyway, with this change, I instantiate all the necessary cases as
functions with distinctive names, which also take fewer arguments and
are therefore less error-prone. I also make the functions non-inline.
(The compiler won't inline the functions because they are used more than
once. Hence I attach LZ4_FORCE_O2_GCC_PPC64LE to the instances while
removing it from the callers.) The number of instances is now reduced
from 18 (safe+fast+partial+4*continue+4*prefix+4*dict+2*prefix64+forceExtDict)
down to 7 (safe+fast+partial+2*prefix+2*dict). The size of the code is
not the only issue here. Separate helper functions are much more
amenable to profile-guided optimization: it is enough to profile only
a few basic functions, while the other less-often used functions, such
as LZ4_decompress_*_continue, will benefit automatically.
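The pattern of named instances with fewer arguments might look roughly like the sketch below. It is a toy, not the lz4 code: the generic worker only computes how far back a match may reach, and every toy_* name is hypothetical (the suffixes merely echo the instance names used in this commit).

```c
#include <stddef.h>

typedef enum { toy_noDict, toy_withPrefix, toy_extDict } toy_dict_mode;

/* Generic worker with many knobs, meant to be inlined into each instance. */
static inline size_t toy_reach_generic(size_t produced, toy_dict_mode mode,
                                       size_t prefixSize, size_t dictSize)
{
    size_t reach = produced;
    if (mode != toy_noDict)  reach += prefixSize;
    if (mode == toy_extDict) reach += dictSize;
    return reach;
}

/* Named instances: each fixes the constant knobs once, so callers
 * cannot pass a subtly different combination by accident. */
size_t toy_reach_safe(size_t produced)
{ return toy_reach_generic(produced, toy_noDict, 0, 0); }

size_t toy_reach_withPrefix64k(size_t produced)
{ return toy_reach_generic(produced, toy_withPrefix, 64 * 1024, 0); }

size_t toy_reach_withSmallPrefix(size_t produced, size_t prefixSize)
{ return toy_reach_generic(produced, toy_withPrefix, prefixSize, 0); }

size_t toy_reach_forceExtDict(size_t produced, size_t dictSize)
{ return toy_reach_generic(produced, toy_extDict, 0, dictSize); }
```

Because the instances are ordinary non-inline functions, several public entry points can call the same instance instead of each expanding its own copy of the generic body.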
This is the list of LZ4_decompress* functions in liblz4.so, sorted by size.
Exported functions are marked with a capital T.
$ nm -S lib/liblz4.so |grep -wi T |grep LZ4_decompress |sort -k2
0000000000016260 0000000000000005 T LZ4_decompress_fast_withPrefix64k
0000000000016dc0 0000000000000025 T LZ4_decompress_fast_usingDict
0000000000016d80 0000000000000040 T LZ4_decompress_safe_usingDict
0000000000016d10 000000000000006b T LZ4_decompress_fast_continue
0000000000016c70 000000000000009f T LZ4_decompress_safe_continue
00000000000156c0 000000000000059c T LZ4_decompress_fast
0000000000014a90 00000000000005fa T LZ4_decompress_safe
0000000000015c60 00000000000005fa T LZ4_decompress_safe_withPrefix64k
0000000000002280 00000000000005fa t LZ4_decompress_safe_withSmallPrefix
0000000000015090 000000000000062f T LZ4_decompress_safe_partial
0000000000002880 00000000000008ea t LZ4_decompress_fast_extDict
0000000000016270 0000000000000993 t LZ4_decompress_safe_forceExtDict
Yann Collet [Thu, 19 Apr 2018 18:50:20 +0000 (11:50 -0700)]
Merge pull request #503 from lz4/l120
minor length reduction of several large lines
Yann Collet [Thu, 19 Apr 2018 17:52:48 +0000 (10:52 -0700)]
Merge pull request #502 from lhacc1/dev
Wrap likely/unlikely macros with #ifndef
Yann Collet [Thu, 19 Apr 2018 17:50:40 +0000 (10:50 -0700)]
modified indentation for consistency
Yann Collet [Wed, 18 Apr 2018 23:49:27 +0000 (16:49 -0700)]
minor length reduction of several large lines
Yann Collet [Wed, 18 Apr 2018 17:16:25 +0000 (10:16 -0700)]
Merge pull request #497 from lz4/lowAddr
Compatibility with low memory addresses
Dmitrii Rodionov [Wed, 18 Apr 2018 09:20:56 +0000 (12:20 +0300)]
Wrap likely/unlikely macros with #ifndef
This prevents a redefinition error when a project using lz4 defines its own
likely/unlikely macros.
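A minimal sketch of the guard, assuming a host project that defines the macros first. The host definitions and toy_sign() are made up for illustration, and the fallback definitions are a guess at a typical shape rather than a copy of lz4.c.

```c
/* Simulated host project: it already ships its own hint macros. */
#define likely(x)   (x)
#define unlikely(x) (x)

/* Library side: define the macros only if nobody else has. */
#ifndef likely
#  if defined(__GNUC__)
#    define likely(expr)   __builtin_expect((expr) != 0, 1)
#  else
#    define likely(expr)   ((expr) != 0)
#  endif
#endif
#ifndef unlikely
#  if defined(__GNUC__)
#    define unlikely(expr) __builtin_expect((expr) != 0, 0)
#  else
#    define unlikely(expr) ((expr) != 0)
#  endif
#endif

/* Works with whichever definition is in effect. */
int toy_sign(int v)
{
    if (unlikely(v < 0)) return -1;
    return likely(v > 0) ? 1 : 0;
}
```

Without the #ifndef guards, the second pair of definitions would clash with the host's and the compiler would report a macro redefinition.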
Yann Collet [Tue, 17 Apr 2018 23:47:56 +0000 (16:47 -0700)]
fixed LZ4_compress_fast_extState_fastReset() in 32-bit mode
Yann Collet [Tue, 17 Apr 2018 23:18:37 +0000 (16:18 -0700)]
fix dictDelta setting error
wrong test
Yann Collet [Tue, 17 Apr 2018 22:29:17 +0000 (15:29 -0700)]
fix matchIndex overflow
can happen with dictCtx
Yann Collet [Tue, 17 Apr 2018 19:07:22 +0000 (12:07 -0700)]
Merge branch 'dev' into lowAddr
Yann Collet [Tue, 17 Apr 2018 19:06:44 +0000 (12:06 -0700)]
Merge pull request #501 from felixhandte/fix-dict-load-offset
Always Bump Offset by 64 KB in LZ4_loadDict()
W. Felix Handte [Tue, 17 Apr 2018 18:01:44 +0000 (14:01 -0400)]
Always Bump Offset by 64 KB in LZ4_loadDict()
This actually ensures the guarantee referred to in the comment in
LZ4_compress_fast_continue().
Yann Collet [Tue, 17 Apr 2018 06:59:42 +0000 (23:59 -0700)]
fixed dictCtx compression
Yann Collet [Tue, 17 Apr 2018 00:15:02 +0000 (17:15 -0700)]
edited a few traces for debugging
Yann Collet [Mon, 16 Apr 2018 23:54:03 +0000 (16:54 -0700)]
fixed minor format warnings
Yann Collet [Mon, 16 Apr 2018 23:14:28 +0000 (16:14 -0700)]
fixed fuzzer tests
which were modified in parallel within branch `dev`
Yann Collet [Mon, 16 Apr 2018 23:12:38 +0000 (16:12 -0700)]
Merge branch 'dev' into lowAddr
Yann Collet [Mon, 16 Apr 2018 22:11:28 +0000 (15:11 -0700)]
fixed gcc performance regression
Yann Collet [Fri, 13 Apr 2018 20:22:38 +0000 (13:22 -0700)]
Merge pull request #499 from felixhandte/lz4-attach-dict-tests
Test LZ4_attach_dictionary() and Friends
W. Felix Handte [Thu, 12 Apr 2018 23:17:53 +0000 (19:17 -0400)]
Further Test that ExtDictCtx Mode Produces the Exact Same Output
W. Felix Handte [Thu, 12 Apr 2018 22:23:01 +0000 (18:23 -0400)]
Add Tests for LZ4_attach_dictionary and Friends
Yann Collet [Fri, 13 Apr 2018 09:45:32 +0000 (02:45 -0700)]
fixed minor unused variable warning
Yann Collet [Fri, 13 Apr 2018 09:26:14 +0000 (02:26 -0700)]
added comment on variables required after _next_match
Yann Collet [Fri, 13 Apr 2018 09:10:53 +0000 (02:10 -0700)]
fixed potential ptrdiff_t overflow (32-bits mode)
Also removed pointer comparison, which should solve #485
Cyan4973 [Fri, 13 Apr 2018 08:01:54 +0000 (01:01 -0700)]
compatibility with gcc-4.4 string.h version
Someone thought it would be a great idea to define a global variable there under the very generic name "index".
This causes problems with shadow warnings, so no variable can be named "index" now ...
Also : automatically update API manual
Cyan4973 [Fri, 13 Apr 2018 07:59:27 +0000 (00:59 -0700)]
added sudo rights for low-mem-address tests
test4973 [Thu, 12 Apr 2018 23:12:21 +0000 (16:12 -0700)]
fixed : counting matches which overlap extDict and prefix
test4973 [Thu, 12 Apr 2018 14:25:40 +0000 (07:25 -0700)]
modified a few traces for debug
Yann Collet [Thu, 12 Apr 2018 20:23:51 +0000 (13:23 -0700)]
Merge pull request #496 from lz4/circleci
Reduced LZ4 test time on circle-ci
Yann Collet [Thu, 12 Apr 2018 13:47:27 +0000 (06:47 -0700)]
modified versionsTest
to use MOREFLAGS rather than CPPFLAGS
as some older versions of LZ4 overwrite CPPFLAGS environment variable.
test4973 [Wed, 11 Apr 2018 23:49:40 +0000 (16:49 -0700)]
fixed LZ4_compress_fast_extState_fastReset()
test4973 [Wed, 11 Apr 2018 23:45:19 +0000 (16:45 -0700)]
Merge branch 'dev' into lowAddr
Yann Collet [Wed, 11 Apr 2018 23:41:25 +0000 (16:41 -0700)]
allow system-defined CPPFLAGS in /tests
Yann Collet [Wed, 11 Apr 2018 23:31:43 +0000 (16:31 -0700)]
reduced test time on circle-ci
Yann Collet [Wed, 11 Apr 2018 23:15:42 +0000 (16:15 -0700)]
Merge pull request #492 from felixhandte/avoid-prepare-in-continue
Several Changes Concerning Table Preparation in LZ4 Fast
W. Felix Handte [Wed, 11 Apr 2018 22:42:09 +0000 (18:42 -0400)]
Fix Silly Warning (const-ness in declaration has no effect on value types!)
W. Felix Handte [Wed, 11 Apr 2018 20:55:12 +0000 (16:55 -0400)]
Minor Fixes
W. Felix Handte [Wed, 11 Apr 2018 20:31:52 +0000 (16:31 -0400)]
Add a LZ4_STATIC_LINKING_ONLY Macro to Guard Experimental APIs
W. Felix Handte [Wed, 11 Apr 2018 20:04:24 +0000 (16:04 -0400)]
Expose dictCtx Functionality in LZ4
W. Felix Handte [Wed, 11 Apr 2018 19:13:01 +0000 (15:13 -0400)]
Rename _extState_noReset -> _extState_fastReset and Edit Comments
W. Felix Handte [Wed, 11 Apr 2018 19:12:34 +0000 (15:12 -0400)]
Remove Extraneous Assignment (clearedTable == 0)
W. Felix Handte [Tue, 10 Apr 2018 17:12:30 +0000 (13:12 -0400)]
Expose a Faster Stream Reset Function
test4973 [Tue, 10 Apr 2018 03:38:00 +0000 (20:38 -0700)]
fix minor conversion warning
cast from void not implicit for C++
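For illustration, this is the kind of cast that keeps a source file valid as both C and C++. The toy_state type and toy_state_create() are hypothetical and not part of lz4.

```c
#include <stdlib.h>
#include <string.h>

typedef struct { unsigned table[16]; } toy_state;   /* hypothetical type */

toy_state* toy_state_create(void)
{
    /* C converts void* implicitly; C++ requires the explicit cast. */
    toy_state* const s = (toy_state*)malloc(sizeof(*s));
    if (s != NULL) memset(s, 0, sizeof(*s));
    return s;
}
```

In plain C the cast is redundant but harmless, so adding it is the usual way to silence the warning when the same file is also built with a C++ compiler.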
test4973 [Tue, 10 Apr 2018 00:08:17 +0000 (17:08 -0700)]
fixed minor conversion warning
ptr diff -> U32
test4973 [Mon, 9 Apr 2018 23:23:39 +0000 (16:23 -0700)]
Merge branch 'dev' into lowAddr
W. Felix Handte [Fri, 6 Apr 2018 20:52:29 +0000 (16:52 -0400)]
Avoid Calling LZ4_prepareTable() in LZ4_compress_fast_continue()
Yann Collet [Sat, 7 Apr 2018 00:35:45 +0000 (17:35 -0700)]
Merge pull request #494 from felixhandte/kill-goto
Return to Allowing Early Returns in LZ4_compress_generic()
W. Felix Handte [Fri, 6 Apr 2018 22:52:55 +0000 (18:52 -0400)]
Return to Allowing Early Returns in LZ4_compress_generic()
Or: `goto` Considered Harmful
Or: https://xkcd.com/292/
Yann Collet [Fri, 6 Apr 2018 22:33:28 +0000 (15:33 -0700)]
Merge pull request #493 from lz4/statusLine
fixed DISPLAYUPDATE()
Yann Collet [Fri, 6 Apr 2018 21:16:23 +0000 (14:16 -0700)]
fixed DISPLAYUPDATE()
wrong comparison, which was always overflowing (hence was always true)
except when it was not (i386, reported by pmc)
in which case it would never show any information.
test4973 [Fri, 6 Apr 2018 02:05:49 +0000 (19:05 -0700)]
noticed a bug when re-using hash table
./fuzzer -vv -s4217 -t7518
test4973 [Fri, 6 Apr 2018 01:39:22 +0000 (18:39 -0700)]
added low-memory address test to travis
requires modifying the Linux configuration (sudo)
test4973 [Fri, 6 Apr 2018 01:29:42 +0000 (18:29 -0700)]
fixed byPtr mode
switch to byU32 when src address is < 64K
note : byPtr is still useful in 32-bit mode, as it's about 10% faster
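The selection rule described here could be sketched as follows. The toy_* names, the parameters and the exact threshold comparison are assumptions for illustration; the real condition in lz4.c may differ.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { toy_byPtr, toy_byU32 } toy_tableType;

/* ptrSize: sizeof(void*) on the target; srcAddr: address of the input buffer.
 * Assumption: byPtr is only considered on 32-bit targets, and only when
 * the source does not sit in the lowest 64 KB of the address space. */
toy_tableType toy_pick_table(size_t ptrSize, uintptr_t srcAddr)
{
    const uintptr_t lowLimit = 64 * 1024;
    if (ptrSize == 4 && srcAddr >= lowLimit) return toy_byPtr;
    return toy_byU32;
}
```

A caller would pass sizeof(void*) and the source pointer converted to uintptr_t; any input starting below 64 KB falls back to the index-based table.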
test4973 [Fri, 6 Apr 2018 00:52:54 +0000 (17:52 -0700)]
fixed byPtr match search
test4973 [Fri, 6 Apr 2018 00:16:33 +0000 (17:16 -0700)]
fixed immediate match search
test4973 [Thu, 5 Apr 2018 23:38:43 +0000 (16:38 -0700)]
changed LZ4_compress_generic() logic
to use indexes (U32) instead of Ptr.
byPtr is still present.
test4973 [Thu, 5 Apr 2018 19:40:33 +0000 (12:40 -0700)]
fixed lz4 compression starting at small address
when using byU32 and byU16 modes
test4973 [Wed, 4 Apr 2018 18:38:55 +0000 (11:38 -0700)]
Merge branch 'dev' into lowAddr
Yann Collet [Mon, 2 Apr 2018 03:33:42 +0000 (20:33 -0700)]
Merge pull request #490 from kenjichanhkg/dev
added vs2017 projects
Kenji Chan [Mon, 2 Apr 2018 02:52:45 +0000 (10:52 +0800)]
added vs2017 projects
Yann Collet [Wed, 21 Mar 2018 21:53:02 +0000 (14:53 -0700)]
Merge pull request #486 from felixhandte/fix-test-makefile-clean-up
Add Dependency to Fix Parallel `make test` Runs
Yann Collet [Wed, 21 Mar 2018 21:52:53 +0000 (14:52 -0700)]
Merge pull request #487 from felixhandte/better-obsoletion-comment
Better Describe Functionality of Obsolete Streaming Functions
W. Felix Handte [Wed, 21 Mar 2018 15:48:35 +0000 (11:48 -0400)]
Also Fix a Comment
W. Felix Handte [Wed, 21 Mar 2018 15:39:41 +0000 (11:39 -0400)]
Better Describe Functionality of Obsolete Streaming Functions
W. Felix Handte [Wed, 21 Mar 2018 15:28:51 +0000 (11:28 -0400)]
Add Dependency to Fix Parallel `make test` Runs
When run with `-jN`, the `rm tmp*` can run in the middle of the `test-lz4-dict`
job, which will then fail, finding its files to have been axed. This adds a
dependency between the two.
Yann Collet [Wed, 21 Mar 2018 14:19:48 +0000 (07:19 -0700)]
added c90 test to c_standards
to catch `//` comments
test4973 [Wed, 21 Mar 2018 14:14:13 +0000 (07:14 -0700)]
added low address fuzzer tests
Yann Collet [Wed, 21 Mar 2018 14:07:24 +0000 (07:07 -0700)]
fix comment style
Yann Collet [Tue, 20 Mar 2018 00:19:25 +0000 (17:19 -0700)]
bench: introduced hidden command -S
to benchmark multiple files with separate results
Yann Collet [Mon, 19 Mar 2018 23:18:25 +0000 (16:18 -0700)]
Merge branch 'dev' of github.com:Cyan4973/lz4 into dev
Yann Collet [Mon, 19 Mar 2018 23:18:10 +0000 (16:18 -0700)]
minor man fix on clevels
Yann Collet [Mon, 19 Mar 2018 17:06:33 +0000 (10:06 -0700)]
Merge pull request #484 from lz4/fasterDict
Faster dictionary compression
Yann Collet [Mon, 19 Mar 2018 02:07:55 +0000 (19:07 -0700)]
Merge pull request #406 from felixhandte/ref-dict-table
Use the Dictionary Hash Table in Place
W. Felix Handte [Wed, 31 Jan 2018 23:11:37 +0000 (18:11 -0500)]
Remove Framebench Tool
W. Felix Handte [Wed, 14 Mar 2018 19:58:38 +0000 (15:58 -0400)]
Move LZ4_compress_fast_extState_noReset Declaration to Unstable Section
W. Felix Handte [Wed, 14 Mar 2018 19:51:59 +0000 (15:51 -0400)]
Restore the Other Old Streaming Functions in a Degraded Fashion
W. Felix Handte [Tue, 13 Mar 2018 21:47:34 +0000 (17:47 -0400)]
Switch ALLOC() to ALLOC_AND_ZERO() to Paper Over Existing Uninitialized Read
W. Felix Handte [Tue, 13 Mar 2018 21:45:09 +0000 (17:45 -0400)]
Split lz4CtxLevel into Two Fields
W. Felix Handte [Tue, 13 Mar 2018 21:35:44 +0000 (17:35 -0400)]
Another Allocation Fail Check
W. Felix Handte [Tue, 13 Mar 2018 19:42:03 +0000 (15:42 -0400)]
Restore LZ4_sizeofStreamState, We Didn't Actually Need to Delete It
W. Felix Handte [Tue, 13 Mar 2018 19:18:08 +0000 (15:18 -0400)]
Restore checkTag Cleaning
W. Felix Handte [Tue, 13 Mar 2018 19:07:19 +0000 (15:07 -0400)]
Rename Enums and Add Comment
W. Felix Handte [Mon, 12 Mar 2018 22:46:54 +0000 (18:46 -0400)]
Whitespace Fixes
W. Felix Handte [Mon, 12 Mar 2018 22:32:24 +0000 (18:32 -0400)]
Add NULL Checks
W. Felix Handte [Fri, 9 Mar 2018 17:05:31 +0000 (12:05 -0500)]
Simpler Ternary Statements
W. Felix Handte [Mon, 12 Mar 2018 22:13:24 +0000 (18:13 -0400)]
Renames and Comment Fixes
W. Felix Handte [Mon, 12 Mar 2018 20:11:55 +0000 (16:11 -0400)]
Hoist LZ4F Dictionary Setup into Helper LZ4F_applyCDict()
W. Felix Handte [Mon, 12 Mar 2018 20:11:44 +0000 (16:11 -0400)]
Minor Style Fixes
W. Felix Handte [Thu, 8 Mar 2018 19:09:06 +0000 (14:09 -0500)]
Preserve currentOffset==0 When Possible
W. Felix Handte [Fri, 9 Mar 2018 17:14:42 +0000 (12:14 -0500)]
Specialize _extState() for Clean Ctx Rather Than Calling _safeExtState()
W. Felix Handte [Thu, 8 Mar 2018 17:30:34 +0000 (12:30 -0500)]
Remove Switch In Favor of Ternary Statement
W. Felix Handte [Thu, 8 Mar 2018 17:29:45 +0000 (12:29 -0500)]
Further Avoid a dictionary==NULL Check
W. Felix Handte [Tue, 6 Mar 2018 20:53:22 +0000 (15:53 -0500)]
Optimize Dict Check Condition
W. Felix Handte [Tue, 6 Mar 2018 16:52:02 +0000 (11:52 -0500)]
Move to 4KB Cut-Off
W. Felix Handte [Wed, 14 Feb 2018 01:06:24 +0000 (17:06 -0800)]
Reset Table on Inputs Larger than 2KB
W. Felix Handte [Mon, 5 Mar 2018 16:59:22 +0000 (11:59 -0500)]
Avoid DictSmall Checks By Strategically Bumping CurrentOffset
W. Felix Handte [Sat, 17 Feb 2018 01:33:51 +0000 (17:33 -0800)]
Restore DictIssue Check
W. Felix Handte [Tue, 30 Jan 2018 20:22:29 +0000 (15:22 -0500)]
Avoid dictionary == NULL Check
W. Felix Handte [Fri, 26 Jan 2018 22:29:50 +0000 (17:29 -0500)]
Replace calloc() Calls With malloc() Where Possible