From 0fea528e3a70f8578ca6e7f15d922dab8aa9ff25 Mon Sep 17 00:00:00 2001
From: Yann Collet
Date: Wed, 5 Sep 2018 14:05:08 -0700
Subject: [PATCH] updated documentation regarding dictionary compression following suggestion from @stbrumme (#558)

Also : bumped version number, regenerated man page and html doc
---
 NEWS                     |  6 ++++++
 README.md                | 23 ++++++++++++++---------
 doc/lz4_manual.html      |  6 +++---
 doc/lz4frame_manual.html |  4 ++--
 lib/lz4.h                |  2 +-
 programs/lz4.1           | 14 +++++++++++---
 programs/lz4.1.md        | 20 +++++++++++++-------
 programs/lz4cli.c        |  2 +-
 tests/.gitignore         |  6 +++++-
 tests/Makefile           |  3 ++-
 10 files changed, 58 insertions(+), 28 deletions(-)

diff --git a/NEWS b/NEWS
index 0139e61..8ee3c92 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,9 @@
+v1.8.3
+fix : data corruption for files > 64KB at level 9 under specific conditions (#560)
+cli : new command --fast, by @jennifermliu
+build : added Haiku target, by @fbrosson
+doc : updated documentation regarding dictionary compression
+
 v1.8.2
 perf: *much* faster dictionary compression on small files, by @felixhandte
 perf: improved decompression speed and binary size, by Alexey Tourbin (@svpv)
diff --git a/README.md b/README.md
index 406792a..e64020d 100644
--- a/README.md
+++ b/README.md
@@ -2,18 +2,23 @@ LZ4 - Extremely fast compression
 ================================

 LZ4 is lossless compression algorithm,
-providing compression speed at 400 MB/s per core,
+providing compression speed > 500 MB/s per core,
 scalable with multi-cores CPU.
 It features an extremely fast decoder,
 with speed in multiple GB/s per core,
 typically reaching RAM speed limits on multi-core systems.

 Speed can be tuned dynamically, selecting an "acceleration" factor
-which trades compression ratio for more speed up.
+which trades compression ratio for faster speed.
 On the other end, a high compression derivative, LZ4_HC, is also provided,
 trading CPU time for improved compression ratio.
 All versions feature the same decompression speed.
+LZ4 is also compatible with [dictionary compression](https://github.com/facebook/zstd#the-case-for-small-data-compression),
+and can ingest any input file as dictionary,
+including those created by [Zstandard Dictionary Builder](https://github.com/facebook/zstd/blob/v1.3.5/programs/zstd.1.md#dictionary-builder).
+(note: only the final 64KB are used).
+
 LZ4 library is provided as open-source software using BSD 2-Clause license.


@@ -67,8 +72,8 @@ in single-thread mode.
 [zlib]: http://www.zlib.net/
 [Zstandard]: http://www.zstd.net/

-LZ4 is also compatible and well optimized for x32 mode,
-for which it provides some additional speed performance.
+LZ4 is also compatible and optimized for x32 mode,
+for which it provides additional speed performance.


 Installation
@@ -76,7 +81,7 @@ Installation

 ```
 make
-make install # this command may require root access
+make install # this command may require root permissions
 ```

 LZ4's `Makefile` supports standard [Makefile conventions],
@@ -94,10 +99,10 @@ Documentation

 The raw LZ4 block compression format is detailed within [lz4_Block_format].

-To compress an arbitrarily long file or data stream, multiple blocks are required.
-Organizing these blocks and providing a common header format to handle their content
-is the purpose of the Frame format, defined into [lz4_Frame_format].
-Interoperable versions of LZ4 must respect this frame format.
+Arbitrarily long files or data streams are compressed using multiple blocks,
+for streaming requirements. These blocks are organized into a frame,
+defined into [lz4_Frame_format].
+Interoperable versions of LZ4 must also respect the frame format.

 [lz4_Block_format]: doc/lz4_Block_format.md
 [lz4_Frame_format]: doc/lz4_Frame_format.md
diff --git a/doc/lz4_manual.html b/doc/lz4_manual.html
index 3fc71e4..c7c5763 100644
--- a/doc/lz4_manual.html
+++ b/doc/lz4_manual.html
@@ -1,10 +1,10 @@
-1.8.2 Manual
+1.8.3 Manual
-1.8.2 Manual
+1.8.3 Manual
 Contents
@@ -179,7 +179,7 @@ int LZ4_freeStream (LZ4_stream_t* streamPtr);
  'dst' buffer must be already allocated.
  If dstCapacity >= LZ4_compressBound(srcSize), compression is guaranteed to succeed, and runs faster.

- Important : The previous 64KB of compressed data is assumed to remain present and unmodified in memory!
+ Important : The previous 64KB of source data is assumed to remain present and unmodified in memory!

  Special 1 : When input is a double-buffer, they can have any size, including < 64 KB.
  Make sure that buffers are separated by at least one byte.
diff --git a/doc/lz4frame_manual.html b/doc/lz4frame_manual.html
index 53ea7eb..fb8e0ce 100644
--- a/doc/lz4frame_manual.html
+++ b/doc/lz4frame_manual.html
@@ -1,10 +1,10 @@
-1.8.2 Manual
+1.8.3 Manual
-1.8.2 Manual
+1.8.3 Manual
 Contents
diff --git a/lib/lz4.h b/lib/lz4.h
index a0eddce..491b67a 100644
--- a/lib/lz4.h
+++ b/lib/lz4.h
@@ -93,7 +93,7 @@ extern "C" {
 /*------ Version ------*/
 #define LZ4_VERSION_MAJOR 1 /* for breaking interface changes */
 #define LZ4_VERSION_MINOR 8 /* for new (non-breaking) interface capabilities */
-#define LZ4_VERSION_RELEASE 2 /* for tweaks, bug-fixes, or development */
+#define LZ4_VERSION_RELEASE 3 /* for tweaks, bug-fixes, or development */

 #define LZ4_VERSION_NUMBER (LZ4_VERSION_MAJOR *100*100 + LZ4_VERSION_MINOR *100 + LZ4_VERSION_RELEASE)

diff --git a/programs/lz4.1 b/programs/lz4.1
index e0f6a81..f35e29d 100644
--- a/programs/lz4.1
+++ b/programs/lz4.1
@@ -1,5 +1,5 @@
 .
-.TH "LZ4" "1" "2018-01-13" "lz4 1.8.1" "User Commands"
+.TH "LZ4" "1" "September 2018" "lz4 1.8.3" "User Commands"
 .
 .SH "NAME"
 \fBlz4\fR \- lz4, unlz4, lz4cat \- Compress or decompress \.lz4 files
@@ -115,7 +115,11 @@ Benchmark mode, using \fB#\fR compression level\.
 .
 .TP
 \fB\-#\fR
-Compression level, with # being any value from 1 to 16\. Higher values trade compression speed for compression ratio\. Values above 16 are considered the same as 16\. Recommended values are 1 for fast compression (default), and 9 for high compression\. Speed/compression trade\-off will vary depending on data to compress\. Decompression speed remains fast at all settings\.
+Compression level, with # being any value from 1 to 12\. Higher values trade compression speed for compression ratio\. Values above 12 are considered the same as 12\. Recommended values are 1 for fast compression (default), and 9 for high compression\. Speed/compression trade\-off will vary depending on data to compress\. Decompression speed remains fast at all settings\.
+.
+.TP
+\fB\-D dictionaryName\fR
+Compress, decompress or benchmark using dictionary \fIdictionaryName\fR\. Compression and decompression must use the same dictionary to be compatible\. Using a different dictionary during decompression will either abort due to decompression error, or generate a checksum error\.
 .
 .TP
 \fB\-f\fR \fB\-\-[no\-]force\fR
@@ -151,6 +155,10 @@ Block size [4\-7](default : 7)
 Block Dependency (improves compression ratio on small blocks)
 .
 .TP
+\fB\-\-fast[=#]\fR
+switch to ultra\-fast compression levels\. If \fB=#\fR is not present, it defaults to \fB1\fR\. The higher the value, the faster the compression speed, at the cost of some compression ratio\. This setting overwrites compression level if one was set previously\. Similarly, if a compression level is set after \fB\-\-fast\fR, it overrides it\.
+.
+.TP
 \fB\-\-[no\-]frame\-crc\fR
 Select frame checksum (default:enabled)
 .
@@ -214,7 +222,7 @@ Benchmark multiple compression levels, from b# to e# (included)
 .
 .TP
 \fB\-i#\fR
-Minimum evaluation in seconds [1\-9] (default : 3)
+Minimum evaluation time in seconds [1\-9] (default : 3)
 .
 .SH "BUGS"
 Report bugs at: https://github\.com/lz4/lz4/issues
diff --git a/programs/lz4.1.md b/programs/lz4.1.md
index d4eaf8a..12b8e29 100644
--- a/programs/lz4.1.md
+++ b/programs/lz4.1.md
@@ -125,6 +125,19 @@ only the latest one will be applied.
   Speed/compression trade-off will vary depending on data to compress.
   Decompression speed remains fast at all settings.

+* `--fast[=#]`:
+  switch to ultra-fast compression levels.
+  The higher the value, the faster the compression speed, at the cost of some compression ratio.
+  If `=#` is not present, it defaults to `1`.
+  This setting overrides compression level if one was set previously.
+  Similarly, if a compression level is set after `--fast`, it overrides it.
+
+* `-D dictionaryName`:
+  Compress, decompress or benchmark using dictionary _dictionaryName_.
+  Compression and decompression must use the same dictionary to be compatible.
+  Using a different dictionary during decompression will either
+  abort due to decompression error, or generate a checksum error.
+
 * `-f` `--[no-]force`:
   This option has several effects:

@@ -156,13 +169,6 @@ only the latest one will be applied.
 * `-BD`:
   Block Dependency (improves compression ratio on small blocks)

-* `--fast[=#]`:
-  switch to ultra-fast compression levels.
-  If `=#` is not present, it defaults to `1`.
-  The higher the value, the faster the compression speed, at the cost of some compression ratio.
-  This setting overwrites compression level if one was set previously.
-  Similarly, if a compression level is set after `--fast`, it overrides it.
-
 * `--[no-]frame-crc`:
   Select frame checksum (default:enabled)

diff --git a/programs/lz4cli.c b/programs/lz4cli.c
index dc60b00..26a8089 100644
--- a/programs/lz4cli.c
+++ b/programs/lz4cli.c
@@ -110,7 +110,7 @@ static int usage(const char* exeName)
     DISPLAY( " -9 : High compression \n");
     DISPLAY( " -d : decompression (default for %s extension)\n", LZ4_EXTENSION);
     DISPLAY( " -z : force compression \n");
-    DISPLAY( " -D FILE: use dictionary in FILE \n");
+    DISPLAY( " -D FILE: use FILE as dictionary \n");
     DISPLAY( " -f : overwrite output without prompting \n");
     DISPLAY( " -k : preserve source files(s) (default) \n");
     DISPLAY( "--rm : remove source file(s) after successful de/compression \n");
diff --git a/tests/.gitignore b/tests/.gitignore
index 36dff42..9aa42a0 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -1,5 +1,5 @@

-# test build artefacts
+# build artefacts
 datagen
 frametest
 frametest32
@@ -8,8 +8,12 @@ fullbench32
 fuzzer
 fuzzer32
 fasttest
+roundTripTest
 checkTag

 # test artefacts
 tmp*
 versionsTest
+
+# local tests
+afl
diff --git a/tests/Makefile b/tests/Makefile
index 81033b5..f270c46 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -114,7 +114,8 @@ clean:
         fullbench$(EXT) fullbench32$(EXT) \
         fuzzer$(EXT) fuzzer32$(EXT) \
         frametest$(EXT) frametest32$(EXT) \
-        fasttest$(EXT) datagen$(EXT) checkTag$(EXT)
+        fasttest$(EXT) roundTripTest$(EXT) \
+        datagen$(EXT) checkTag$(EXT)
 	@rm -fR $(TESTDIR)
 	@echo Cleaning completed

-- 
2.7.4
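
Reviewer note on the README hunk: the added text says any file can serve as a dictionary but "only the final 64KB are used". The sketch below illustrates one way to read that note; it is not code from this patch or from the LZ4 library. `DICT_WINDOW_BYTES`, `dict_effective_size` and `dict_effective_start` are hypothetical names introduced here, and the 64KB figure is taken from the README note alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant for this sketch: the 64KB limit quoted in the README note. */
#define DICT_WINDOW_BYTES ((size_t)64 * 1024)

/* Hypothetical helper, not an LZ4 API call.
 * Returns how many trailing bytes of a dictSize-byte dictionary
 * would be used if only the final 64KB count. */
static size_t dict_effective_size(size_t dictSize)
{
    return (dictSize > DICT_WINDOW_BYTES) ? DICT_WINDOW_BYTES : dictSize;
}

/* Hypothetical helper, not an LZ4 API call.
 * Returns a pointer to the first byte of the dictionary that would be used,
 * i.e. the start of its final 64KB (or the whole buffer when it is smaller). */
static const char* dict_effective_start(const char* dict, size_t dictSize)
{
    assert(dict != NULL || dictSize == 0);
    return dict + (dictSize - dict_effective_size(dictSize));
}
```

Under these assumptions, a 100000-byte dictionary file would contribute its last 65536 bytes, starting at offset 34464, while a 100-byte file would be used in full. Content that matters most for compression is therefore best placed at the end of a large dictionary file.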