1 /* xdelta 3 - delta compression tools and library
2 * Copyright (C) 2001, 2003, 2004, 2005, 2006, 2007,
3 * 2008, 2009, 2010, 2011, 2012, 2013. Joshua P. MacDonald
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
19 -------------------------------------------------------------------
23 The goal of this library is to to implement both the (stand-alone)
24 data-compression and delta-compression aspects of VCDIFF encoding, and
25 to support a programming interface that works like Zlib
26 (http://www.gzip.org/zlib.html). See RFC3284: The VCDIFF Generic
27 Differencing and Compression Data Format.
29 VCDIFF is a unified encoding that combines data-compression and
30 delta-encoding ("differencing").
32 VCDIFF has a detailed byte-code instruction set with many features.
33 The instruction format supports an immediate size operand for small
34 COPYs and ADDs (e.g., under 18 bytes). There are also instruction
35 "modes", which are used to compress COPY addresses by using two
36 address caches. An instruction mode refers to slots in the NEAR
37 and SAME caches for recent addresses. NEAR remembers the
38 previous 4 (by default) COPY addresses, and SAME catches
39 frequent re-uses of the same address using a 3-way (by default)
40 256-entry associative cache of [ADDR mod 256], the encoded byte.
41 A hit in the NEAR/SAME cache requires 0/1 ADDR bytes.
43 VCDIFF has a default instruction table, but an alternate
44 instruction tables may themselves be be delta-compressed and
45 included in the encoding header. This allows even more freedom.
46 There are 9 instruction modes in the default code table, 4 near, 3
47 same, VCD_SELF (absolute encoding) and VCD_HERE (relative to the
50 ----------------------------------------------------------------------
54 Aside from the details of encoding and decoding, there are a bunch
57 1. STRING-MATCH. A two-level fingerprinting approach is used. A
58 single loop computes the two checksums -- small and large -- at
59 successive offsets in the TARGET file. The large checksum is more
60 accurate and is used to discover SOURCE matches, which are
61 potentially very long. The small checksum is used to discover
62 copies within the TARGET. Small matching, which is more expensive,
63 usually dominates the large STRING-MATCH costs in this code - the
64 more exhaustive the search, the better the results. Either of the
65 two string-matching mechanisms may be disabled.
67 2. INSTRUCTION SELECTION. The IOPT buffer here represents a queue
68 used to store overlapping copy instructions. There are two possible
69 optimizations that go beyond a greedy search. Both of these fall
70 into the category of "non-greedy matching" optimizations.
72 The first optimization stems from backward SOURCE-COPY matching.
73 When a new SOURCE-COPY instruction covers a previous instruction in
74 the target completely, it is erased from the queue. Randal Burns
75 originally analyzed these algorithms and did a lot of related work
76 (\cite the 1.5-pass algorithm).
78 The second optimization comes by the encoding of common very-small
79 COPY and ADD instructions, for which there are special DOUBLE-code
80 instructions, which code two instructions in a single byte.
82 The cost of bad instruction-selection overhead is relatively high
83 for data-compression, relative to delta-compression, so this second
84 optimization is fairly important. With "lazy" matching (the name
85 used in Zlib for a similar optimization), the string-match
86 algorithm searches after a match for potential overlapping copy
87 instructions. In Xdelta and by default, VCDIFF, the minimum match
88 size is 4 bytes, whereas Zlib searches with a 3-byte minimum. This
89 feature, combined with double instructions, provides a nice
90 challenge. Search in this file for "black magic", a heuristic.
92 3. STREAM ALIGNMENT. Stream alignment is needed to compress large
93 inputs in constant space. See xd3_srcwin_move_point().
95 4. WINDOW SELECTION. When the IOPT buffer flushes, in the first call
96 to xd3_iopt_finish_encoding containing any kind of copy instruction,
97 the parameters of the source window must be decided: the offset into
98 the source and the length of the window. Since the IOPT buffer is
99 finite, the program may be forced to fix these values before knowing
100 the best offset/length.
102 5. SECONDARY COMPRESSION. VCDIFF supports a secondary encoding to
103 be applied to the individual sections of the data format, which are
104 ADDRess, INSTruction, and DATA. Several secondary compressor
105 variations are implemented here, although none is standardized yet.
107 One is an adaptive huffman algorithm -- the FGK algorithm (Faller,
108 Gallager, and Knuth, 1985). This compressor is extremely slow.
110 The other is a simple static Huffman routine, which is the base
111 case of a semi-adaptive scheme published by D.J. Wheeler and first
112 widely used in bzip2 (by Julian Seward). This is a very
113 interesting algorithm, originally published in nearly cryptic form
114 by D.J. Wheeler. !!!NOTE!!! Because these are not standardized,
115 secondary compression remains off by default.
116 ftp://ftp.cl.cam.ac.uk/users/djw3/bred3.{c,ps}
117 --------------------------------------------------------------------
123 For user convenience, it is essential to recognize Gzip-compressed
124 files and automatically Gzip-decompress them prior to
125 delta-compression (or else no delta-compression will be achieved
126 unless the user manually decompresses the inputs). The compressed
127 represention competes with Xdelta, and this must be hidden from the
128 command-line user interface. The Xdelta-1.x encoding was simple, not
129 compressed itself, so Xdelta-1.x uses Zlib internally to compress the
132 This implementation supports external compression, which implements
133 the necessary fork() and pipe() mechanics. There is a tricky step
134 involved to support automatic detection of a compressed input in a
135 non-seekable input. First you read a bit of the input to detect
136 magic headers. When a compressed format is recognized, exec() the
137 external compression program and create a second child process to
138 copy the original input stream. [Footnote: There is a difficulty
139 related to using Gzip externally. It is not possible to decompress
140 and recompress a Gzip file transparently. If FILE.GZ had a
141 cryptographic signature, then, after: (1) Gzip-decompression, (2)
142 Xdelta-encoding, (3) Gzip-compression the signature could be
143 broken. The only way to solve this problem is to guess at Gzip's
144 compression level or control it by other means. I recommend that
145 specific implementations of any compression scheme store
146 information needed to exactly re-compress the input, that way
147 external compression is transparent - however, this won't happen
148 here until it has stabilized.]
150 2. APPLICATION-HEADER
152 This feature was introduced in RFC3284. It allows any application
153 to include a header within the VCDIFF file format. This allows
154 general inter-application data exchange with support for
155 application-specific extensions to communicate metadata.
159 An optional checksum value is included with each window, which can
160 be used to validate the final result. This verifies the correct source
161 file was used for decompression as well as the obvious advantage:
162 checking the implementation (and underlying) correctness.
166 The code makes efforts to avoid copying data more than necessary.
167 The code delays many initialization tasks until the first use, it
168 optimizes for identical (perfectly matching) inputs. It does not
169 compute any checksums until the first lookup misses. Memory usage
170 is reduced. String-matching is templatized (by slightly gross use
171 of CPP) to hard-code alternative compile-time defaults. The code
172 has few outside dependencies.
173 ----------------------------------------------------------------------
175 The default rfc3284 instruction table:
176 (see RFC for the explanation)
178 TYPE SIZE MODE TYPE SIZE MODE INDEX
179 --------------------------------------------------------------------
180 1. Run 0 0 Noop 0 0 0
181 2. Add 0, [1,17] 0 Noop 0 0 [1,18]
182 3. Copy 0, [4,18] 0 Noop 0 0 [19,34]
183 4. Copy 0, [4,18] 1 Noop 0 0 [35,50]
184 5. Copy 0, [4,18] 2 Noop 0 0 [51,66]
185 6. Copy 0, [4,18] 3 Noop 0 0 [67,82]
186 7. Copy 0, [4,18] 4 Noop 0 0 [83,98]
187 8. Copy 0, [4,18] 5 Noop 0 0 [99,114]
188 9. Copy 0, [4,18] 6 Noop 0 0 [115,130]
189 10. Copy 0, [4,18] 7 Noop 0 0 [131,146]
190 11. Copy 0, [4,18] 8 Noop 0 0 [147,162]
191 12. Add [1,4] 0 Copy [4,6] 0 [163,174]
192 13. Add [1,4] 0 Copy [4,6] 1 [175,186]
193 14. Add [1,4] 0 Copy [4,6] 2 [187,198]
194 15. Add [1,4] 0 Copy [4,6] 3 [199,210]
195 16. Add [1,4] 0 Copy [4,6] 4 [211,222]
196 17. Add [1,4] 0 Copy [4,6] 5 [223,234]
197 18. Add [1,4] 0 Copy 4 6 [235,238]
198 19. Add [1,4] 0 Copy 4 7 [239,242]
199 20. Add [1,4] 0 Copy 4 8 [243,246]
200 21. Copy 4 [0,8] Add 1 0 [247,255]
201 --------------------------------------------------------------------
203 Reading the source: Overview
205 This file includes itself in several passes to macro-expand certain
206 sections with variable forms. Just read ahead, there's only a
207 little confusion. I know this sounds ugly, but hard-coding some of
208 the string-matching parameters results in a 10-15% increase in
209 string-match performance. The only time this hurts is when you have
210 unbalanced #if/endifs.
212 A single compilation unit tames the Makefile. In short, this is to
213 allow the above-described hack without an explodingMakefile. The
214 single compilation unit includes the core library features,
215 configurable string-match templates, optional main() command-line
216 tool, misc optional features, and a regression test. Features are
217 controled with CPP #defines, see Makefile.am.
219 The initial __XDELTA3_C_HEADER_PASS__ starts first, the _INLINE_ and
220 _TEMPLATE_ sections follow. Easy stuff first, hard stuff last.
222 Optional features include:
224 xdelta3-main.h The command-line interface, external compression
225 support, POSIX-specific, info & VCDIFF-debug tools.
226 xdelta3-second.h The common secondary compression routines.
227 xdelta3-decoder.h All decoding routines.
228 xdelta3-djw.h The semi-adaptive huffman secondary encoder.
229 xdelta3-fgk.h The adaptive huffman secondary encoder.
230 xdelta3-test.h The unit test covers major algorithms,
231 encoding and decoding. There are single-bit
232 error decoding tests. There are 32/64-bit file size
233 boundary tests. There are command-line tests.
234 There are compression tests. There are external
235 compression tests. There are string-matching tests.
236 There should be more tests...
238 Additional headers include:
240 xdelta3.h The public header file.
241 xdelta3-cfgs.h The default settings for default, built-in
242 encoders. These are hard-coded at
243 compile-time. There is also a single
244 soft-coded string matcher for experimenting
245 with arbitrary values.
246 xdelta3-list.h A cyclic list template
248 Misc little debug utilities:
250 badcopy.c Randomly modifies an input file based on two
251 parameters: (1) the probability that a byte in
252 the file is replaced with a pseudo-random value,
253 and (2) the mean change size. Changes are
254 generated using an expoential distribution
255 which approximates the expected error_prob
257 --------------------------------------------------------------------
259 This file itself is unusually large. I hope to defend this layout
260 with lots of comments. Everything in this file is related to
261 encoding and decoding. I like it all together - the template stuff
264 #ifndef __XDELTA3_C_HEADER_PASS__
265 #define __XDELTA3_C_HEADER_PASS__
269 /***********************************************************************
271 ***********************************************************************/
273 #ifndef XD3_MAIN /* the main application */
278 #define VCDIFF_TOOLS XD3_MAIN
281 #ifndef SECONDARY_FGK /* one from the algorithm preservation department: */
282 #define SECONDARY_FGK 0 /* adaptive Huffman routines */
285 #ifndef SECONDARY_DJW /* semi-adaptive/static Huffman for the eventual */
286 #define SECONDARY_DJW 0 /* standardization, off by default until such time. */
289 #ifndef SECONDARY_LZMA
291 #define SECONDARY_LZMA 1
293 #define SECONDARY_LZMA 0
298 #define IF_ENCODER(x) x
300 #define IF_ENCODER(x)
303 /***********************************************************************/
305 /* header indicator bits */
306 #define VCD_SECONDARY (1U << 0) /* uses secondary compressor */
307 #define VCD_CODETABLE (1U << 1) /* supplies code table data */
308 #define VCD_APPHEADER (1U << 2) /* supplies application data */
309 #define VCD_INVHDR (~0x7U)
311 /* window indicator bits */
312 #define VCD_SOURCE (1U << 0) /* copy window in source file */
313 #define VCD_TARGET (1U << 1) /* copy window in target file */
314 #define VCD_ADLER32 (1U << 2) /* has adler32 checksum */
315 #define VCD_INVWIN (~0x7U)
317 #define VCD_SRCORTGT (VCD_SOURCE | VCD_TARGET)
319 /* delta indicator bits */
320 #define VCD_DATACOMP (1U << 0)
321 #define VCD_INSTCOMP (1U << 1)
322 #define VCD_ADDRCOMP (1U << 2)
323 #define VCD_INVDEL (~0x7U)
328 VCD_FGK_ID = 16 /* Note: these are not standard IANA-allocated IDs! */
334 /* Note: SEC_COUNT_FREQS Not implemented (to eliminate 1st Huffman pass) */
335 SEC_COUNT_FREQS = (1 << 0)
336 } xd3_secondary_flags;
339 DATA_SECTION, /* These indicate which section to the secondary
341 INST_SECTION, /* The header section is not compressed, therefore not
346 typedef unsigned int xd3_rtype;
348 /***********************************************************************/
350 #include "xdelta3-list.h"
352 XD3_MAKELIST(xd3_rlist, xd3_rinst, link);
354 /***********************************************************************/
356 #define SECONDARY_MIN_SAVINGS 2 /* Secondary compression has to save
357 at least this many bytes. */
358 #define SECONDARY_MIN_INPUT 10 /* Secondary compression needs at
359 least this many bytes. */
361 #define VCDIFF_MAGIC1 0xd6 /* 1st file byte */
362 #define VCDIFF_MAGIC2 0xc3 /* 2nd file byte */
363 #define VCDIFF_MAGIC3 0xc4 /* 3rd file byte */
364 #define VCDIFF_VERSION 0x00 /* 4th file byte */
366 #define VCD_SELF 0 /* 1st address mode */
367 #define VCD_HERE 1 /* 2nd address mode */
369 #define CODE_TABLE_STRING_SIZE (6 * 256) /* Should fit a code table string. */
370 #define CODE_TABLE_VCDIFF_SIZE (6 * 256) /* Should fit a compressed code
373 #define SECONDARY_ANY (SECONDARY_DJW || SECONDARY_FGK || SECONDARY_LZMA)
375 #define ALPHABET_SIZE 256 /* Used in test code--size of the secondary
376 * compressor alphabet. */
378 #define HASH_PERMUTE 1 /* The input is permuted by random nums */
379 #define ADLER_LARGE_CKSUM 1 /* Adler checksum vs. RK checksum */
381 #define HASH_CKOFFSET 1U /* Table entries distinguish "no-entry" from
382 * offset 0 using this offset. */
384 #define MIN_SMALL_LOOK 2U /* Match-optimization stuff. */
385 #define MIN_LARGE_LOOK 2U
386 #define MIN_MATCH_OFFSET 1U
387 #define MAX_MATCH_SPLIT 18U /* VCDIFF code table: 18 is the default limit
388 * for direct-coded ADD sizes */
390 #define LEAST_MATCH_INCR 0 /* The least number of bytes an overlapping
391 * match must beat the preceding match by. This
392 * is a bias for the lazy match optimization. A
393 * non-zero value means that an adjacent match
394 * has to be better by more than the step
395 * between them. 0. */
397 #define MIN_MATCH 4U /* VCDIFF code table: MIN_MATCH=4 */
398 #define MIN_ADD 1U /* 1 */
399 #define MIN_RUN 8U /* The shortest run, if it is shorter than this
400 * an immediate add/copy will be just as good.
401 * ADD1/COPY6 = 1I+1D+1A bytes, RUN18 =
404 #define MAX_MODES 9 /* Maximum number of nodes used for
405 * compression--does not limit decompression. */
407 #define ENC_SECTS 4 /* Number of separate output sections. */
409 #define HDR_TAIL(s) ((s)->enc_tails[0])
410 #define DATA_TAIL(s) ((s)->enc_tails[1])
411 #define INST_TAIL(s) ((s)->enc_tails[2])
412 #define ADDR_TAIL(s) ((s)->enc_tails[3])
414 #define HDR_HEAD(s) ((s)->enc_heads[0])
415 #define DATA_HEAD(s) ((s)->enc_heads[1])
416 #define INST_HEAD(s) ((s)->enc_heads[2])
417 #define ADDR_HEAD(s) ((s)->enc_heads[3])
419 #define TOTAL_MODES(x) (2+(x)->acache.s_same+(x)->acache.s_near)
421 /* Template instances. */
423 #define IF_BUILD_SLOW(x) x
425 #define IF_BUILD_SLOW(x)
428 #define IF_BUILD_FAST(x) x
430 #define IF_BUILD_FAST(x)
433 #define IF_BUILD_FASTER(x) x
435 #define IF_BUILD_FASTER(x)
437 #if XD3_BUILD_FASTEST
438 #define IF_BUILD_FASTEST(x) x
440 #define IF_BUILD_FASTEST(x)
443 #define IF_BUILD_SOFT(x) x
445 #define IF_BUILD_SOFT(x)
447 #if XD3_BUILD_DEFAULT
448 #define IF_BUILD_DEFAULT(x) x
450 #define IF_BUILD_DEFAULT(x)
453 /* Consume N bytes of input, only used by the decoder. */
454 #define DECODE_INPUT(n) \
456 stream->total_in += (xoff_t) (n); \
457 stream->avail_in -= (n); \
458 stream->next_in += (n); \
461 /* Update the run-length state */
462 #define NEXTRUN(c) do { if ((c) == run_c) { run_l += 1; } \
463 else { run_c = (c); run_l = 1; } } while (0)
465 /* This CPP-conditional stuff can be cleaned up... */
467 #define IF_REGRESSION(x) x
469 #define IF_REGRESSION(x)
472 /***********************************************************************/
475 static void* xd3_alloc0 (xd3_stream *stream,
480 static xd3_output* xd3_alloc_output (xd3_stream *stream,
481 xd3_output *old_output);
483 static int xd3_alloc_iopt (xd3_stream *stream, usize_t elts);
485 static void xd3_free_output (xd3_stream *stream,
488 static int xd3_emit_byte (xd3_stream *stream,
489 xd3_output **outputp,
492 static int xd3_emit_bytes (xd3_stream *stream,
493 xd3_output **outputp,
497 static int xd3_emit_double (xd3_stream *stream, xd3_rinst *first,
498 xd3_rinst *second, usize_t code);
499 static int xd3_emit_single (xd3_stream *stream, xd3_rinst *single,
502 static usize_t xd3_sizeof_output (xd3_output *output);
503 static void xd3_encode_reset (xd3_stream *stream);
505 static int xd3_source_match_setup (xd3_stream *stream, xoff_t srcpos);
506 static int xd3_source_extend_match (xd3_stream *stream);
507 static int xd3_srcwin_setup (xd3_stream *stream);
508 static usize_t xd3_iopt_last_matched (xd3_stream *stream);
509 static int xd3_emit_uint32_t (xd3_stream *stream, xd3_output **output,
512 static usize_t xd3_smatch (xd3_stream *stream,
515 usize_t *match_offset);
516 static int xd3_string_match_init (xd3_stream *stream);
517 static uint32_t xd3_scksum (uint32_t *state, const uint8_t *seg,
519 static usize_t xd3_comprun (const uint8_t *seg, usize_t slook, uint8_t *run_cp);
520 static int xd3_srcwin_move_point (xd3_stream *stream,
521 usize_t *next_move_point);
523 static int xd3_emit_run (xd3_stream *stream, usize_t pos,
524 usize_t size, uint8_t *run_c);
525 static usize_t xd3_checksum_hash (const xd3_hash_cfg *cfg,
526 const usize_t cksum);
527 static xoff_t xd3_source_cksum_offset(xd3_stream *stream, usize_t low);
528 static void xd3_scksum_insert (xd3_stream *stream,
535 static void xd3_verify_run_state (xd3_stream *stream,
539 static void xd3_verify_large_state (xd3_stream *stream,
542 static void xd3_verify_small_state (xd3_stream *stream,
546 #endif /* XD3_DEBUG */
547 #endif /* XD3_ENCODER */
549 static int xd3_decode_allocate (xd3_stream *stream, usize_t size,
550 uint8_t **copied1, usize_t *alloc1);
552 static void* xd3_alloc (xd3_stream *stream, usize_t elts, usize_t size);
553 static void xd3_free (xd3_stream *stream, void *ptr);
555 static int xd3_read_uint32_t (xd3_stream *stream, const uint8_t **inpp,
556 const uint8_t *max, uint32_t *valp);
559 static int xd3_selftest (void);
562 /***********************************************************************/
564 #define UINT32_OFLOW_MASK 0xfe000000U
565 #define UINT64_OFLOW_MASK 0xfe00000000000000ULL
567 #if SIZEOF_USIZE_T == 4
568 #define USIZE_T_MAX UINT32_MAX
569 #define xd3_decode_size xd3_decode_uint32_t
570 #define xd3_emit_size xd3_emit_uint32_t
571 #define xd3_sizeof_size xd3_sizeof_uint32_t
572 #define xd3_read_size xd3_read_uint32_t
573 #elif SIZEOF_USIZE_T == 8
574 #define USIZE_T_MAX UINT64_MAX
575 #define xd3_decode_size xd3_decode_uint64_t
576 #define xd3_emit_size xd3_emit_uint64_t
577 #define xd3_sizeof_size xd3_sizeof_uint64_t
578 #define xd3_read_size xd3_read_uint64_t
581 #if SIZEOF_XOFF_T == 4
582 #define XOFF_T_MAX UINT32_MAX
583 #define xd3_decode_offset xd3_decode_uint32_t
584 #define xd3_emit_offset xd3_emit_uint32_t
585 #elif SIZEOF_XOFF_T == 8
586 #define XOFF_T_MAX UINT64_MAX
587 #define xd3_decode_offset xd3_decode_uint64_t
588 #define xd3_emit_offset xd3_emit_uint64_t
591 #define USIZE_T_OVERFLOW(a,b) ((USIZE_T_MAX - (usize_t) (a)) < (usize_t) (b))
592 #define XOFF_T_OVERFLOW(a,b) ((XOFF_T_MAX - (xoff_t) (a)) < (xoff_t) (b))
594 const char* xd3_strerror (int ret)
598 case XD3_INPUT: return "XD3_INPUT";
599 case XD3_OUTPUT: return "XD3_OUTPUT";
600 case XD3_GETSRCBLK: return "XD3_GETSRCBLK";
601 case XD3_GOTHEADER: return "XD3_GOTHEADER";
602 case XD3_WINSTART: return "XD3_WINSTART";
603 case XD3_WINFINISH: return "XD3_WINFINISH";
604 case XD3_TOOFARBACK: return "XD3_TOOFARBACK";
605 case XD3_INTERNAL: return "XD3_INTERNAL";
606 case XD3_INVALID: return "XD3_INVALID";
607 case XD3_INVALID_INPUT: return "XD3_INVALID_INPUT";
608 case XD3_NOSECOND: return "XD3_NOSECOND";
609 case XD3_UNIMPLEMENTED: return "XD3_UNIMPLEMENTED";
614 /***********************************************************************/
616 #define xd3_sec_data(s) ((s)->sec_stream_d)
617 #define xd3_sec_inst(s) ((s)->sec_stream_i)
618 #define xd3_sec_addr(s) ((s)->sec_stream_a)
624 xd3_secondary_flags flags;
626 /* xd3_sec_stream is opaque to the generic code */
627 xd3_sec_stream* (*alloc) (xd3_stream *stream);
628 void (*destroy) (xd3_stream *stream,
629 xd3_sec_stream *sec);
630 int (*init) (xd3_stream *stream,
631 xd3_sec_stream *sec_stream,
633 int (*decode) (xd3_stream *stream,
634 xd3_sec_stream *sec_stream,
635 const uint8_t **input,
636 const uint8_t *input_end,
638 const uint8_t *output_end);
640 int (*encode) (xd3_stream *stream,
641 xd3_sec_stream *sec_stream,
648 #define BIT_STATE_ENCODE_INIT { 0, 1 }
649 #define BIT_STATE_DECODE_INIT { 0, 0x100 }
651 typedef struct _bit_state bit_state;
658 #if SECONDARY_ANY == 0
665 xd3_decode_secondary (xd3_stream *stream,
667 xd3_sec_stream **sec_streamp);
670 xd3_encode_secondary (xd3_stream *stream,
673 xd3_sec_stream **sec_streamp,
677 #endif /* SECONDARY_ANY */
680 extern const xd3_sec_type fgk_sec_type;
682 #define FGK_CASE(s) \
683 s->sec_type = & fgk_sec_type; \
687 #define FGK_CASE(s) \
688 s->msg = "unavailable secondary compressor: FGK Adaptive Huffman"; \
693 extern const xd3_sec_type djw_sec_type;
695 #define DJW_CASE(s) \
696 s->sec_type = & djw_sec_type; \
700 #define DJW_CASE(s) \
701 s->msg = "unavailable secondary compressor: DJW Static Huffman"; \
706 extern const xd3_sec_type lzma_sec_type;
708 #define LZMA_CASE(s) \
709 s->sec_type = & lzma_sec_type; \
713 #define LZMA_CASE(s) \
714 s->msg = "unavailable secondary compressor: LZMA"; \
718 /***********************************************************************/
720 #include "xdelta3-hash.h"
722 /* Process template passes - this includes xdelta3.c several times. */
723 #define __XDELTA3_C_TEMPLATE_PASS__
724 #include "xdelta3-cfgs.h"
725 #undef __XDELTA3_C_TEMPLATE_PASS__
727 /* Process the inline pass. */
728 #define __XDELTA3_C_INLINE_PASS__
730 #undef __XDELTA3_C_INLINE_PASS__
732 /* Secondary compression */
734 #include "xdelta3-second.h"
738 #include "xdelta3-fgk.h"
739 const xd3_sec_type fgk_sec_type =
742 "FGK Adaptive Huffman",
744 (xd3_sec_stream* (*)(xd3_stream*)) fgk_alloc,
745 (void (*)(xd3_stream*, xd3_sec_stream*)) fgk_destroy,
746 (int (*)(xd3_stream*, xd3_sec_stream*, int)) fgk_init,
747 (int (*)(xd3_stream*, xd3_sec_stream*, const uint8_t**, const uint8_t*,
748 uint8_t**, const uint8_t*)) xd3_decode_fgk,
749 IF_ENCODER((int (*)(xd3_stream*, xd3_sec_stream*, xd3_output*,
750 xd3_output*, xd3_sec_cfg*)) xd3_encode_fgk)
755 #include "xdelta3-djw.h"
756 const xd3_sec_type djw_sec_type =
761 (xd3_sec_stream* (*)(xd3_stream*)) djw_alloc,
762 (void (*)(xd3_stream*, xd3_sec_stream*)) djw_destroy,
763 (int (*)(xd3_stream*, xd3_sec_stream*, int)) djw_init,
764 (int (*)(xd3_stream*, xd3_sec_stream*, const uint8_t**, const uint8_t*,
765 uint8_t**, const uint8_t*)) xd3_decode_huff,
766 IF_ENCODER((int (*)(xd3_stream*, xd3_sec_stream*, xd3_output*,
767 xd3_output*, xd3_sec_cfg*)) xd3_encode_huff)
772 #include "xdelta3-lzma.h"
773 const xd3_sec_type lzma_sec_type =
778 (xd3_sec_stream* (*)(xd3_stream*)) xd3_lzma_alloc,
779 (void (*)(xd3_stream*, xd3_sec_stream*)) xd3_lzma_destroy,
780 (int (*)(xd3_stream*, xd3_sec_stream*, int)) xd3_lzma_init,
781 (int (*)(xd3_stream*, xd3_sec_stream*, const uint8_t**, const uint8_t*,
782 uint8_t**, const uint8_t*)) xd3_decode_lzma,
783 IF_ENCODER((int (*)(xd3_stream*, xd3_sec_stream*, xd3_output*,
784 xd3_output*, xd3_sec_cfg*)) xd3_encode_lzma)
788 #if XD3_MAIN || PYTHON_MODULE || SWIG_MODULE || NOT_MAIN
789 #include "xdelta3-main.h"
793 #include "xdelta3-test.h"
796 #endif /* __XDELTA3_C_HEADER_PASS__ */
797 #ifdef __XDELTA3_C_INLINE_PASS__
799 const uint16_t __single_hash[256] =
801 /* Random numbers generated using SLIB's pseudo-random number generator.
802 * This hashes the input alphabet. */
803 0xbcd1, 0xbb65, 0x42c2, 0xdffe, 0x9666, 0x431b, 0x8504, 0xeb46,
804 0x6379, 0xd460, 0xcf14, 0x53cf, 0xdb51, 0xdb08, 0x12c8, 0xf602,
805 0xe766, 0x2394, 0x250d, 0xdcbb, 0xa678, 0x02af, 0xa5c6, 0x7ea6,
806 0xb645, 0xcb4d, 0xc44b, 0xe5dc, 0x9fe6, 0x5b5c, 0x35f5, 0x701a,
807 0x220f, 0x6c38, 0x1a56, 0x4ca3, 0xffc6, 0xb152, 0x8d61, 0x7a58,
808 0x9025, 0x8b3d, 0xbf0f, 0x95a3, 0xe5f4, 0xc127, 0x3bed, 0x320b,
809 0xb7f3, 0x6054, 0x333c, 0xd383, 0x8154, 0x5242, 0x4e0d, 0x0a94,
810 0x7028, 0x8689, 0x3a22, 0x0980, 0x1847, 0xb0f1, 0x9b5c, 0x4176,
811 0xb858, 0xd542, 0x1f6c, 0x2497, 0x6a5a, 0x9fa9, 0x8c5a, 0x7743,
812 0xa8a9, 0x9a02, 0x4918, 0x438c, 0xc388, 0x9e2b, 0x4cad, 0x01b6,
813 0xab19, 0xf777, 0x365f, 0x1eb2, 0x091e, 0x7bf8, 0x7a8e, 0x5227,
814 0xeab1, 0x2074, 0x4523, 0xe781, 0x01a3, 0x163d, 0x3b2e, 0x287d,
815 0x5e7f, 0xa063, 0xb134, 0x8fae, 0x5e8e, 0xb7b7, 0x4548, 0x1f5a,
816 0xfa56, 0x7a24, 0x900f, 0x42dc, 0xcc69, 0x02a0, 0x0b22, 0xdb31,
817 0x71fe, 0x0c7d, 0x1732, 0x1159, 0xcb09, 0xe1d2, 0x1351, 0x52e9,
818 0xf536, 0x5a4f, 0xc316, 0x6bf9, 0x8994, 0xb774, 0x5f3e, 0xf6d6,
819 0x3a61, 0xf82c, 0xcc22, 0x9d06, 0x299c, 0x09e5, 0x1eec, 0x514f,
820 0x8d53, 0xa650, 0x5c6e, 0xc577, 0x7958, 0x71ac, 0x8916, 0x9b4f,
821 0x2c09, 0x5211, 0xf6d8, 0xcaaa, 0xf7ef, 0x287f, 0x7a94, 0xab49,
822 0xfa2c, 0x7222, 0xe457, 0xd71a, 0x00c3, 0x1a76, 0xe98c, 0xc037,
823 0x8208, 0x5c2d, 0xdfda, 0xe5f5, 0x0b45, 0x15ce, 0x8a7e, 0xfcad,
824 0xaa2d, 0x4b5c, 0xd42e, 0xb251, 0x907e, 0x9a47, 0xc9a6, 0xd93f,
825 0x085e, 0x35ce, 0xa153, 0x7e7b, 0x9f0b, 0x25aa, 0x5d9f, 0xc04d,
826 0x8a0e, 0x2875, 0x4a1c, 0x295f, 0x1393, 0xf760, 0x9178, 0x0f5b,
827 0xfa7d, 0x83b4, 0x2082, 0x721d, 0x6462, 0x0368, 0x67e2, 0x8624,
828 0x194d, 0x22f6, 0x78fb, 0x6791, 0xb238, 0xb332, 0x7276, 0xf272,
829 0x47ec, 0x4504, 0xa961, 0x9fc8, 0x3fdc, 0xb413, 0x007a, 0x0806,
830 0x7458, 0x95c6, 0xccaa, 0x18d6, 0xe2ae, 0x1b06, 0xf3f6, 0x5050,
831 0xc8e8, 0xf4ac, 0xc04c, 0xf41c, 0x992f, 0xae44, 0x5f1b, 0x1113,
832 0x1738, 0xd9a8, 0x19ea, 0x2d33, 0x9698, 0x2fe9, 0x323f, 0xcde2,
833 0x6d71, 0xe37d, 0xb697, 0x2c4f, 0x4373, 0x9102, 0x075d, 0x8e25,
834 0x1672, 0xec28, 0x6acb, 0x86cc, 0x186e, 0x9414, 0xd674, 0xd1a5
837 /****************************************************************
839 *****************************************************************/
841 /* The following code implements a parametrized description of the
842 * code table given above for a few reasons. It is not necessary for
843 * implementing the standard, to support compression with variable
844 * tables, so an implementation is only required to know the default
845 * code table to begin decompression. (If the encoder uses an
846 * alternate table, the table is included in compressed form inside
849 * Before adding variable-table support there were two functions which
850 * were hard-coded to the default table above.
851 * xd3_compute_default_table() would create the default table by
852 * filling a 256-elt array of xd3_dinst values. The corresponding
853 * function, xd3_choose_instruction(), would choose an instruction
854 * based on the hard-coded parameters of the default code table.
856 * Notes: The parametrized code table description here only generates
857 * tables of a certain regularity similar to the default table by
858 * allowing to vary the distribution of single- and
859 * double-instructions and change the number of near and same copy
860 * modes. More exotic tables are only possible by extending this
863 * For performance reasons, both the parametrized and non-parametrized
864 * versions of xd3_choose_instruction remain. The parametrized
865 * version is only needed for testing multi-table decoding support.
866 * If ever multi-table encoding is required, this can be optimized by
867 * compiling static functions for each table.
870 /* The XD3_CHOOSE_INSTRUCTION calls xd3_choose_instruction with the
871 * table description when GENERIC_ENCODE_TABLES are in use. The
872 * IF_GENCODETBL macro enables generic-code-table specific code
873 * (removed 10/2014). */
874 #define XD3_CHOOSE_INSTRUCTION(stream,prev,inst) \
875 xd3_choose_instruction (prev, inst)
877 /* This structure maintains information needed by
878 * xd3_choose_instruction to compute the code for a double instruction
879 * by first indexing an array of code_table_sizes by copy mode, then
880 * using (offset + (muliplier * X)) */
881 struct _xd3_code_table_sizes {
887 /* This contains a complete description of a code table. */
888 struct _xd3_code_table_desc
890 /* Assumes a single RUN instruction */
891 /* Assumes that MIN_MATCH is 4 */
893 uint8_t add_sizes; /* Number of immediate-size single adds (default 17) */
894 uint8_t near_modes; /* Number of near copy modes (default 4) */
895 uint8_t same_modes; /* Number of same copy modes (default 3) */
896 uint8_t cpy_sizes; /* Number of immediate-size single copies (default 15) */
898 uint8_t addcopy_add_max; /* Maximum add size for an add-copy double instruction,
899 all modes (default 4) */
900 uint8_t addcopy_near_cpy_max; /* Maximum cpy size for an add-copy double instruction,
901 up through VCD_NEAR modes (default 6) */
902 uint8_t addcopy_same_cpy_max; /* Maximum cpy size for an add-copy double instruction,
903 VCD_SAME modes (default 4) */
905 uint8_t copyadd_add_max; /* Maximum add size for a copy-add double instruction,
906 all modes (default 1) */
907 uint8_t copyadd_near_cpy_max; /* Maximum cpy size for a copy-add double instruction,
908 up through VCD_NEAR modes (default 4) */
909 uint8_t copyadd_same_cpy_max; /* Maximum cpy size for a copy-add double instruction,
910 VCD_SAME modes (default 4) */
912 xd3_code_table_sizes addcopy_max_sizes[MAX_MODES];
913 xd3_code_table_sizes copyadd_max_sizes[MAX_MODES];
916 /* The rfc3284 code table is represented: */
917 static const xd3_code_table_desc __rfc3284_code_table_desc = {
923 4, /* add-copy max add */
924 6, /* add-copy max cpy, near */
925 4, /* add-copy max cpy, same */
927 1, /* copy-add max add */
928 4, /* copy-add max cpy, near */
929 4, /* copy-add max cpy, same */
932 { {6,163,3},{6,175,3},{6,187,3},{6,199,3},{6,211,3},{6,223,3},{4,235,1},{4,239,1},{4,243,1} },
934 { {4,247,1},{4,248,1},{4,249,1},{4,250,1},{4,251,1},{4,252,1},{4,253,1},{4,254,1},{4,255,1} },
937 /* Computes code table entries of TBL using the specified description. */
939 xd3_build_code_table (const xd3_code_table_desc *desc, xd3_dinst *tbl)
941 usize_t size1, size2, mode;
942 usize_t cpy_modes = 2 + desc->near_modes + desc->same_modes;
945 (d++)->type1 = XD3_RUN;
946 (d++)->type1 = XD3_ADD;
948 for (size1 = 1; size1 <= desc->add_sizes; size1 += 1, d += 1)
954 for (mode = 0; mode < cpy_modes; mode += 1)
956 (d++)->type1 = XD3_CPY + mode;
958 for (size1 = MIN_MATCH; size1 < MIN_MATCH + desc->cpy_sizes; size1 += 1, d += 1)
960 d->type1 = XD3_CPY + mode;
965 for (mode = 0; mode < cpy_modes; mode += 1)
967 for (size1 = 1; size1 <= desc->addcopy_add_max; size1 += 1)
969 usize_t max = (mode < 2U + desc->near_modes) ?
970 desc->addcopy_near_cpy_max :
971 desc->addcopy_same_cpy_max;
973 for (size2 = MIN_MATCH; size2 <= max; size2 += 1, d += 1)
977 d->type2 = XD3_CPY + mode;
983 for (mode = 0; mode < cpy_modes; mode += 1)
985 usize_t max = (mode < 2U + desc->near_modes) ?
986 desc->copyadd_near_cpy_max :
987 desc->copyadd_same_cpy_max;
989 for (size1 = MIN_MATCH; size1 <= max; size1 += 1)
991 for (size2 = 1; size2 <= desc->copyadd_add_max; size2 += 1, d += 1)
993 d->type1 = XD3_CPY + mode;
1001 XD3_ASSERT (d - tbl == 256);
1004 /* This function generates the static default code table. */
1005 static const xd3_dinst*
1006 xd3_rfc3284_code_table (void)
1008 static xd3_dinst __rfc3284_code_table[256];
1010 if (__rfc3284_code_table[0].type1 != XD3_RUN)
1012 xd3_build_code_table (& __rfc3284_code_table_desc, __rfc3284_code_table);
1015 return __rfc3284_code_table;
1019 /* This version of xd3_choose_instruction is hard-coded for the default
1022 xd3_choose_instruction (xd3_rinst *prev, xd3_rinst *inst)
1033 if (inst->size <= 17)
1035 inst->code1 += inst->size;
1037 if ( (inst->size == 1) &&
1039 (prev->size == 4) &&
1040 (prev->type >= XD3_CPY) )
1042 prev->code2 = 247 + (prev->type - XD3_CPY);
1050 int mode = inst->type - XD3_CPY;
1052 XD3_ASSERT (inst->type >= XD3_CPY && inst->type < 12);
1054 inst->code1 = 19 + 16 * mode;
1056 if (inst->size <= 18 && inst->size >= 4)
1058 inst->code1 += inst->size - 3;
1060 if ( (prev != NULL) &&
1061 (prev->type == XD3_ADD) &&
1064 if ( (inst->size <= 6) &&
1067 prev->code2 = 163 + (mode * 12) + (3 * (prev->size - 1)) + (inst->size - 4);
1069 XD3_ASSERT (prev->code2 <= 234);
1071 else if ( (inst->size == 4) &&
1074 prev->code2 = 235 + ((mode - 6) * 4) + (prev->size - 1);
1076 XD3_ASSERT (prev->code2 <= 246);
1081 XD3_ASSERT (inst->code1 <= 162);
1086 #endif /* XD3_ENCODER */
1088 /***********************************************************************/
1091 xd3_swap_uint8p (uint8_t** p1, uint8_t** p2)
1099 xd3_swap_usize_t (usize_t* p1, usize_t* p2)
1106 /* It's not constant time, but it computes the log. */
1108 xd3_check_pow2 (xoff_t value, usize_t *logof)
1112 if (logof == NULL) {
1118 for (; x != 0; x <<= 1, *logof += 1)
1126 return XD3_INTERNAL;
1130 xd3_pow2_roundup (size_t x)
1140 xd3_xoff_roundup (xoff_t x)
1150 xd3_round_blksize (usize_t sz, usize_t blksz)
1152 usize_t mod = sz & (blksz-1);
1154 XD3_ASSERT (xd3_check_pow2 (blksz, NULL) == 0);
1156 return mod ? (sz + (blksz - mod)) : sz;
1159 /***********************************************************************
1160 Adler32 stream function: code copied from Zlib, defined in RFC1950
1161 ***********************************************************************/
1163 #define A32_BASE 65521L /* Largest prime smaller than 2^16 */
1164 #define A32_NMAX 5552 /* NMAX is the largest n such that 255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1 */
1166 #define A32_DO1(buf,i) {s1 += buf[i]; s2 += s1;}
1167 #define A32_DO2(buf,i) A32_DO1(buf,i); A32_DO1(buf,i+1);
1168 #define A32_DO4(buf,i) A32_DO2(buf,i); A32_DO2(buf,i+2);
1169 #define A32_DO8(buf,i) A32_DO4(buf,i); A32_DO4(buf,i+4);
1170 #define A32_DO16(buf) A32_DO8(buf,0); A32_DO8(buf,8);
1172 static unsigned long adler32 (unsigned long adler, const uint8_t *buf,
1175 unsigned long s1 = adler & 0xffff;
1176 unsigned long s2 = (adler >> 16) & 0xffff;
1181 k = (len < A32_NMAX) ? len : A32_NMAX;
1205 return (s2 << 16) | s1;
1208 /***********************************************************************
1210 ***********************************************************************/
1214 xd3_comprun (const uint8_t *seg, usize_t slook, uint8_t *run_cp)
1220 for (i = 0; i < slook; i += 1)
1231 /***********************************************************************
1232 Basic encoder/decoder functions
1233 ***********************************************************************/
1236 xd3_decode_byte (xd3_stream *stream, usize_t *val)
1238 if (stream->avail_in == 0)
1240 stream->msg = "further input required";
1244 (*val) = stream->next_in[0];
1251 xd3_decode_bytes (xd3_stream *stream, uint8_t *buf, usize_t *pos, usize_t size)
1256 /* Note: The case where (*pos == size) happens when a zero-length
1257 * appheader or code table is transmitted, but there is nothing in
1258 * the standard against that. */
1261 if (stream->avail_in == 0)
1263 stream->msg = "further input required";
1268 take = min (want, stream->avail_in);
1270 memcpy (buf + *pos, stream->next_in, (size_t) take);
1272 DECODE_INPUT (take);
1281 xd3_emit_byte (xd3_stream *stream,
1282 xd3_output **outputp,
1285 xd3_output *output = (*outputp);
1287 if (output->next == output->avail)
1289 xd3_output *aoutput;
1291 if ((aoutput = xd3_alloc_output (stream, output)) == NULL)
1296 output = (*outputp) = aoutput;
1299 output->base[output->next++] = code;
1305 xd3_emit_bytes (xd3_stream *stream,
1306 xd3_output **outputp,
1307 const uint8_t *base,
1310 xd3_output *output = (*outputp);
1316 if (output->next == output->avail)
1318 xd3_output *aoutput;
1320 if ((aoutput = xd3_alloc_output (stream, output)) == NULL)
1325 output = (*outputp) = aoutput;
1328 take = min (output->avail - output->next, size);
1330 memcpy (output->base + output->next, base, (size_t) take);
1332 output->next += take;
1340 #endif /* XD3_ENCODER */
1342 /*********************************************************************
1343 Integer encoder/decoder functions
1344 **********************************************************************/
1346 #define DECODE_INTEGER_TYPE(PART,OFLOW) \
1347 while (stream->avail_in != 0) \
1349 usize_t next = stream->next_in[0]; \
1355 stream->msg = "overflow in decode_integer"; \
1356 return XD3_INVALID_INPUT; \
1359 PART = (PART << 7) | (next & 127); \
1361 if ((next & 128) == 0) \
1369 stream->msg = "further input required"; \
1372 #define READ_INTEGER_TYPE(TYPE, OFLOW) \
1374 const uint8_t *inp = (*inpp); \
1381 stream->msg = "end-of-input in read_integer"; \
1382 return XD3_INVALID_INPUT; \
1387 stream->msg = "overflow in read_intger"; \
1388 return XD3_INVALID_INPUT; \
1392 val = (val << 7) | (next & 127); \
1394 while (next & 128); \
1401 #define EMIT_INTEGER_TYPE() \
1402 /* max 64-bit value in base-7 encoding is 9.1 bytes */ \
1404 usize_t bufi = 10; \
1406 /* This loop performs division and turns on all MSBs. */ \
1409 buf[--bufi] = (num & 127) | 128; \
1414 /* Turn off MSB of the last byte. */ \
1417 return xd3_emit_bytes (stream, output, buf + bufi, 10 - bufi)
1419 #define IF_SIZEOF32(x) if (num < (1U << (7 * (x)))) return (x);
1420 #define IF_SIZEOF64(x) if (num < (1ULL << (7 * (x)))) return (x);
1423 static inline uint32_t
1424 xd3_sizeof_uint32_t (uint32_t num)
1434 xd3_decode_uint32_t (xd3_stream *stream, uint32_t *val)
1435 { DECODE_INTEGER_TYPE (stream->dec_32part, UINT32_OFLOW_MASK); }
1438 xd3_read_uint32_t (xd3_stream *stream, const uint8_t **inpp,
1439 const uint8_t *max, uint32_t *valp)
1440 { READ_INTEGER_TYPE (uint32_t, UINT32_OFLOW_MASK); }
1444 xd3_emit_uint32_t (xd3_stream *stream, xd3_output **output, uint32_t num)
1445 { EMIT_INTEGER_TYPE (); }
1451 xd3_decode_uint64_t (xd3_stream *stream, uint64_t *val)
1452 { DECODE_INTEGER_TYPE (stream->dec_64part, UINT64_OFLOW_MASK); }
1456 xd3_emit_uint64_t (xd3_stream *stream, xd3_output **output, uint64_t num)
1457 { EMIT_INTEGER_TYPE (); }
1460 /* These are tested but not used */
1463 xd3_read_uint64_t (xd3_stream *stream, const uint8_t **inpp,
1464 const uint8_t *max, uint64_t *valp)
1465 { READ_INTEGER_TYPE (uint64_t, UINT64_OFLOW_MASK); }
1468 xd3_sizeof_uint64_t (uint64_t num)
1486 /***********************************************************************
1488 ***********************************************************************/
1491 xd3_alloc_cache (xd3_stream *stream)
1493 if (stream->acache.near_array != NULL)
1495 xd3_free (stream, stream->acache.near_array);
1498 if (stream->acache.same_array != NULL)
1500 xd3_free (stream, stream->acache.same_array);
1503 if (((stream->acache.s_near > 0) &&
1504 (stream->acache.near_array = (usize_t*)
1505 xd3_alloc (stream, stream->acache.s_near,
1506 (usize_t) sizeof (usize_t)))
1508 ((stream->acache.s_same > 0) &&
1509 (stream->acache.same_array = (usize_t*)
1510 xd3_alloc (stream, stream->acache.s_same * 256,
1511 (usize_t) sizeof (usize_t)))
1521 xd3_init_cache (xd3_addr_cache* acache)
1523 if (acache->s_near > 0)
1525 memset (acache->near_array, 0, acache->s_near * sizeof (usize_t));
1526 acache->next_slot = 0;
1529 if (acache->s_same > 0)
1531 memset (acache->same_array, 0, acache->s_same * 256 * sizeof (usize_t));
1536 xd3_update_cache (xd3_addr_cache* acache, usize_t addr)
1538 if (acache->s_near > 0)
1540 acache->near_array[acache->next_slot] = addr;
1541 acache->next_slot = (acache->next_slot + 1) % acache->s_near;
1544 if (acache->s_same > 0)
1546 acache->same_array[addr % (acache->s_same*256)] = addr;
1551 /* OPT: this gets called a lot, can it be optimized? */
1553 xd3_encode_address (xd3_stream *stream,
1559 usize_t i, bestm, ret;
1560 xd3_addr_cache* acache = & stream->acache;
1562 #define SMALLEST_INT(x) do { if (((x) & ~127U) == 0) { goto good; } } while (0)
1564 /* Attempt to find the address mode that yields the smallest integer value
1565 * for "d", the encoded address value, thereby minimizing the encoded size
1566 * of the address. */
1570 XD3_ASSERT (addr < here);
1572 SMALLEST_INT (bestd);
1574 if ((d = here-addr) < bestd)
1579 SMALLEST_INT (bestd);
1582 for (i = 0; i < acache->s_near; i += 1)
1584 /* Note: If we used signed computation here, we'd could compte d
1585 * and then check (d >= 0 && d < bestd). */
1586 if (addr >= acache->near_array[i])
1588 d = addr - acache->near_array[i];
1593 bestm = i+2; /* 2 counts the VCD_SELF, VCD_HERE modes */
1595 SMALLEST_INT (bestd);
1600 if (acache->s_same > 0 &&
1601 acache->same_array[d = addr%(acache->s_same*256)] == addr)
1604 /* 2 + s_near offsets past the VCD_NEAR modes */
1605 bestm = acache->s_near + 2 + d/256;
1607 if ((ret = xd3_emit_byte (stream, & ADDR_TAIL (stream), bestd)))
1616 if ((ret = xd3_emit_size (stream, & ADDR_TAIL (stream), bestd)))
1622 xd3_update_cache (acache, addr);
1631 xd3_decode_address (xd3_stream *stream, usize_t here,
1632 usize_t mode, const uint8_t **inpp,
1633 const uint8_t *max, uint32_t *valp)
1636 usize_t same_start = 2 + stream->acache.s_near;
1638 if (mode < same_start)
1640 if ((ret = xd3_read_size (stream, inpp, max, valp))) { return ret; }
1647 (*valp) = here - (*valp);
1650 (*valp) += stream->acache.near_array[mode - 2];
1658 stream->msg = "address underflow";
1659 return XD3_INVALID_INPUT;
1664 (*valp) = stream->acache.same_array[mode*256 + (**inpp)];
1669 xd3_update_cache (& stream->acache, *valp);
1674 /***********************************************************************
1676 ***********************************************************************/
1679 __xd3_alloc_func (void* opaque, size_t items, usize_t size)
1681 return malloc (items * (size_t) size);
1685 __xd3_free_func (void* opaque, void* address)
1691 xd3_alloc (xd3_stream *stream,
1695 void *a = stream->alloc (stream->opaque, elts, size);
1699 IF_DEBUG (stream->alloc_cnt += 1);
1700 IF_DEBUG2 (DP(RINT "[stream %p malloc] size %u ptr %p\n",
1701 stream, elts * size, a));
1705 stream->msg = "out of memory";
1712 xd3_free (xd3_stream *stream,
1717 IF_DEBUG (stream->free_cnt += 1);
1718 XD3_ASSERT (stream->free_cnt <= stream->alloc_cnt);
1719 IF_DEBUG2 (DP(RINT "[stream %p free] %p\n",
1721 stream->free (stream->opaque, ptr);
1727 xd3_alloc0 (xd3_stream *stream,
1731 void *a = xd3_alloc (stream, elts, size);
1735 memset (a, 0, (size_t) (elts * size));
1742 xd3_alloc_output (xd3_stream *stream,
1743 xd3_output *old_output)
1748 if (stream->enc_free != NULL)
1750 output = stream->enc_free;
1751 stream->enc_free = output->next_page;
1755 if ((output = (xd3_output*) xd3_alloc (stream, 1,
1756 (usize_t) sizeof (xd3_output)))
1762 if ((base = (uint8_t*) xd3_alloc (stream, XD3_ALLOCSIZE,
1763 sizeof (uint8_t))) == NULL)
1765 xd3_free (stream, output);
1769 output->base = base;
1770 output->avail = XD3_ALLOCSIZE;
1777 old_output->next_page = output;
1780 output->next_page = NULL;
1786 xd3_sizeof_output (xd3_output *output)
1790 for (; output; output = output->next_page)
1799 xd3_freelist_output (xd3_stream *stream,
1807 output = output->next_page;
1810 tmp->next_page = stream->enc_free;
1811 stream->enc_free = tmp;
1816 xd3_free_output (xd3_stream *stream,
1827 next = output->next_page;
1829 xd3_free (stream, output->base);
1830 xd3_free (stream, output);
1835 #endif /* XD3_ENCODER */
1838 xd3_free_stream (xd3_stream *stream)
1840 xd3_iopt_buflist *blist = stream->iopt_alloc;
1842 while (blist != NULL)
1844 xd3_iopt_buflist *tmp = blist;
1845 blist = blist->next;
1846 xd3_free (stream, tmp->buffer);
1847 xd3_free (stream, tmp);
1850 xd3_free (stream, stream->large_table);
1851 xd3_free (stream, stream->small_table);
1852 xd3_free (stream, stream->small_prev);
1857 for (i = 0; i < ENC_SECTS; i += 1)
1859 xd3_free_output (stream, stream->enc_heads[i]);
1861 xd3_free_output (stream, stream->enc_free);
1865 xd3_free (stream, stream->acache.near_array);
1866 xd3_free (stream, stream->acache.same_array);
1868 xd3_free (stream, stream->inst_sect.copied1);
1869 xd3_free (stream, stream->addr_sect.copied1);
1870 xd3_free (stream, stream->data_sect.copied1);
1872 xd3_free (stream, stream->dec_buffer);
1873 xd3_free (stream, (uint8_t*) stream->dec_lastwin);
1875 xd3_free (stream, stream->buf_in);
1876 xd3_free (stream, stream->dec_appheader);
1877 xd3_free (stream, stream->dec_codetbl);
1878 xd3_free (stream, stream->code_table_alloc);
1881 xd3_free (stream, stream->inst_sect.copied2);
1882 xd3_free (stream, stream->addr_sect.copied2);
1883 xd3_free (stream, stream->data_sect.copied2);
1885 if (stream->sec_type != NULL)
1887 stream->sec_type->destroy (stream, stream->sec_stream_d);
1888 stream->sec_type->destroy (stream, stream->sec_stream_i);
1889 stream->sec_type->destroy (stream, stream->sec_stream_a);
1893 xd3_free (stream, stream->whole_target.adds);
1894 xd3_free (stream, stream->whole_target.inst);
1895 xd3_free (stream, stream->whole_target.wininfo);
1897 XD3_ASSERT (stream->alloc_cnt == stream->free_cnt);
1899 memset (stream, 0, sizeof (xd3_stream));
1902 #if (XD3_DEBUG > 1 || VCDIFF_TOOLS)
1904 xd3_rtype_to_string (xd3_rtype type, int print_mode)
1922 case XD3_CPY + 0: return "CPY_0";
1923 case XD3_CPY + 1: return "CPY_1";
1924 case XD3_CPY + 2: return "CPY_2";
1925 case XD3_CPY + 3: return "CPY_3";
1926 case XD3_CPY + 4: return "CPY_4";
1927 case XD3_CPY + 5: return "CPY_5";
1928 case XD3_CPY + 6: return "CPY_6";
1929 case XD3_CPY + 7: return "CPY_7";
1930 case XD3_CPY + 8: return "CPY_8";
1931 case XD3_CPY + 9: return "CPY_9";
1932 default: return "CPY>9";
1937 /****************************************************************
1938 Stream configuration
1939 ******************************************************************/
1942 xd3_config_stream(xd3_stream *stream,
1947 xd3_smatcher *smatcher = &stream->smatcher;
1952 memset (config, 0, sizeof (*config));
1955 /* Initial setup: no error checks yet */
1956 memset (stream, 0, sizeof (*stream));
1958 stream->winsize = config->winsize ? config->winsize : XD3_DEFAULT_WINSIZE;
1959 stream->sprevsz = config->sprevsz ? config->sprevsz : XD3_DEFAULT_SPREVSZ;
1961 if (config->iopt_size == 0)
1963 stream->iopt_size = XD3_ALLOCSIZE / sizeof(xd3_rinst);
1964 stream->iopt_unlimited = 1;
1968 stream->iopt_size = config->iopt_size;
1971 stream->getblk = config->getblk;
1972 stream->alloc = config->alloc ? config->alloc : __xd3_alloc_func;
1973 stream->free = config->freef ? config->freef : __xd3_free_func;
1974 stream->opaque = config->opaque;
1975 stream->flags = config->flags;
1977 /* Secondary setup. */
1978 stream->sec_data = config->sec_data;
1979 stream->sec_inst = config->sec_inst;
1980 stream->sec_addr = config->sec_addr;
1982 stream->sec_data.data_type = DATA_SECTION;
1983 stream->sec_inst.data_type = INST_SECTION;
1984 stream->sec_addr.data_type = ADDR_SECTION;
1986 /* Check static sizes. */
1987 if (sizeof (usize_t) != SIZEOF_USIZE_T ||
1988 sizeof (xoff_t) != SIZEOF_XOFF_T ||
1989 (ret = xd3_check_pow2(XD3_ALLOCSIZE, NULL)))
1991 stream->msg = "incorrect compilation: wrong integer sizes";
1992 return XD3_INTERNAL;
1995 /* Check/set secondary compressor. */
1996 switch (stream->flags & XD3_SEC_TYPE)
1999 if (stream->flags & XD3_SEC_NOALL)
2001 stream->msg = "XD3_SEC flags require a secondary compressor type";
2002 return XD3_INTERNAL;
2012 stream->msg = "too many secondary compressor types set";
2013 return XD3_INTERNAL;
2016 stream->code_table_desc = & __rfc3284_code_table_desc;
2017 stream->code_table_func = xd3_rfc3284_code_table;
2020 if (smatcher->small_chain == 1 &&
2021 smatcher->small_lchain == 1)
2023 stream->sprevsz = 0;
2027 if ((ret = xd3_check_pow2 (stream->sprevsz, NULL)))
2029 stream->msg = "sprevsz is required to be a power of two";
2030 return XD3_INTERNAL;
2033 stream->sprevmask = stream->sprevsz - 1;
2036 /* Default scanner settings. */
2038 switch (config->smatch_cfg)
2040 IF_BUILD_SOFT(case XD3_SMATCH_SOFT:
2042 *smatcher = config->smatcher_soft;
2043 smatcher->string_match = __smatcher_soft.string_match;
2044 smatcher->name = __smatcher_soft.name;
2045 if (smatcher->large_look < MIN_MATCH ||
2046 smatcher->large_step < 1 ||
2047 smatcher->small_look < MIN_MATCH)
2049 stream->msg = "invalid soft string-match config";
2055 IF_BUILD_DEFAULT(case XD3_SMATCH_DEFAULT:
2056 *smatcher = __smatcher_default;
2058 IF_BUILD_SLOW(case XD3_SMATCH_SLOW:
2059 *smatcher = __smatcher_slow;
2061 IF_BUILD_FASTEST(case XD3_SMATCH_FASTEST:
2062 *smatcher = __smatcher_fastest;
2064 IF_BUILD_FASTER(case XD3_SMATCH_FASTER:
2065 *smatcher = __smatcher_faster;
2067 IF_BUILD_FAST(case XD3_SMATCH_FAST:
2068 *smatcher = __smatcher_fast;
2071 stream->msg = "invalid string match config type";
2072 return XD3_INTERNAL;
2075 if (config->smatch_cfg == XD3_SMATCH_DEFAULT &&
2076 (stream->flags & XD3_COMPLEVEL_MASK) != 0)
2078 int level = (stream->flags & XD3_COMPLEVEL_MASK) >> XD3_COMPLEVEL_SHIFT;
2083 IF_BUILD_FASTEST(*smatcher = __smatcher_fastest;
2086 IF_BUILD_FASTER(*smatcher = __smatcher_faster;
2088 case 3: case 4: case 5:
2089 IF_BUILD_FAST(*smatcher = __smatcher_fast;
2092 IF_BUILD_DEFAULT(*smatcher = __smatcher_default;
2095 IF_BUILD_SLOW(*smatcher = __smatcher_slow;
2097 IF_BUILD_DEFAULT(*smatcher = __smatcher_default;
2099 IF_BUILD_FAST(*smatcher = __smatcher_fast;
2101 IF_BUILD_FASTER(*smatcher = __smatcher_faster;
2103 IF_BUILD_FASTEST(*smatcher = __smatcher_fastest;
2112 /***********************************************************
2114 ***********************************************************/
2117 xoff_t xd3_source_eof(const xd3_source *src)
2119 xoff_t r = (src->blksize * src->max_blkno) + (xoff_t)src->onlastblk;
2124 usize_t xd3_bytes_on_srcblk (xd3_source *src, xoff_t blkno)
2126 usize_t r = (blkno == src->max_blkno ?
2132 /* This function interfaces with the client getblk function, checks
2133 * its results, updates frontier_blkno, max_blkno, onlastblk, eof_known. */
2135 xd3_getblk (xd3_stream *stream, xoff_t blkno)
2138 xd3_source *source = stream->src;
2140 if (source->curblk == NULL || blkno != source->curblkno)
2142 source->getblkno = blkno;
2144 if (stream->getblk == NULL)
2146 stream->msg = "getblk source input";
2147 return XD3_GETSRCBLK;
2150 ret = stream->getblk (stream, source, blkno);
2153 IF_DEBUG1 (DP(RINT "[getblk] app error blkno %"Q"u: %s\n",
2154 blkno, xd3_strerror (ret)));
2157 IF_DEBUG2 (DP(RINT "[getblk] read source block %"Q"u onblk "
2158 "%u blksize %u\n", blkno, source->onblk, source->blksize));
2161 if (blkno >= source->frontier_blkno)
2163 if (blkno > source->max_blkno)
2165 source->max_blkno = blkno;
2166 source->onlastblk = source->onblk;
2169 if (source->onblk == source->blksize)
2171 source->frontier_blkno = blkno + 1;
2173 IF_DEBUG2 (DP(RINT "[getblk] full source blkno %"Q"u: "
2174 "source length unknown %"Q"u\n",
2176 xd3_source_eof (source)));
2180 if (!source->eof_known)
2182 IF_DEBUG2 (DP(RINT "[getblk] eof block has %d bytes; "
2183 "source length known %"Q"u\n",
2184 xd3_bytes_on_srcblk (source, blkno),
2185 xd3_source_eof (source)));
2186 source->eof_known = 1;
2189 source->frontier_blkno = blkno;
2193 XD3_ASSERT (source->curblk != NULL);
2195 if (blkno == source->max_blkno)
2197 /* In case the application sets the source as 1 block w/ a
2199 source->onlastblk = source->onblk;
2201 if (source->onblk == source->blksize)
2203 source->frontier_blkno = blkno + 1;
2209 /***********************************************************
2211 ***************************************************************/
2214 xd3_set_source (xd3_stream *stream,
2223 /* Enforce power-of-two blocksize so that source-block number
2224 * calculations are cheap. */
2225 if (xd3_check_pow2 (src->blksize, &shiftby) != 0)
2227 src->blksize = xd3_pow2_roundup(src->blksize);
2228 xd3_check_pow2 (src->blksize, &shiftby);
2229 IF_DEBUG1 (DP(RINT "raising src_blksz to %u\n", src->blksize));
2232 src->shiftby = shiftby;
2233 src->maskby = (1 << shiftby) - 1;
2235 if (xd3_check_pow2 (src->max_winsize, NULL) != 0)
2237 src->max_winsize = xd3_xoff_roundup(src->max_winsize);
2238 IF_DEBUG1 (DP(RINT "raising src_maxsize to %u\n", src->blksize));
2240 src->max_winsize = max(src->max_winsize, XD3_ALLOCSIZE);
2246 xd3_set_source_and_size (xd3_stream *stream,
2247 xd3_source *user_source,
2248 xoff_t source_size) {
2249 int ret = xd3_set_source (stream, user_source);
2252 stream->src->eof_known = 1;
2253 IF_DEBUG2 (DP(RINT "[set source] size known %"Q"u\n",
2256 xd3_blksize_div(source_size,
2258 &stream->src->max_blkno,
2259 &stream->src->onlastblk);
2265 xd3_abort_stream (xd3_stream *stream)
2267 stream->dec_state = DEC_ABORTED;
2268 stream->enc_state = ENC_ABORTED;
2272 xd3_close_stream (xd3_stream *stream)
2274 if (stream->enc_state != 0 && stream->enc_state != ENC_ABORTED)
2276 if (stream->buf_leftover != NULL)
2278 stream->msg = "encoding is incomplete";
2279 return XD3_INTERNAL;
2282 if (stream->enc_state == ENC_POSTWIN)
2285 xd3_encode_reset (stream);
2287 stream->current_window += 1;
2288 stream->enc_state = ENC_INPUT;
2291 /* If encoding, should be ready for more input but not actually
2293 if (stream->enc_state != ENC_INPUT || stream->avail_in != 0)
2295 stream->msg = "encoding is incomplete";
2296 return XD3_INTERNAL;
2301 switch (stream->dec_state)
2305 /* TODO: Address the zero-byte ambiguity. Does the encoder
2306 * emit a window or not? If so, then catch an error here.
2307 * If not, need another routine to say
2308 * decode_at_least_one_if_empty. */
2312 /* If decoding, should be ready for the next window. */
2313 stream->msg = "EOF in decode";
2314 return XD3_INTERNAL;
2321 /**************************************************************
2323 ****************************************************************/
2326 xd3_get_appheader (xd3_stream *stream,
2330 if (stream->dec_state < DEC_WININD)
2332 stream->msg = "application header not available";
2333 return XD3_INTERNAL;
2336 (*data) = stream->dec_appheader;
2337 (*size) = stream->dec_appheadsz;
2341 /**********************************************************
2343 *************************************************/
2345 #include "xdelta3-decode.h"
2347 /****************************************************************
2349 *****************************************************************/
2353 xd3_set_appheader (xd3_stream *stream,
2354 const uint8_t *data,
2357 stream->enc_appheader = data;
2358 stream->enc_appheadsz = size;
2363 xd3_iopt_check (xd3_stream *stream)
2365 usize_t ul = xd3_rlist_length (& stream->iopt_used);
2366 usize_t fl = xd3_rlist_length (& stream->iopt_free);
2368 return (ul + fl + (stream->iout ? 1 : 0)) == stream->iopt_size;
2373 xd3_iopt_free (xd3_stream *stream, xd3_rinst *i)
2375 xd3_rinst *n = xd3_rlist_remove (i);
2376 xd3_rlist_push_back (& stream->iopt_free, i);
2381 xd3_iopt_free_nonadd (xd3_stream *stream, xd3_rinst *i)
2383 if (i->type != XD3_ADD)
2385 xd3_rlist_push_back (& stream->iopt_free, i);
2389 /* When an instruction is ready to flush from the iopt buffer, this
2390 * function is called to produce an encoding. It writes the
2391 * instruction plus size, address, and data to the various encoding
2394 xd3_iopt_finish_encoding (xd3_stream *stream, xd3_rinst *inst)
2398 /* Check for input overflow. */
2399 XD3_ASSERT (inst->pos + inst->size <= stream->avail_in);
2405 /* the address may have an offset if there is a source window. */
2407 xd3_source *src = stream->src;
2411 /* If there is a source copy, the source must have its
2412 * source window decided before we can encode. This can
2413 * be bad -- we have to make this decision even if no
2414 * source matches have been found. */
2415 if (stream->srcwin_decided == 0)
2417 if ((ret = xd3_srcwin_setup (stream))) { return ret; }
2421 stream->srcwin_decided_early = (!stream->src->eof_known ||
2422 (stream->srcwin_cksum_pos <
2423 xd3_source_eof (stream->src)));
2426 /* xtra field indicates the copy is from the source */
2429 XD3_ASSERT (inst->addr >= src->srcbase);
2430 XD3_ASSERT (inst->addr + inst->size <=
2431 src->srcbase + src->srclen);
2432 addr = (usize_t)(inst->addr - src->srcbase);
2433 stream->n_scpy += 1;
2434 stream->l_scpy += (xoff_t) inst->size;
2438 /* with source window: target copy address is offset
2440 addr = stream->taroff + (usize_t) inst->addr;
2441 stream->n_tcpy += 1;
2442 stream->l_tcpy += (xoff_t) inst->size;
2447 addr = (usize_t) inst->addr;
2448 stream->n_tcpy += 1;
2449 stream->l_tcpy += inst->size;
2452 /* Note: used to assert inst->size >= MIN_MATCH, but not true
2453 * for merge operations & identical match heuristics. */
2454 /* the "here" position is always offset by taroff */
2455 if ((ret = xd3_encode_address (stream, addr, inst->pos + stream->taroff,
2463 DP(RINT "[iopt copy:%d] pos %"Q"u-%"Q"u addr %"Q"u-%"Q"u size %u\n",
2465 stream->total_in + inst->pos,
2466 stream->total_in + inst->pos + inst->size,
2467 inst->addr, inst->addr + inst->size, inst->size);
2473 XD3_ASSERT (inst->size >= MIN_MATCH);
2475 if ((ret = xd3_emit_byte (stream, & DATA_TAIL (stream), inst->xtra))) { return ret; }
2478 stream->l_run += inst->size;
2482 DP(RINT "[iopt run:%d] pos %"Q"u size %u\n", cnt++, stream->total_in + inst->pos, inst->size);
2488 if ((ret = xd3_emit_bytes (stream, & DATA_TAIL (stream),
2489 stream->next_in + inst->pos, inst->size))) { return ret; }
2492 stream->l_add += inst->size;
2496 DP(RINT "[iopt add:%d] pos %"Q"u size %u\n", cnt++, stream->total_in + inst->pos, inst->size);
2503 /* This is the only place stream->unencoded_offset is incremented. */
2504 XD3_ASSERT (stream->unencoded_offset == inst->pos);
2505 stream->unencoded_offset += inst->size;
2509 XD3_CHOOSE_INSTRUCTION (stream, stream->iout, inst);
2511 if (stream->iout != NULL)
2513 if (stream->iout->code2 != 0)
2515 if ((ret = xd3_emit_double (stream, stream->iout, inst, stream->iout->code2))) { return ret; }
2517 xd3_iopt_free_nonadd (stream, stream->iout);
2518 xd3_iopt_free_nonadd (stream, inst);
2519 stream->iout = NULL;
2524 if ((ret = xd3_emit_single (stream, stream->iout, stream->iout->code1))) { return ret; }
2526 xd3_iopt_free_nonadd (stream, stream->iout);
2530 stream->iout = inst;
2535 /* This possibly encodes an add instruction, iadd, which must remain
2536 * on the stack until the following call to
2537 * xd3_iopt_finish_encoding. */
2539 xd3_iopt_add (xd3_stream *stream, usize_t pos, xd3_rinst *iadd)
2542 usize_t off = stream->unencoded_offset;
2546 iadd->type = XD3_ADD;
2548 iadd->size = pos - off;
2550 if ((ret = xd3_iopt_finish_encoding (stream, iadd))) { return ret; }
2556 /* This function calls xd3_iopt_finish_encoding to finish encoding an
2557 * instruction, and it may also produce an add instruction for an
2558 * unmatched region. */
2560 xd3_iopt_add_encoding (xd3_stream *stream, xd3_rinst *inst)
2565 if ((ret = xd3_iopt_add (stream, inst->pos, & iadd))) { return ret; }
2567 if ((ret = xd3_iopt_finish_encoding (stream, inst))) { return ret; }
2572 /* Generates a final add instruction to encode the remaining input. */
2574 xd3_iopt_add_finalize (xd3_stream *stream)
2579 if ((ret = xd3_iopt_add (stream, stream->avail_in, & iadd))) { return ret; }
2583 if ((ret = xd3_emit_single (stream, stream->iout, stream->iout->code1))) { return ret; }
2585 xd3_iopt_free_nonadd (stream, stream->iout);
2586 stream->iout = NULL;
2592 /* Compact the instruction buffer by choosing the best non-overlapping
2593 * instructions when lazy string-matching. There are no ADDs in the
2594 * iopt buffer because those are synthesized in xd3_iopt_add_encoding
2595 * and during xd3_iopt_add_finalize. */
2597 xd3_iopt_flush_instructions (xd3_stream *stream, int force)
2599 xd3_rinst *r1 = xd3_rlist_front (& stream->iopt_used);
2610 XD3_ASSERT (xd3_iopt_check (stream));
2612 /* Note: once tried to skip this step if it's possible to assert
2613 * there are no overlapping instructions. Doesn't work because
2614 * xd3_opt_erase leaves overlapping instructions. */
2615 while (! xd3_rlist_end (& stream->iopt_used, r1) &&
2616 ! xd3_rlist_end (& stream->iopt_used, r2 = xd3_rlist_next (r1)))
2618 r1end = r1->pos + r1->size;
2620 /* If the instructions do not overlap, continue. */
2621 if (r1end <= r2->pos)
2627 r2end = r2->pos + r2->size;
2629 /* The min_match adjustments prevent this. */
2630 XD3_ASSERT (r2end > (r1end + LEAST_MATCH_INCR));
2632 /* If r3 is available... */
2633 if (! xd3_rlist_end (& stream->iopt_used, r3 = xd3_rlist_next (r2)))
2635 /* If r3 starts before r1 finishes or just about, r2 is irrelevant */
2636 if (r3->pos <= r1end + 1)
2638 xd3_iopt_free (stream, r2);
2644 /* Unless force, end the loop when r3 is not available. */
2648 r2off = r2->pos - r1->pos;
2649 r2moff = r2end - r1end;
2650 gap = r2end - r1->pos;
2652 /* If the two matches overlap almost entirely, choose the better match
2653 * and discard the other. The else branch can still create inefficient
2654 * copies, e.g., a 4-byte copy that takes 4 bytes to encode, which
2655 * xd3_smatch() wouldn't allow by its crude efficiency check. However,
2656 * in this case there are adjacent copies which mean the add would cost
2657 * one extra byte. Allow the inefficiency here. */
2658 if (gap < 2*MIN_MATCH || r2moff <= 2 || r2off <= 2)
2660 /* Only one match should be used, choose the longer one. */
2661 if (r1->size < r2->size)
2663 xd3_iopt_free (stream, r1);
2668 /* We are guaranteed that r1 does not overlap now, so advance past r2 */
2669 r1 = xd3_iopt_free (stream, r2);
2675 /* Shorten one of the instructions -- could be optimized
2676 * based on the address cache. */
2681 XD3_ASSERT (r1end > r2->pos && r2end > r1->pos);
2683 /* Try to balance the length of both instructions, but avoid
2684 * making both longer than MAX_MATCH_SPLIT . */
2686 newsize = min (MAX_MATCH_SPLIT, gap - average);
2688 /* Should be possible to simplify this code. */
2689 if (newsize > r1->size)
2692 adjust1 = r1end - r2->pos;
2694 else if (newsize > r2->size)
2697 adjust1 = r1end - r2->pos;
2699 XD3_ASSERT (r1->size > adjust1);
2701 r1->size -= adjust1;
2703 /* don't shorten r2 */
2709 adjust1 = r1->size - newsize;
2711 if (r2->pos > r1end - adjust1)
2713 adjust1 -= r2->pos - (r1end - adjust1);
2716 XD3_ASSERT (r1->size > adjust1);
2718 r1->size -= adjust1;
2721 XD3_ASSERT (r1->pos + r1->size >= r2->pos);
2723 adjust1 = r1->pos + r1->size - r2->pos;
2726 /* Fallthrough above if-else, shorten r2 */
2727 XD3_ASSERT (r2->size > adjust1);
2729 r2->size -= adjust1;
2731 r2->addr += adjust1;
2733 XD3_ASSERT (r1->size >= MIN_MATCH);
2734 XD3_ASSERT (r2->size >= MIN_MATCH);
2740 XD3_ASSERT (xd3_iopt_check (stream));
2742 /* If forcing, pick instructions until the list is empty, otherwise
2743 * this empties 50% of the queue. */
2744 for (flushed = 0; ! xd3_rlist_empty (& stream->iopt_used); )
2746 xd3_rinst *renc = xd3_rlist_pop_front (& stream->iopt_used);
2747 if ((ret = xd3_iopt_add_encoding (stream, renc)))
2754 if (++flushed > stream->iopt_size / 2)
2759 /* If there are only two instructions remaining, break,
2760 * because they were not optimized. This means there were
2761 * more than 50% eliminated by the loop above. */
2762 r1 = xd3_rlist_front (& stream->iopt_used);
2763 if (xd3_rlist_end(& stream->iopt_used, r1) ||
2764 xd3_rlist_end(& stream->iopt_used, r2 = xd3_rlist_next (r1)) ||
2765 xd3_rlist_end(& stream->iopt_used, r3 = xd3_rlist_next (r2)))
2772 XD3_ASSERT (xd3_iopt_check (stream));
2774 XD3_ASSERT (!force || xd3_rlist_length (& stream->iopt_used) == 0);
2780 xd3_iopt_get_slot (xd3_stream *stream, xd3_rinst** iptr)
2785 if (xd3_rlist_empty (& stream->iopt_free))
2787 if (stream->iopt_unlimited)
2789 usize_t elts = XD3_ALLOCSIZE / sizeof(xd3_rinst);
2791 if ((ret = xd3_alloc_iopt (stream, elts)))
2796 stream->iopt_size += elts;
2800 if ((ret = xd3_iopt_flush_instructions (stream, 0))) { return ret; }
2802 XD3_ASSERT (! xd3_rlist_empty (& stream->iopt_free));
2806 i = xd3_rlist_pop_back (& stream->iopt_free);
2808 xd3_rlist_push_back (& stream->iopt_used, i);
2812 ++stream->i_slots_used;
2817 /* A copy is about to be emitted that extends backwards to POS,
2818 * therefore it may completely cover some existing instructions in the
2819 * buffer. If an instruction is completely covered by this new match,
2820 * erase it. If the new instruction is covered by the previous one,
2821 * return 1 to skip it. */
2823 xd3_iopt_erase (xd3_stream *stream, usize_t pos, usize_t size)
2825 while (! xd3_rlist_empty (& stream->iopt_used))
2827 xd3_rinst *r = xd3_rlist_back (& stream->iopt_used);
2829 /* Verify that greedy is working. The previous instruction
2830 * should end before the new one begins. */
2831 XD3_ASSERT ((stream->flags & XD3_BEGREEDY) == 0 || (r->pos + r->size <= pos));
2832 /* Verify that min_match is working. The previous instruction
2833 * should end before the new one ends. */
2834 XD3_ASSERT ((stream->flags & XD3_BEGREEDY) != 0 || (r->pos + r->size < pos + size));
2836 /* See if the last instruction starts before the new
2837 * instruction. If so, there is nothing to erase. */
2843 /* Otherwise, the new instruction covers the old one, delete it
2845 xd3_rlist_remove (r);
2846 xd3_rlist_push_back (& stream->iopt_free, r);
2847 --stream->i_slots_used;
2851 /* This function tells the last matched input position. */
2853 xd3_iopt_last_matched (xd3_stream *stream)
2857 if (xd3_rlist_empty (& stream->iopt_used))
2862 r = xd3_rlist_back (& stream->iopt_used);
2864 return r->pos + r->size;
2867 /*********************************************************
2869 ***********************************************************/
2872 xd3_emit_single (xd3_stream *stream, xd3_rinst *single, usize_t code)
2874 int has_size = stream->code_table[code].size1 == 0;
2877 IF_DEBUG2 (DP(RINT "[emit1] %u %s (%u) code %u\n",
2879 xd3_rtype_to_string ((xd3_rtype) single->type, 0),
2883 if ((ret = xd3_emit_byte (stream, & INST_TAIL (stream), code)))
2890 if ((ret = xd3_emit_size (stream, & INST_TAIL (stream), single->size)))
2900 xd3_emit_double (xd3_stream *stream, xd3_rinst *first,
2901 xd3_rinst *second, usize_t code)
2905 /* All double instructions use fixed sizes, so all we need to do is
2906 * output the instruction code, no sizes. */
2907 XD3_ASSERT (stream->code_table[code].size1 != 0 &&
2908 stream->code_table[code].size2 != 0);
2910 if ((ret = xd3_emit_byte (stream, & INST_TAIL (stream), code)))
2915 IF_DEBUG2 (DP(RINT "[emit2]: %u %s (%u) %s (%u) code %u\n",
2917 xd3_rtype_to_string ((xd3_rtype) first->type, 0),
2919 xd3_rtype_to_string ((xd3_rtype) second->type, 0),
2926 /* This enters a potential run instruction into the iopt buffer. The
2927 * position argument is relative to the target window. */
2929 xd3_emit_run (xd3_stream *stream, usize_t pos, usize_t size, uint8_t *run_c)
2934 if ((ret = xd3_iopt_get_slot (stream, & ri))) { return ret; }
2944 /* This enters a potential copy instruction into the iopt buffer. The
2945 * position argument is relative to the target window.. */
2947 xd3_found_match (xd3_stream *stream, usize_t pos,
2948 usize_t size, xoff_t addr, int is_source)
2953 if ((ret = xd3_iopt_get_slot (stream, & ri))) { return ret; }
2956 ri->xtra = is_source;
2965 xd3_emit_hdr (xd3_stream *stream)
2968 int use_secondary = stream->sec_type != NULL;
2969 int use_adler32 = stream->flags & (XD3_ADLER32 | XD3_ADLER32_RECODE);
2970 int vcd_source = xd3_encoder_used_source (stream);
2971 usize_t win_ind = 0;
2972 usize_t del_ind = 0;
2979 if (stream->current_window == 0)
2981 usize_t hdr_ind = 0;
2982 int use_appheader = stream->enc_appheader != NULL;
2984 if (use_secondary) { hdr_ind |= VCD_SECONDARY; }
2985 if (use_appheader) { hdr_ind |= VCD_APPHEADER; }
2987 if ((ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
2988 VCDIFF_MAGIC1)) != 0 ||
2989 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
2990 VCDIFF_MAGIC2)) != 0 ||
2991 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
2992 VCDIFF_MAGIC3)) != 0 ||
2993 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
2994 VCDIFF_VERSION)) != 0 ||
2995 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), hdr_ind)) != 0)
3000 /* Secondary compressor ID */
3002 if (use_secondary &&
3003 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3004 stream->sec_type->id)))
3010 /* Application header */
3013 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream),
3014 stream->enc_appheadsz)) ||
3015 (ret = xd3_emit_bytes (stream, & HDR_TAIL (stream),
3016 stream->enc_appheader,
3017 stream->enc_appheadsz)))
3024 /* try to compress this window */
3032 # define ENCODE_SECONDARY_SECTION(UPPER,LOWER) \
3033 ((stream->flags & XD3_SEC_NO ## UPPER) == 0 && \
3034 (ret = xd3_encode_secondary (stream, \
3035 & UPPER ## _HEAD (stream), \
3036 & UPPER ## _TAIL (stream), \
3037 & xd3_sec_ ## LOWER (stream), \
3038 & stream->sec_ ## LOWER, \
3041 if (ENCODE_SECONDARY_SECTION (DATA, data) ||
3042 ENCODE_SECONDARY_SECTION (INST, inst) ||
3043 ENCODE_SECONDARY_SECTION (ADDR, addr))
3048 del_ind |= (data_sec ? VCD_DATACOMP : 0);
3049 del_ind |= (inst_sec ? VCD_INSTCOMP : 0);
3050 del_ind |= (addr_sec ? VCD_ADDRCOMP : 0);
3054 /* if (vcd_target) { win_ind |= VCD_TARGET; } */
3055 if (vcd_source) { win_ind |= VCD_SOURCE; }
3056 if (use_adler32) { win_ind |= VCD_ADLER32; }
3058 /* window indicator */
3059 if ((ret = xd3_emit_byte (stream, & HDR_TAIL (stream), win_ind)))
3067 /* or (vcd_target) { ... } */
3068 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream),
3069 stream->src->srclen)) ||
3070 (ret = xd3_emit_offset (stream, & HDR_TAIL (stream),
3071 stream->src->srcbase))) { return ret; }
3074 tgt_len = stream->avail_in;
3075 data_len = xd3_sizeof_output (DATA_HEAD (stream));
3076 inst_len = xd3_sizeof_output (INST_HEAD (stream));
3077 addr_len = xd3_sizeof_output (ADDR_HEAD (stream));
3079 /* The enc_len field is a redundency for future extensions.*/
3080 enc_len = (1 + (xd3_sizeof_size (tgt_len) +
3081 xd3_sizeof_size (data_len) +
3082 xd3_sizeof_size (inst_len) +
3083 xd3_sizeof_size (addr_len)) +
3087 (use_adler32 ? 4 : 0));
3089 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream), enc_len)) ||
3090 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), tgt_len)) ||
3091 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), del_ind)) ||
3092 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), data_len)) ||
3093 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), inst_len)) ||
3094 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), addr_len)))
3104 if (stream->flags & XD3_ADLER32)
3106 a32 = adler32 (1L, stream->next_in, stream->avail_in);
3110 a32 = stream->recode_adler32;
3114 send[0] = (uint8_t) (a32 >> 24);
3115 send[1] = (uint8_t) (a32 >> 16);
3116 send[2] = (uint8_t) (a32 >> 8);
3117 send[3] = (uint8_t) (a32 & 0x000000FFU);
3119 if ((ret = xd3_emit_bytes (stream, & HDR_TAIL (stream), send, 4)))
3128 /****************************************************************
3130 ****************************************************************/
3133 xd3_encode_buffer_leftover (xd3_stream *stream)
3138 /* Allocate the buffer. */
3139 if (stream->buf_in == NULL &&
3140 (stream->buf_in = (uint8_t*) xd3_alloc (stream, stream->winsize, 1)) == NULL)
3145 IF_DEBUG2 (DP(RINT "[leftover] flush?=%s\n", (stream->flags & XD3_FLUSH) ? "yes" : "no"));
3147 /* Take leftover input first. */
3148 if (stream->buf_leftover != NULL)
3150 XD3_ASSERT (stream->buf_avail == 0);
3151 XD3_ASSERT (stream->buf_leftavail < stream->winsize);
3153 IF_DEBUG2 (DP(RINT "[leftover] previous %u avail %u\n", stream->buf_leftavail, stream->avail_in));
3155 memcpy (stream->buf_in, stream->buf_leftover, stream->buf_leftavail);
3157 stream->buf_leftover = NULL;
3158 stream->buf_avail = stream->buf_leftavail;
3161 /* Copy into the buffer. */
3162 room = stream->winsize - stream->buf_avail;
3163 take = min (room, stream->avail_in);
3165 memcpy (stream->buf_in + stream->buf_avail, stream->next_in, take);
3167 stream->buf_avail += take;
3169 if (take < stream->avail_in)
3171 /* Buffer is full */
3172 stream->buf_leftover = stream->next_in + take;
3173 stream->buf_leftavail = stream->avail_in - take;
3175 else if ((stream->buf_avail < stream->winsize) && !(stream->flags & XD3_FLUSH))
3177 /* Buffer has space */
3178 IF_DEBUG2 (DP(RINT "[leftover] emptied %u\n", take));
3182 /* Use the buffer: */
3183 IF_DEBUG2 (DP(RINT "[leftover] take %u remaining %u\n", take, stream->buf_leftavail));
3184 stream->next_in = stream->buf_in;
3185 stream->avail_in = stream->buf_avail;
3186 stream->buf_avail = 0;
3191 /* Allocates one block of xd3_rlist elements */
3193 xd3_alloc_iopt (xd3_stream *stream, usize_t elts)
3196 xd3_iopt_buflist* last =
3197 (xd3_iopt_buflist*) xd3_alloc (stream, sizeof (xd3_iopt_buflist), 1);
3200 (last->buffer = (xd3_rinst*) xd3_alloc (stream, sizeof (xd3_rinst), elts)) == NULL)
3205 last->next = stream->iopt_alloc;
3206 stream->iopt_alloc = last;
3208 for (i = 0; i < elts; i += 1)
3210 xd3_rlist_push_back (& stream->iopt_free, & last->buffer[i]);
3216 /* This function allocates all memory initially used by the encoder. */
3218 xd3_encode_init (xd3_stream *stream, int full_init)
3224 int large_comp = (stream->src != NULL);
3225 int small_comp = ! (stream->flags & XD3_NOCOMPRESS);
3227 /* Memory allocations for checksum tables are delayed until
3228 * xd3_string_match_init in the first call to string_match--that way
3229 * identical or short inputs require no table allocation. */
3232 usize_t hash_values = stream->src->max_winsize /
3233 stream->smatcher.large_step;
3235 xd3_size_hashtable (stream,
3237 & stream->large_hash);
3242 /* TODO: This is under devel: used to have min(sprevsz) here, which sort
3243 * of makes sense, but observed fast performance w/ larger tables, which
3244 * also sort of makes sense. @@@ */
3245 usize_t hash_values = stream->winsize;
3247 xd3_size_hashtable (stream,
3249 & stream->small_hash);
3254 for (i = 0; i < ENC_SECTS; i += 1)
3256 if ((stream->enc_heads[i] =
3257 stream->enc_tails[i] =
3258 xd3_alloc_output (stream, NULL)) == NULL)
3265 xd3_rlist_init (& stream->iopt_used);
3266 xd3_rlist_init (& stream->iopt_free);
3268 if (xd3_alloc_iopt (stream, stream->iopt_size) != 0) { goto fail; }
3270 XD3_ASSERT (xd3_rlist_length (& stream->iopt_free) == stream->iopt_size);
3271 XD3_ASSERT (xd3_rlist_length (& stream->iopt_used) == 0);
3273 /* address cache, code table */
3274 stream->acache.s_near = stream->code_table_desc->near_modes;
3275 stream->acache.s_same = stream->code_table_desc->same_modes;
3276 stream->code_table = stream->code_table_func ();
3278 return xd3_alloc_cache (stream);
3286 xd3_encode_init_full (xd3_stream *stream)
3288 return xd3_encode_init (stream, 1);
3292 xd3_encode_init_partial (xd3_stream *stream)
3294 return xd3_encode_init (stream, 0);
3297 /* Called after the ENC_POSTOUT state, this puts the output buffers
3298 * back into separate lists and re-initializes some variables. (The
3299 * output lists were spliced together during the ENC_FLUSH state.) */
3301 xd3_encode_reset (xd3_stream *stream)
3306 stream->avail_in = 0;
3307 stream->small_reset = 1;
3308 stream->i_slots_used = 0;
3310 if (stream->src != NULL)
3312 stream->src->srcbase = 0;
3313 stream->src->srclen = 0;
3314 stream->srcwin_decided = 0;
3315 stream->srcwin_decided_early = 0;
3316 stream->match_minaddr = 0;
3317 stream->match_maxaddr = 0;
3321 /* Reset output chains. */
3322 olist = stream->enc_heads[0];
3324 for (i = 0; i < ENC_SECTS; i += 1)
3326 XD3_ASSERT (olist != NULL);
3328 stream->enc_heads[i] = olist;
3329 stream->enc_tails[i] = olist;
3330 olist = olist->next_page;
3332 stream->enc_heads[i]->next = 0;
3333 stream->enc_heads[i]->next_page = NULL;
3335 stream->enc_tails[i]->next_page = NULL;
3336 stream->enc_tails[i] = stream->enc_heads[i];
3339 xd3_freelist_output (stream, olist);
3342 /* The main encoding routine. */
3344 xd3_encode_input (xd3_stream *stream)
3348 if (stream->dec_state != 0)
3350 stream->msg = "encoder/decoder transition";
3351 return XD3_INTERNAL;
3354 switch (stream->enc_state)
3357 /* Only reached on first time through: memory setup. */
3358 if ((ret = xd3_encode_init_full (stream))) { return ret; }
3360 stream->enc_state = ENC_INPUT;
3364 /* If there is no input yet, just return. This checks for
3365 * next_in == NULL, not avail_in == 0 since zero bytes is a
3366 * valid input. There is an assertion in xd3_avail_input() that
3367 * next_in != NULL for this reason. By returning right away we
3368 * avoid creating an input buffer before the caller has supplied
3369 * its first data. It is possible for xd3_avail_input to be
3370 * called both before and after the first call to
3371 * xd3_encode_input(). */
3372 if (stream->next_in == NULL)
3378 /* See if we should buffer the input: either if there is already
3379 * a leftover buffer, or if the input is short of winsize
3380 * without flush. The label at this point is reached by a goto
3381 * below, when there is leftover input after postout. */
3382 if ((stream->buf_leftover != NULL) ||
3383 (stream->buf_avail != 0) ||
3384 (stream->avail_in < stream->winsize && ! (stream->flags & XD3_FLUSH)))
3386 if ((ret = xd3_encode_buffer_leftover (stream))) { return ret; }
3389 /* Initalize the address cache before each window. */
3390 xd3_init_cache (& stream->acache);
3392 stream->input_position = 0;
3393 stream->min_match = MIN_MATCH;
3394 stream->unencoded_offset = 0;
3396 stream->enc_state = ENC_SEARCH;
3398 IF_DEBUG2 (DP(RINT "[WINSTART:%"Q"u] input bytes %u offset %"Q"u\n",
3399 stream->current_window, stream->avail_in,
3401 return XD3_WINSTART;
3404 IF_DEBUG2 (DP(RINT "[SEARCH] match_state %d avail_in %u %s\n",
3405 stream->match_state, stream->avail_in,
3406 stream->src ? "source" : "no source"));
3408 /* Reentrant matching. */
3409 if (stream->src != NULL)
3411 switch (stream->match_state)
3414 /* Try matching forward at the start of the target.
3415 * This is entered the first time through, to check for
3416 * a perfect match, and whenever there is a source match
3417 * that extends to the end of the previous window. The
3418 * match_srcpos field is initially zero and later set
3419 * during xd3_source_extend_match. */
3421 if (stream->avail_in > 0)
3423 /* This call can't fail because the source window is
3425 ret = xd3_source_match_setup (stream, stream->match_srcpos);
3426 XD3_ASSERT (ret == 0);
3427 stream->match_state = MATCH_FORWARD;
3431 stream->match_state = MATCH_SEARCHING;
3432 stream->match_fwd = 0;
3434 XD3_ASSERT (stream->match_fwd == 0);
3437 case MATCH_BACKWARD:
3438 if (stream->avail_in != 0)
3440 if ((ret = xd3_source_extend_match (stream)) != 0)
3445 /* The search has to make forward progress here
3446 * or else it can get stuck in a match-backward
3447 * (getsrcblk) then match-forward (getsrcblk),
3448 * find insufficient match length, then repeat
3449 * exactly the same search.
3451 stream->input_position += stream->match_fwd;
3454 case MATCH_SEARCHING:
3455 /* Continue string matching. (It's possible that the
3456 * initial match continued through the entire input, in
3457 * which case we're still in MATCH_FORWARD and should
3458 * remain so for the next input window.) */
3463 /* String matching... */
3464 if (stream->avail_in != 0 &&
3465 (ret = stream->smatcher.string_match (stream)))
3470 stream->enc_state = ENC_INSTR;
3473 /* Note: Jump here to encode VCDIFF deltas w/o using this
3474 * string-matching code. Merging code code enters here. */
3476 /* Flush the instrution buffer, then possibly add one more
3477 * instruction, then emit the header. */
3478 if ((ret = xd3_iopt_flush_instructions (stream, 1)) ||
3479 (ret = xd3_iopt_add_finalize (stream)))
3484 stream->enc_state = ENC_FLUSH;
3487 /* Note: main_recode_func() bypasses string-matching by setting
3489 if ((ret = xd3_emit_hdr (stream)))
3495 stream->enc_current = HDR_HEAD (stream);
3497 /* Chain all the outputs together. After doing this, it looks
3498 * as if there is only one section. The other enc_heads are set
3499 * to NULL to avoid freeing them more than once. */
3500 for (i = 1; i < ENC_SECTS; i += 1)
3502 stream->enc_tails[i-1]->next_page = stream->enc_heads[i];
3503 stream->enc_heads[i] = NULL;
3508 stream->enc_state = ENC_POSTOUT;
3509 stream->next_out = stream->enc_current->base;
3510 stream->avail_out = stream->enc_current->next;
3511 stream->total_out += (xoff_t) stream->avail_out;
3513 /* If there is any output in this buffer, return it, otherwise
3514 * fall through to handle the next buffer or finish the window
3515 * after all buffers have been output. */
3516 if (stream->avail_out > 0)
3518 /* This is the only place xd3_encode returns XD3_OUTPUT */
3524 if (stream->avail_out != 0)
3526 stream->msg = "missed call to consume output";
3527 return XD3_INTERNAL;
3530 /* Continue outputting one buffer at a time, until the next is NULL. */
3531 if ((stream->enc_current = stream->enc_current->next_page) != NULL)
3536 stream->total_in += (xoff_t) stream->avail_in;
3537 stream->enc_state = ENC_POSTWIN;
3539 IF_DEBUG2 (DP(RINT "[WINFINISH:%"Q"u] in=%"Q"u\n",
3540 stream->current_window,
3542 return XD3_WINFINISH;
3546 xd3_encode_reset (stream);
3548 stream->current_window += 1;
3549 stream->enc_state = ENC_INPUT;
3551 /* If there is leftover input to flush, repeat. */
3552 if (stream->buf_leftover != NULL)
3557 /* Ready for more input. */
3561 stream->msg = "invalid state";
3562 return XD3_INTERNAL;
3565 #endif /* XD3_ENCODER */
3567 /*****************************************************************
3568 Client convenience functions
3569 ******************************************************************/
3572 xd3_process_stream (int is_encode,
3574 int (*func) (xd3_stream *),
3576 const uint8_t *input,
3579 usize_t *output_size,
3580 usize_t output_size_max)
3583 usize_t n = min(stream->winsize, input_size);
3587 stream->flags |= XD3_FLUSH;
3589 xd3_avail_input (stream, input + ipos, n);
3595 switch ((ret = func (stream)))
3597 case XD3_OUTPUT: { /* memcpy below */ break; }
3599 n = min(stream->winsize, input_size - ipos);
3604 xd3_avail_input (stream, input + ipos, n);
3608 case XD3_GOTHEADER: { /* ignore */ continue; }
3609 case XD3_WINSTART: { /* ignore */ continue; }
3610 case XD3_WINFINISH: { /* ignore */ continue; }
3613 /* When the getblk function is NULL, it is necessary to
3614 * provide the complete source as a single block using
3615 * xd3_set_source_and_size, otherwise this error. The
3616 * library should never ask for another source block. */
3617 stream->msg = "library requested source block";
3618 return XD3_INTERNAL;
3622 /* xd3_encode_input/xd3_decode_input never return 0 */
3623 stream->msg = "invalid return: 0";
3624 return XD3_INTERNAL;
3630 if (*output_size + stream->avail_out > output_size_max)
3632 stream->msg = "insufficient output space";
3636 memcpy (output + *output_size, stream->next_out, stream->avail_out);
3638 *output_size += stream->avail_out;
3640 xd3_consume_output (stream);
3643 return (close_stream == 0) ? 0 : xd3_close_stream (stream);
3647 xd3_process_memory (int is_encode,
3648 int (*func) (xd3_stream *),
3649 const uint8_t *input,
3651 const uint8_t *source,
3652 usize_t source_size,
3654 usize_t *output_size,
3655 usize_t output_size_max,
3662 memset (& stream, 0, sizeof (stream));
3663 memset (& config, 0, sizeof (config));
3665 if (input == NULL || output == NULL) {
3666 stream.msg = "invalid input/output buffer";
3671 config.flags = flags;
3675 config.winsize = min(input_size, (usize_t) XD3_DEFAULT_WINSIZE);
3676 config.sprevsz = xd3_pow2_roundup (config.winsize);
3679 if ((ret = xd3_config_stream (&stream, &config)) != 0)
3686 memset (& src, 0, sizeof (src));
3688 src.blksize = source_size;
3689 src.onblk = source_size;
3690 src.curblk = source;
3692 src.max_winsize = source_size;
3694 if ((ret = xd3_set_source_and_size (&stream, &src, source_size)) != 0)
3700 if ((ret = xd3_process_stream (is_encode,
3706 output_size_max)) != 0)
3714 IF_DEBUG2 (DP(RINT "process_memory: %d: %s\n", ret, stream.msg));
3716 xd3_free_stream(&stream);
3721 xd3_decode_stream (xd3_stream *stream,
3722 const uint8_t *input,
3725 usize_t *output_size,
3726 usize_t output_size_max)
3728 return xd3_process_stream (0, stream, & xd3_decode_input, 1,
3730 output, output_size, output_size_max);
3734 xd3_decode_memory (const uint8_t *input,
3736 const uint8_t *source,
3737 usize_t source_size,
3739 usize_t *output_size,
3740 usize_t output_size_max,
3742 return xd3_process_memory (0, & xd3_decode_input,
3744 source, source_size,
3745 output, output_size, output_size_max,
3752 xd3_encode_stream (xd3_stream *stream,
3753 const uint8_t *input,
3756 usize_t *output_size,
3757 usize_t output_size_max)
3759 return xd3_process_stream (1, stream, & xd3_encode_input, 1,
3761 output, output_size, output_size_max);
3765 xd3_encode_memory (const uint8_t *input,
3767 const uint8_t *source,
3768 usize_t source_size,
3770 usize_t *output_size,
3771 usize_t output_size_max,
3773 return xd3_process_memory (1, & xd3_encode_input,
3775 source, source_size,
3776 output, output_size, output_size_max,
3782 /*************************************************************
3783 String matching helpers
3784 *************************************************************/
3787 /* Do the initial xd3_string_match() checksum table setup.
3788 * Allocations are delayed until first use to avoid allocation
3789 * sometimes (e.g., perfect matches, zero-length inputs). */
3791 xd3_string_match_init (xd3_stream *stream)
3793 const int DO_SMALL = ! (stream->flags & XD3_NOCOMPRESS);
3794 const int DO_LARGE = (stream->src != NULL);
3796 if (DO_LARGE && stream->large_table == NULL)
3798 if ((stream->large_table =
3799 (usize_t*) xd3_alloc0 (stream, stream->large_hash.size, sizeof (usize_t))) == NULL)
3807 /* Subsequent calls can return immediately after checking reset. */
3808 if (stream->small_table != NULL)
3810 /* The target hash table is reinitialized once per window. */
3811 /* TODO: This would not have to be reinitialized if absolute
3812 * offsets were being stored. */
3813 if (stream->small_reset)
3815 stream->small_reset = 0;
3816 memset (stream->small_table, 0,
3817 sizeof (usize_t) * stream->small_hash.size);
3823 if ((stream->small_table =
3824 (usize_t*) xd3_alloc0 (stream,
3825 stream->small_hash.size,
3826 sizeof (usize_t))) == NULL)
3831 /* If there is a previous table needed. */
3832 if (stream->smatcher.small_lchain > 1 ||
3833 stream->smatcher.small_chain > 1)
3835 if ((stream->small_prev =
3836 (xd3_slist*) xd3_alloc (stream,
3838 sizeof (xd3_slist))) == NULL)
3848 #if XD3_USE_LARGEFILE64
3849 /* This function handles the 32/64bit ambiguity -- file positions are 64bit
3850 * but the hash table for source-offsets is 32bit. */
3852 xd3_source_cksum_offset(xd3_stream *stream, usize_t low)
3854 xoff_t scp = stream->srcwin_cksum_pos;
3855 xoff_t s0 = scp >> 32;
3857 usize_t sr = (usize_t) scp;
3863 /* This should not be >= because srcwin_cksum_pos is the next
3864 * position to index. */
3866 return (--s0 << 32) | low;
3869 return (s0 << 32) | low;
3873 xd3_source_cksum_offset(xd3_stream *stream, usize_t low)
3875 return (xoff_t) low;
3879 /* This function sets up the stream->src fields srcbase, srclen. The
3880 * call is delayed until these values are needed to encode a copy
3881 * address. At this point the decision has to be made. */
3883 xd3_srcwin_setup (xd3_stream *stream)
3885 xd3_source *src = stream->src;
3888 /* Check the undecided state. */
3889 XD3_ASSERT (src->srclen == 0 && src->srcbase == 0);
3891 /* Avoid repeating this call. */
3892 stream->srcwin_decided = 1;
3894 /* If the stream is flushing, then the iopt buffer was able to
3895 * contain the complete encoding. If no copies were issued no
3896 * source window is actually needed. This prevents the VCDIFF
3897 * header from including source base/len. xd3_emit_hdr checks for
3899 if (stream->enc_state == ENC_INSTR && stream->match_maxaddr == 0)
3904 /* Check for overflow, srclen is usize_t - this can't happen unless
3905 * XD3_DEFAULT_SRCBACK and related parameters are extreme - should
3906 * use smaller windows. */
3907 length = stream->match_maxaddr - stream->match_minaddr;
3909 x = (xoff_t) USIZE_T_MAX;
3912 stream->msg = "source window length overflow (not 64bit)";
3913 return XD3_INTERNAL;
3916 /* If ENC_INSTR, then we know the exact source window to use because
3917 * no more copies can be issued. */
3918 if (stream->enc_state == ENC_INSTR)
3920 src->srcbase = stream->match_minaddr;
3921 src->srclen = (usize_t) length;
3922 XD3_ASSERT (src->srclen);
3926 /* Otherwise, we have to make a guess. More copies may still be
3927 * issued, but we have to decide the source window base and length
3929 src->srcbase = stream->match_minaddr;
3930 src->srclen = max ((usize_t) length,
3931 stream->avail_in + (stream->avail_in >> 2));
3935 /* Note: if the source size is known, we must reduce srclen or
3936 * code that expects to pass a single block w/ getblk == NULL
3937 * will not function, as the code will return GETSRCBLK asking
3938 * for the second block. */
3939 src->srclen = min (src->srclen, xd3_source_eof(src) - src->srcbase);
3942 IF_DEBUG1 (DP(RINT "[srcwin_setup_constrained] base %llu len %llu\n",
3943 src->srcbase, src->srclen));
3945 XD3_ASSERT (src->srclen);
3947 /* Set the taroff. This convenience variable is used even when
3948 stream->src == NULL. */
3949 stream->taroff = src->srclen;
3953 /* Sets the bounding region for a newly discovered source match, prior
3954 * to calling xd3_source_extend_match(). This sets the match_maxfwd,
3955 * match_maxback variables. Note: srcpos is an absolute position
3956 * (xoff_t) but the match_maxfwd, match_maxback variables are usize_t.
3957 * Returns 0 if the setup succeeds, or 1 if the source position lies
3958 * outside an already-decided srcbase/srclen window. */
3960 xd3_source_match_setup (xd3_stream *stream, xoff_t srcpos)
3962 xd3_source *src = stream->src;
3963 usize_t greedy_or_not;
3964 xoff_t frontier_pos;
3966 stream->match_maxback = 0;
3967 stream->match_maxfwd = 0;
3968 stream->match_back = 0;
3969 stream->match_fwd = 0;
3971 /* This avoids a non-blocking endless loop caused by scanning
3972 * backwards across a block boundary, only to find not enough
3973 * matching bytes to beat the current min_match due to a better lazy
3974 * target match: the re-entry to xd3_string_match() repeats the same
3975 * long match because the input position hasn't changed. TODO: if
3976 * ever duplicates are added to the source hash table, this logic
3977 * won't suffice to avoid loops. See testing/regtest.cc's
3978 * TestNonBlockingProgress test! */
3979 if (srcpos != 0 && srcpos == stream->match_last_srcpos)
3981 IF_DEBUG2(DP(RINT "[match_setup] looping failure\n"));
3985 /* Implement src->max_winsize, which prevents the encoder from seeking
3986 * back further than the LRU cache maintaining FIFO discipline, (to
3987 * avoid seeking). */
3988 frontier_pos = stream->src->frontier_blkno * stream->src->blksize;
3989 IF_DEBUG2(DP(RINT "[match_setup] frontier_pos %"Q"u, srcpos %"Q"u, "
3990 "src->max_winsize %"Q"u\n",
3991 frontier_pos, srcpos, stream->src->max_winsize));
3992 if (srcpos < frontier_pos &&
3993 frontier_pos - srcpos > stream->src->max_winsize) {
3994 IF_DEBUG1(DP(RINT "[match_setup] rejected due to src->max_winsize "
3995 "distance eof=%"Q"u srcpos=%"Q"u maxsz=%"Q"u\n",
3996 xd3_source_eof (stream->src),
3997 srcpos, stream->src->max_winsize));
4001 /* Going backwards, the 1.5-pass algorithm allows some
4002 * already-matched input may be covered by a longer source match.
4003 * The greedy algorithm does not allow this. */
4004 if (stream->flags & XD3_BEGREEDY)
4006 /* The greedy algorithm allows backward matching to the last
4007 matched position. */
4008 greedy_or_not = xd3_iopt_last_matched (stream);
4012 /* The 1.5-pass algorithm allows backward matching to go back as
4013 * far as the unencoded offset, which is updated as instructions
4014 * pass out of the iopt buffer. If this (default) is chosen, it
4015 * means xd3_iopt_erase may be called to eliminate instructions
4016 * when a covering source match is found. */
4017 greedy_or_not = stream->unencoded_offset;
4020 /* Backward target match limit. */
4021 XD3_ASSERT (stream->input_position >= greedy_or_not);
4022 stream->match_maxback = stream->input_position - greedy_or_not;
4024 /* Forward target match limit. */
4025 XD3_ASSERT (stream->avail_in > stream->input_position);
4026 stream->match_maxfwd = stream->avail_in - stream->input_position;
4028 /* Now we take the source position into account. It depends whether
4029 * the srclen/srcbase have been decided yet. */
4030 if (stream->srcwin_decided == 0)
4032 /* Unrestricted case: the match can cover the entire source,
4033 * 0--src->size. We compare the usize_t
4034 * match_maxfwd/match_maxback against the xoff_t
4035 * src->size/srcpos values and take the min. */
4036 if (srcpos < (xoff_t) stream->match_maxback)
4038 stream->match_maxback = (usize_t) srcpos;
4041 if (stream->src->eof_known)
4043 xoff_t srcavail = xd3_source_eof (stream->src) - srcpos;
4045 if (srcavail < (xoff_t) stream->match_maxfwd)
4047 stream->match_maxfwd = (usize_t) srcavail;
4052 "[match_setup] srcpos %"Q"u (tgtpos %"Q"u) "
4053 "unrestricted maxback %u maxfwd %u\n",
4055 stream->total_in + stream->input_position,
4056 stream->match_maxback,
4057 stream->match_maxfwd));
4061 /* Decided some source window. */
4062 XD3_ASSERT (src->srclen > 0);
4064 /* Restricted case: fail if the srcpos lies outside the source window */
4065 if ((srcpos < src->srcbase) ||
4066 (srcpos > (src->srcbase + (xoff_t) src->srclen)))
4068 IF_DEBUG1(DP(RINT "[match_setup] restricted source window failure\n"));
4075 srcavail = (usize_t) (srcpos - src->srcbase);
4076 if (srcavail < stream->match_maxback)
4078 stream->match_maxback = srcavail;
4081 srcavail = (usize_t) (src->srcbase + (xoff_t) src->srclen - srcpos);
4082 if (srcavail < stream->match_maxfwd)
4084 stream->match_maxfwd = srcavail;
4088 "[match_setup] srcpos %"Q"u (tgtpos %"Q"u) "
4089 "restricted maxback %u maxfwd %u\n",
4091 stream->total_in + stream->input_position,
4092 stream->match_maxback,
4093 stream->match_maxfwd));
4098 stream->match_state = MATCH_BACKWARD;
4099 stream->match_srcpos = srcpos;
4100 stream->match_last_srcpos = srcpos;
4104 stream->match_state = MATCH_SEARCHING;
4109 xd3_forward_match(const uint8_t *s1c, const uint8_t *s2c, int n)
4113 int nint = n / sizeof(int);
4118 const int *s1 = (const int*)s1c;
4119 const int *s2 = (const int*)s2c;
4120 int nint_8 = nint - 8;
4122 while (i <= nint_8 &&
4123 s1[i++] == s2[j++] &&
4124 s1[i++] == s2[j++] &&
4125 s1[i++] == s2[j++] &&
4126 s1[i++] == s2[j++] &&
4127 s1[i++] == s2[j++] &&
4128 s1[i++] == s2[j++] &&
4129 s1[i++] == s2[j++] &&
4130 s1[i++] == s2[j++]) { }
4132 i = (i - 1) * sizeof(int);
4136 while (i < n && s1c[i] == s2c[i])
4143 /* This function expands the source match backward and forward. It is
4144 * reentrant, since xd3_getblk may return XD3_GETSRCBLK, so most
4145 * variables are kept in xd3_stream. There are two callers of this
4146 * function, the string_matching routine when a checksum match is
4147 * discovered, and xd3_encode_input whenever a continuing (or initial)
4148 * match is suspected. The two callers do different things with the
4149 * input_position, thus this function leaves that variable untouched.
4150 * If a match is taken the resulting stream->match_fwd is left
4153 xd3_source_extend_match (xd3_stream *stream)
4156 xd3_source *src = stream->src;
4157 xoff_t matchoff; /* matchoff is the current right/left-boundary of
4158 the source match being tested. */
4159 usize_t streamoff; /* streamoff is the current right/left-boundary
4160 of the input match being tested. */
4161 xoff_t tryblk; /* tryblk, tryoff are the block, offset position
4164 usize_t tryrem; /* tryrem is the number of matchable bytes */
4167 IF_DEBUG2(DP(RINT "[extend match] srcpos %"Q"u\n",
4168 stream->match_srcpos));
4170 XD3_ASSERT (src != NULL);
4172 /* Does it make sense to compute backward match AFTER forward match? */
4173 if (stream->match_state == MATCH_BACKWARD)
4175 /* Note: this code is practically duplicated below, substituting
4176 * match_fwd/match_back and direction. TODO: Consolidate? */
4177 matchoff = stream->match_srcpos - stream->match_back;
4178 streamoff = stream->input_position - stream->match_back;
4179 xd3_blksize_div (matchoff, src, &tryblk, &tryoff);
4181 /* this loops backward over source blocks */
4182 while (stream->match_back < stream->match_maxback)
4184 /* see if we're backing across a source block boundary */
4187 tryoff = src->blksize;
4191 if ((ret = xd3_getblk (stream, tryblk)))
4193 /* if search went too far back, continue forward. */
4194 if (ret == XD3_TOOFARBACK)
4199 /* could be a XD3_GETSRCBLK failure. */
4203 tryrem = min (tryoff, stream->match_maxback - stream->match_back);
4205 IF_DEBUG2(DP(RINT "[maxback] maxback %u trysrc %"Q"u/%u tgt %u tryrem %u\n",
4206 stream->match_maxback, tryblk, tryoff, streamoff, tryrem));
4208 /* TODO: This code can be optimized similar to xd3_match_forward() */
4209 for (; tryrem != 0; tryrem -= 1, stream->match_back += 1)
4211 if (src->curblk[tryoff-1] != stream->next_in[streamoff-1])
4222 stream->match_state = MATCH_FORWARD;
4225 XD3_ASSERT (stream->match_state == MATCH_FORWARD);
4227 matchoff = stream->match_srcpos + stream->match_fwd;
4228 streamoff = stream->input_position + stream->match_fwd;
4229 xd3_blksize_div (matchoff, src, & tryblk, & tryoff);
4231 /* Note: practically the same code as backwards case above: same comments */
4232 while (stream->match_fwd < stream->match_maxfwd)
4234 if (tryoff == src->blksize)
4240 if ((ret = xd3_getblk (stream, tryblk)))
4242 XD3_ASSERT (ret != XD3_TOOFARBACK);
4244 /* could be a XD3_GETSRCBLK failure. */
4248 tryrem = min(stream->match_maxfwd - stream->match_fwd,
4249 src->onblk - tryoff);
4253 /* Generally, this means we have a power-of-two size source
4254 * and we just found the end-of-file, in this case it's an
4256 XD3_ASSERT (src->onblk < src->blksize);
4260 matched = xd3_forward_match(src->curblk + tryoff,
4261 stream->next_in + streamoff,
4264 streamoff += matched;
4265 stream->match_fwd += matched;
4267 if (tryrem != matched)
4273 stream->match_state = MATCH_SEARCHING;
4275 /* If the match ends short of the last instruction end, we probably
4276 * don't want it. There is the possibility that a copy ends short
4277 * of the last copy but also goes further back, in which case we
4278 * might want it. This code does not implement such: if so we would
4279 * need more complicated xd3_iopt_erase logic. */
4280 if (stream->match_fwd < stream->min_match)
4282 stream->match_fwd = 0;
4286 usize_t total = stream->match_fwd + stream->match_back;
4288 /* Correct the variables to remove match_back from the equation. */
4289 usize_t target_position = stream->input_position - stream->match_back;
4290 usize_t match_length = stream->match_back + stream->match_fwd;
4291 xoff_t match_position = stream->match_srcpos - stream->match_back;
4292 xoff_t match_end = stream->match_srcpos + stream->match_fwd;
4294 /* At this point we may have to erase any iopt-buffer
4295 * instructions that are fully covered by a backward-extending
4297 if (stream->match_back > 0)
4299 xd3_iopt_erase (stream, target_position, total);
4302 stream->match_back = 0;
4304 /* Update ranges. The first source match occurs with both
4306 if (stream->match_maxaddr == 0 ||
4307 match_position < stream->match_minaddr)
4309 stream->match_minaddr = match_position;
4312 if (match_end > stream->match_maxaddr)
4314 /* Note: per-window */
4315 stream->match_maxaddr = match_end;
4318 if (match_end > stream->maxsrcaddr)
4320 /* Note: across windows */
4321 stream->maxsrcaddr = match_end;
4326 DP(RINT "[source match:%d] <inp %"Q"u %"Q"u> <src %"Q"u %"Q"u> (%s) [ %u bytes ]\n",
4328 stream->total_in + target_position,
4329 stream->total_in + target_position + match_length,
4331 match_position + match_length,
4332 (stream->total_in + target_position == match_position) ? "same" : "diff",
4336 if ((ret = xd3_found_match (stream,
4337 /* decoder position */ target_position,
4338 /* length */ match_length,
4339 /* address */ match_position,
4340 /* is_source */ 1)))
4345 /* If the match ends with the available input: */
4346 if (target_position + match_length == stream->avail_in)
4348 /* Setup continuing match for the next window. */
4349 stream->match_state = MATCH_TARGET;
4350 stream->match_srcpos = match_end;
4357 /* Update the small hash. Values in the small_table are offset by
4358 * HASH_CKOFFSET (1) to distinguish empty buckets from real offsets. */
4360 xd3_scksum_insert (xd3_stream *stream,
4365 /* If we are maintaining previous duplicates. */
4366 if (stream->small_prev)
4368 usize_t last_pos = stream->small_table[inx];
4369 xd3_slist *pos_list = & stream->small_prev[pos & stream->sprevmask];
4371 /* Note last_pos is offset by HASH_CKOFFSET. */
4372 pos_list->last_pos = last_pos;
4375 /* Enter the new position into the hash bucket. */
4376 stream->small_table[inx] = pos + HASH_CKOFFSET;
4381 xd3_check_smatch (const uint8_t *ref0, const uint8_t *inp0,
4382 const uint8_t *inp_max, usize_t cmp_len)
4386 for (i = 0; i < cmp_len; i += 1)
4388 XD3_ASSERT (ref0[i] == inp0[i]);
4391 if (inp0 + cmp_len < inp_max)
4393 XD3_ASSERT (inp0[i] != ref0[i]);
4398 #endif /* XD3_DEBUG */
4400 /* When the hash table indicates a possible small string match, it
4401 * calls this routine to find the best match. The first matching
4402 * position is taken from the small_table, HASH_CKOFFSET is subtracted
4403 * to get the actual position. After checking that match, if previous
4404 * linked lists are in use (because stream->smatcher.small_chain > 1),
4405 * previous matches are tested searching for the longest match. If
4406 * (stream->min_match > MIN_MATCH) then a lazy match is in effect.
4409 xd3_smatch (xd3_stream *stream,
4412 usize_t *match_offset)
4415 usize_t match_length = 0;
4416 usize_t chain = (stream->min_match == MIN_MATCH ?
4417 stream->smatcher.small_chain :
4418 stream->smatcher.small_lchain);
4419 const uint8_t *inp_max = stream->next_in + stream->avail_in;
4423 SMALL_HASH_DEBUG1 (stream, stream->next_in + stream->input_position);
4425 XD3_ASSERT (stream->min_match + stream->input_position <= stream->avail_in);
4427 base -= HASH_CKOFFSET;
4431 IF_DEBUG2 (DP(RINT "smatch at base=%u inp=%u cksum=%u\n", base,
4432 stream->input_position, scksum));
4434 /* For small matches, we can always go to the end-of-input because
4435 * the matching position must be less than the input position. */
4436 XD3_ASSERT (base < stream->input_position);
4438 ref = stream->next_in + base;
4439 inp = stream->next_in + stream->input_position;
4441 SMALL_HASH_DEBUG2 (stream, ref);
4443 /* Expand potential match forward. */
4444 while (inp < inp_max && *inp == *ref)
4450 cmp_len = (usize_t)(inp - (stream->next_in + stream->input_position));
4452 /* Verify correctness */
4453 XD3_ASSERT (xd3_check_smatch (stream->next_in + base,
4454 stream->next_in + stream->input_position,
4457 /* Update longest match */
4458 if (cmp_len > match_length)
4460 ( match_length) = cmp_len;
4461 (*match_offset) = base;
4463 /* Stop if we match the entire input or have a long_enough match. */
4464 if (inp == inp_max || cmp_len >= stream->smatcher.long_enough)
4470 /* If we have not reached the chain limit, see if there is another
4471 previous position. */
4472 while (--chain != 0)
4474 /* Calculate the previous offset. */
4475 usize_t prev_pos = stream->small_prev[base & stream->sprevmask].last_pos;
4483 prev_pos -= HASH_CKOFFSET;
4485 if (prev_pos > base)
4492 XD3_ASSERT (stream->input_position > base);
4493 diff_pos = stream->input_position - base;
4495 /* Stop searching if we go beyond sprevsz, since those entries
4496 * are for unrelated checksum entries. */
4497 if (diff_pos & ~stream->sprevmask)
4506 /* Crude efficiency test: if the match is very short and very far back, it's
4507 * unlikely to help, but the exact calculation requires knowing the state of
4508 * the address cache and adjacent instructions, which we can't do here.
4509 * Rather than encode a probably inefficient copy here and check it later
4510 * (which complicates the code a lot), do this:
4512 if (match_length == 4 && stream->input_position - (*match_offset) >= 1<<14)
4514 /* It probably takes >2 bytes to encode an address >= 2^14 from here */
4517 if (match_length == 5 && stream->input_position - (*match_offset) >= 1<<21)
4519 /* It probably takes >3 bytes to encode an address >= 2^21 from here */
4523 /* It's unlikely that a window is large enough for the (match_length == 6 &&
4524 * address >= 2^28) check */
4525 return match_length;
4530 xd3_verify_small_state (xd3_stream *stream,
4535 uint32_t cksum = xd3_scksum (&state, inp, stream->smatcher.small_look);
4537 XD3_ASSERT (cksum == x_cksum);
4541 xd3_verify_large_state (xd3_stream *stream,
4545 uint32_t cksum = xd3_lcksum (inp, stream->smatcher.large_look);
4546 XD3_ASSERT (cksum == x_cksum);
4549 xd3_verify_run_state (xd3_stream *stream,
4554 usize_t slook = stream->smatcher.small_look;
4556 usize_t run_l = xd3_comprun (inp, slook, &run_c);
4558 XD3_ASSERT (run_l == 0 || run_c == *x_run_c);
4559 XD3_ASSERT (x_run_l > slook || run_l == x_run_l);
4561 #endif /* XD3_DEBUG */
4563 /* This function computes more source checksums to advance the window.
4564 * Called at every entrance to the string-match loop and each time
4565 * stream->input_position reaches the value returned as
4566 * *next_move_point. NB: this is one of the most expensive functions
4567 * in this code and also the most critical for good compression.
4568 * TODO: optimize the inner loop
4571 xd3_srcwin_move_point (xd3_stream *stream, usize_t *next_move_point)
4573 xoff_t logical_input_cksum_pos;
4576 if (stream->src->eof_known)
4578 source_size = xd3_source_eof (stream->src);
4579 XD3_ASSERT(stream->srcwin_cksum_pos <= source_size);
4581 if (stream->srcwin_cksum_pos == source_size)
4583 *next_move_point = USIZE_T_MAX;
4588 /* Begin by advancing at twice the input rate, up to half the
4589 * maximum window size. */
4590 logical_input_cksum_pos = min((stream->total_in + stream->input_position) * 2,
4591 (stream->total_in + stream->input_position) +
4592 (stream->src->max_winsize / 2));
4594 /* If srcwin_cksum_pos is already greater, wait until the difference
4596 if (stream->srcwin_cksum_pos > logical_input_cksum_pos)
4598 *next_move_point = stream->input_position +
4599 (usize_t)(stream->srcwin_cksum_pos - logical_input_cksum_pos);
4603 /* A long match may have extended past srcwin_cksum_pos. Don't
4604 * start checksumming already-matched source data. */
4605 if (stream->maxsrcaddr > stream->srcwin_cksum_pos)
4607 stream->srcwin_cksum_pos = stream->maxsrcaddr;
4610 if (logical_input_cksum_pos < stream->srcwin_cksum_pos)
4612 logical_input_cksum_pos = stream->srcwin_cksum_pos;
4615 /* Advance at least one source block. With the command-line
4616 * defaults this means:
4618 * if (src->size <= src->max_winsize), index the entire source at once
4619 * using the position of the first non-match. This is good for
4620 * small inputs, especially when the content may have moved anywhere
4621 * in the file (e.g., tar files).
4623 * if (src->size > src->max_winsize), index at least one block ahead
4624 * of the logical position. This is good for different reasons:
4625 * when a long match spanning several source blocks is encountered,
4626 * this avoids computing checksums for those blocks.
4628 logical_input_cksum_pos += stream->src->blksize;
4630 while (stream->srcwin_cksum_pos < logical_input_cksum_pos &&
4631 (!stream->src->eof_known ||
4632 stream->srcwin_cksum_pos < xd3_source_eof (stream->src)))
4635 xoff_t blkbaseoffset;
4637 ssize_t oldpos; /* Using ssize_t because of a */
4638 ssize_t blkpos; /* do { blkpos-- }
4639 while (blkpos >= oldpos); */
4641 xd3_blksize_div (stream->srcwin_cksum_pos,
4642 stream->src, &blkno, &blkrem);
4645 if ((ret = xd3_getblk (stream, blkno)))
4647 /* TOOFARBACK should never occur here, since we read forward. */
4648 if (ret == XD3_TOOFARBACK)
4653 "[srcwin_move_point] async getblk return for %"Q"u\n",
4659 "[srcwin_move_point] T=%"Q"u{%"Q"u} S=%"Q"u EOF=%"Q"u %s\n",
4660 stream->total_in + stream->input_position,
4661 logical_input_cksum_pos,
4662 stream->srcwin_cksum_pos,
4663 xd3_source_eof (stream->src),
4664 stream->src->eof_known ? "known" : "unknown"));
4666 blkpos = xd3_bytes_on_srcblk (stream->src, blkno);
4668 if (blkpos < (ssize_t) stream->smatcher.large_look)
4670 stream->srcwin_cksum_pos = (blkno + 1) * stream->src->blksize;
4671 IF_DEBUG1 (DP(RINT "[srcwin_move_point] continue (end-of-block)\n"));
4675 /* This inserts checksums for the entire block, in reverse,
4676 * starting from the end of the block. This logic does not test
4677 * stream->srcwin_cksum_pos because it always advances it to the
4678 * start of the next block.
4680 * oldpos is the srcwin_cksum_pos within this block. blkpos is
4681 * the number of bytes available. Each iteration inspects
4682 * large_look bytes then steps back large_step bytes. The
4683 * if-stmt above ensures at least one large_look of data. */
4684 blkpos -= stream->smatcher.large_look;
4685 blkbaseoffset = stream->src->blksize * blkno;
4689 uint32_t cksum = xd3_lcksum (stream->src->curblk + blkpos,
4690 stream->smatcher.large_look);
4691 usize_t hval = xd3_checksum_hash (& stream->large_hash, cksum);
4693 stream->large_table[hval] =
4694 (usize_t) (blkbaseoffset +
4695 (xoff_t)(blkpos + HASH_CKOFFSET));
4697 IF_DEBUG (stream->large_ckcnt += 1);
4699 blkpos -= stream->smatcher.large_step;
4701 while (blkpos >= oldpos);
4703 stream->srcwin_cksum_pos = (blkno + 1) * stream->src->blksize;
4707 "[srcwin_move_point] exited loop T=%"Q"u{%"Q"u} "
4708 "S=%"Q"u EOF=%"Q"u %s\n",
4709 stream->total_in + stream->input_position,
4710 logical_input_cksum_pos,
4711 stream->srcwin_cksum_pos,
4712 xd3_source_eof (stream->src),
4713 stream->src->eof_known ? "known" : "unknown"));
4715 if (stream->src->eof_known)
4717 source_size = xd3_source_eof (stream->src);
4719 if (stream->srcwin_cksum_pos >= source_size)
4721 /* This invariant is needed for xd3_source_cksum_offset() */
4722 stream->srcwin_cksum_pos = source_size;
4723 *next_move_point = USIZE_T_MAX;
4725 "[srcwin_move_point] finished with source input\n"));
4730 /* How long until this function should be called again. */
4731 XD3_ASSERT(stream->srcwin_cksum_pos >= logical_input_cksum_pos);
4732 *next_move_point = stream->input_position + 1 +
4733 (usize_t)(stream->srcwin_cksum_pos - logical_input_cksum_pos);
4737 #endif /* XD3_ENCODER */
4739 /********************************************************************
4741 *********************************************************************/
4743 #endif /* __XDELTA3_C_INLINE_PASS__ */
4744 #ifdef __XDELTA3_C_TEMPLATE_PASS__
4748 /********************************************************************
4750 *******************************************************************/
4752 /* Template macros */
4753 #define XD3_TEMPLATE(x) XD3_TEMPLATE2(x,TEMPLATE)
4754 #define XD3_TEMPLATE2(x,n) XD3_TEMPLATE3(x,n)
4755 #define XD3_TEMPLATE3(x,n) x ## n
4756 #define XD3_STRINGIFY(x) XD3_STRINGIFY2(x)
4757 #define XD3_STRINGIFY2(x) #x
4759 static int XD3_TEMPLATE(xd3_string_match_) (xd3_stream *stream);
4761 static const xd3_smatcher XD3_TEMPLATE(__smatcher_) =
4763 XD3_STRINGIFY(TEMPLATE),
4764 XD3_TEMPLATE(xd3_string_match_),
4768 LLOOK, LSTEP, SLOOK, SCHAIN, SLCHAIN, MAXLAZY, LONGENOUGH
4773 XD3_TEMPLATE(xd3_string_match_) (xd3_stream *stream)
4775 const int DO_SMALL = ! (stream->flags & XD3_NOCOMPRESS);
4776 const int DO_LARGE = (stream->src != NULL);
4777 const int DO_RUN = (1);
4780 uint32_t scksum = 0;
4781 uint32_t scksum_state = 0;
4782 uint32_t lcksum = 0;
4788 usize_t match_length;
4789 usize_t match_offset = 0;
4790 usize_t next_move_point;
4792 /* If there will be no compression due to settings or short input,
4793 * skip it entirely. */
4794 if (! (DO_SMALL || DO_LARGE || DO_RUN) ||
4795 stream->input_position + SLOOK > stream->avail_in) { goto loopnomore; }
4797 if ((ret = xd3_string_match_init (stream))) { return ret; }
4799 /* The restartloop label is reached when the incremental loop state
4800 * needs to be reset. */
4803 /* If there is not enough input remaining for any kind of match,
4805 if (stream->input_position + SLOOK > stream->avail_in) { goto loopnomore; }
4807 /* Now reset the incremental loop state: */
4809 /* The min_match variable is updated to avoid matching the same lazy
4810 * match over and over again. For example, if you find a (small)
4811 * match of length 9 at one position, you will likely find a match
4812 * of length 8 at the next position. */
4813 if (xd3_iopt_last_matched (stream) > stream->input_position)
4815 stream->min_match = max(MIN_MATCH,
4816 1 + xd3_iopt_last_matched(stream) -
4817 stream->input_position);
4821 stream->min_match = MIN_MATCH;
4824 /* The current input byte. */
4825 inp = stream->next_in + stream->input_position;
4827 /* Small match state. */
4830 scksum = xd3_scksum (&scksum_state, inp, SLOOK);
4836 run_l = xd3_comprun (inp, SLOOK, & run_c);
4839 /* Large match state. We continue the loop even after not enough
4840 * bytes for LLOOK remain, so always check stream->input_position in
4842 if (DO_LARGE && (stream->input_position + LLOOK <= stream->avail_in))
4844 /* Source window: next_move_point is the point that
4845 * stream->input_position must reach before computing more
4846 * source checksum. */
4847 if ((ret = xd3_srcwin_move_point (stream, & next_move_point)))
4852 lcksum = xd3_lcksum (inp, LLOOK);
4855 /* TRYLAZYLEN: True if a certain length match should be followed by
4856 * lazy search. This checks that LEN is shorter than MAXLAZY and
4857 * that there is enough leftover data to consider lazy matching.
4858 * "Enough" is set to 2 since the next match will start at the next
4859 * offset, it must match two extra characters. */
4860 #define TRYLAZYLEN(LEN,POS,MAX) ((MAXLAZY) > 0 && (LEN) < (MAXLAZY) \
4861 && (POS) + (LEN) <= (MAX) - 2)
4863 /* HANDLELAZY: This statement is called each time an instruciton is
4864 * emitted (three cases). If the instruction is large enough, the
4865 * loop is restarted, otherwise lazy matching may ensue. */
4866 #define HANDLELAZY(mlen) \
4867 if (TRYLAZYLEN ((mlen), (stream->input_position), (stream->avail_in))) \
4868 { stream->min_match = (mlen) + LEAST_MATCH_INCR; goto updateone; } \
4870 { stream->input_position += (mlen); goto restartloop; }
4872 /* Now loop over one input byte at a time until a match is found... */
4873 for (;; inp += 1, stream->input_position += 1)
4875 /* Now we try three kinds of string match in order of expense:
4876 * run, large match, small match. */
4878 /* Expand the start of a RUN. The test for (run_l == SLOOK)
4879 * avoids repeating this check when we pass through a run area
4880 * performing lazy matching. The run is only expanded once when
4881 * the min_match is first reached. If lazy matching is
4882 * performed, the run_l variable will remain inconsistent until
4883 * the first non-running input character is reached, at which
4884 * time the run_l may then again grow to SLOOK. */
4885 if (DO_RUN && run_l == SLOOK)
4887 usize_t max_len = stream->avail_in - stream->input_position;
4889 IF_DEBUG (xd3_verify_run_state (stream, inp, run_l, &run_c));
4891 while (run_l < max_len && inp[run_l] == run_c) { run_l += 1; }
4893 /* Output a RUN instruction. */
4894 if (run_l >= stream->min_match && run_l >= MIN_RUN)
4896 if ((ret = xd3_emit_run (stream, stream->input_position,
4897 run_l, &run_c))) { return ret; }
4903 /* If there is enough input remaining. */
4904 if (DO_LARGE && (stream->input_position + LLOOK <= stream->avail_in))
4906 if ((stream->input_position >= next_move_point) &&
4907 (ret = xd3_srcwin_move_point (stream, & next_move_point)))
4912 linx = xd3_checksum_hash (& stream->large_hash, lcksum);
4914 IF_DEBUG (xd3_verify_large_state (stream, inp, lcksum));
4916 if (stream->large_table[linx] != 0)
4918 /* the match_setup will fail if the source window has
4919 * been decided and the match lies outside it.
4920 * OPT: Consider forcing a window at this point to
4921 * permit a new source window. */
4923 xd3_source_cksum_offset(stream,
4924 stream->large_table[linx] -
4926 if (xd3_source_match_setup (stream, adj_offset) == 0)
4928 if ((ret = xd3_source_extend_match (stream)))
4933 /* Update stream position. match_fwd is zero if no
4935 if (stream->match_fwd > 0)
4937 HANDLELAZY (stream->match_fwd);
4943 /* Small matches. */
4946 sinx = xd3_checksum_hash (& stream->small_hash, scksum);
4948 /* Verify incremental state in debugging mode. */
4949 IF_DEBUG (xd3_verify_small_state (stream, inp, scksum));
4951 /* Search for the longest match */
4952 if (stream->small_table[sinx] != 0)
4954 match_length = xd3_smatch (stream,
4955 stream->small_table[sinx],
4964 /* Insert a hash for this string. */
4965 xd3_scksum_insert (stream, sinx, scksum, stream->input_position);
4967 /* Maybe output a COPY instruction */
4968 if (match_length >= stream->min_match)
4972 DP(RINT "[target match:%d] <inp %u %u> <cpy %u %u> "
4973 "(-%d) [ %u bytes ]\n",
4975 stream->input_position,
4976 stream->input_position + match_length,
4978 match_offset + match_length,
4979 stream->input_position - match_offset,
4983 if ((ret = xd3_found_match (stream,
4984 /* decoder position */
4985 stream->input_position,
4986 /* length */ match_length,
4987 /* address */ (xoff_t) match_offset,
4988 /* is_source */ 0)))
4993 /* Copy instruction. */
4994 HANDLELAZY (match_length);
4998 /* The logic above prevents excess work during lazy matching by
4999 * increasing min_match to avoid smaller matches. Each time we
5000 * advance stream->input_position by one, the minimum match
5001 * shortens as well. */
5002 if (stream->min_match > MIN_MATCH)
5004 stream->min_match -= 1;
5009 /* See if there are no more incremental cksums to compute. */
5010 if (stream->input_position + SLOOK == stream->avail_in)
5015 /* Compute next RUN, CKSUM */
5018 NEXTRUN (inp[SLOOK]);
5023 scksum = xd3_small_cksum_update (&scksum_state, inp, SLOOK);
5026 if (DO_LARGE && (stream->input_position + LLOOK < stream->avail_in))
5028 lcksum = xd3_large_cksum_update (lcksum, inp, LLOOK);
5036 #endif /* XD3_ENCODER */
5037 #endif /* __XDELTA3_C_TEMPLATE_PASS__ */