1 /* xdelta 3 - delta compression tools and library
2 * Copyright (C) 2001, 2003, 2004, 2005, 2006, 2007,
3 * 2008, 2009, 2010. Joshua P. MacDonald
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
19 -------------------------------------------------------------------
23 The goal of this library is to to implement both the (stand-alone)
24 data-compression and delta-compression aspects of VCDIFF encoding, and
25 to support a programming interface that works like Zlib
26 (http://www.gzip.org/zlib.html). See RFC3284: The VCDIFF Generic
27 Differencing and Compression Data Format.
29 VCDIFF is a unified encoding that combines data-compression and
30 delta-encoding ("differencing").
32 VCDIFF has a detailed byte-code instruction set with many features.
33 The instruction format supports an immediate size operand for small
34 COPYs and ADDs (e.g., under 18 bytes). There are also instruction
35 "modes", which are used to compress COPY addresses by using two
36 address caches. An instruction mode refers to slots in the NEAR
37 and SAME caches for recent addresses. NEAR remembers the
38 previous 4 (by default) COPY addresses, and SAME catches
39 frequent re-uses of the same address using a 3-way (by default)
40 256-entry associative cache of [ADDR mod 256], the encoded byte.
41 A hit in the NEAR/SAME cache requires 0/1 ADDR bytes.
43 VCDIFF has a default instruction table, but an alternate
44 instruction tables may themselves be be delta-compressed and
45 included in the encoding header. This allows even more freedom.
46 There are 9 instruction modes in the default code table, 4 near, 3
47 same, VCD_SELF (absolute encoding) and VCD_HERE (relative to the
50 ----------------------------------------------------------------------
54 Aside from the details of encoding and decoding, there are a bunch
57 1. STRING-MATCH. A two-level fingerprinting approach is used. A
58 single loop computes the two checksums -- small and large -- at
59 successive offsets in the TARGET file. The large checksum is more
60 accurate and is used to discover SOURCE matches, which are
61 potentially very long. The small checksum is used to discover
62 copies within the TARGET. Small matching, which is more expensive,
63 usually dominates the large STRING-MATCH costs in this code - the
64 more exhaustive the search, the better the results. Either of the
65 two string-matching mechanisms may be disabled.
67 2. INSTRUCTION SELECTION. The IOPT buffer here represents a queue
68 used to store overlapping copy instructions. There are two possible
69 optimizations that go beyond a greedy search. Both of these fall
70 into the category of "non-greedy matching" optimizations.
72 The first optimization stems from backward SOURCE-COPY matching.
73 When a new SOURCE-COPY instruction covers a previous instruction in
74 the target completely, it is erased from the queue. Randal Burns
75 originally analyzed these algorithms and did a lot of related work
76 (\cite the 1.5-pass algorithm).
78 The second optimization comes by the encoding of common very-small
79 COPY and ADD instructions, for which there are special DOUBLE-code
80 instructions, which code two instructions in a single byte.
82 The cost of bad instruction-selection overhead is relatively high
83 for data-compression, relative to delta-compression, so this second
84 optimization is fairly important. With "lazy" matching (the name
85 used in Zlib for a similar optimization), the string-match
86 algorithm searches after a match for potential overlapping copy
87 instructions. In Xdelta and by default, VCDIFF, the minimum match
88 size is 4 bytes, whereas Zlib searches with a 3-byte minimum. This
89 feature, combined with double instructions, provides a nice
90 challenge. Search in this file for "black magic", a heuristic.
92 3. STREAM ALIGNMENT. Stream alignment is needed to compress large
93 inputs in constant space. See xd3_srcwin_move_point().
95 4. WINDOW SELECTION. When the IOPT buffer flushes, in the first call
96 to xd3_iopt_finish_encoding containing any kind of copy instruction,
97 the parameters of the source window must be decided: the offset into
98 the source and the length of the window. Since the IOPT buffer is
99 finite, the program may be forced to fix these values before knowing
100 the best offset/length.
102 5. SECONDARY COMPRESSION. VCDIFF supports a secondary encoding to
103 be applied to the individual sections of the data format, which are
104 ADDRess, INSTruction, and DATA. Several secondary compressor
105 variations are implemented here, although none is standardized yet.
107 One is an adaptive huffman algorithm -- the FGK algorithm (Faller,
108 Gallager, and Knuth, 1985). This compressor is extremely slow.
110 The other is a simple static Huffman routine, which is the base
111 case of a semi-adaptive scheme published by D.J. Wheeler and first
112 widely used in bzip2 (by Julian Seward). This is a very
113 interesting algorithm, originally published in nearly cryptic form
114 by D.J. Wheeler. !!!NOTE!!! Because these are not standardized,
115 secondary compression remains off by default.
116 ftp://ftp.cl.cam.ac.uk/users/djw3/bred3.{c,ps}
117 --------------------------------------------------------------------
123 For user convenience, it is essential to recognize Gzip-compressed
124 files and automatically Gzip-decompress them prior to
125 delta-compression (or else no delta-compression will be achieved
126 unless the user manually decompresses the inputs). The compressed
127 represention competes with Xdelta, and this must be hidden from the
128 command-line user interface. The Xdelta-1.x encoding was simple, not
129 compressed itself, so Xdelta-1.x uses Zlib internally to compress the
132 This implementation supports external compression, which implements
133 the necessary fork() and pipe() mechanics. There is a tricky step
134 involved to support automatic detection of a compressed input in a
135 non-seekable input. First you read a bit of the input to detect
136 magic headers. When a compressed format is recognized, exec() the
137 external compression program and create a second child process to
138 copy the original input stream. [Footnote: There is a difficulty
139 related to using Gzip externally. It is not possible to decompress
140 and recompress a Gzip file transparently. If FILE.GZ had a
141 cryptographic signature, then, after: (1) Gzip-decompression, (2)
142 Xdelta-encoding, (3) Gzip-compression the signature could be
143 broken. The only way to solve this problem is to guess at Gzip's
144 compression level or control it by other means. I recommend that
145 specific implementations of any compression scheme store
146 information needed to exactly re-compress the input, that way
147 external compression is transparent - however, this won't happen
148 here until it has stabilized.]
150 2. APPLICATION-HEADER
152 This feature was introduced in RFC3284. It allows any application
153 to include a header within the VCDIFF file format. This allows
154 general inter-application data exchange with support for
155 application-specific extensions to communicate metadata.
159 An optional checksum value is included with each window, which can
160 be used to validate the final result. This verifies the correct source
161 file was used for decompression as well as the obvious advantage:
162 checking the implementation (and underlying) correctness.
166 The code makes efforts to avoid copying data more than necessary.
167 The code delays many initialization tasks until the first use, it
168 optimizes for identical (perfectly matching) inputs. It does not
169 compute any checksums until the first lookup misses. Memory usage
170 is reduced. String-matching is templatized (by slightly gross use
171 of CPP) to hard-code alternative compile-time defaults. The code
172 has few outside dependencies.
173 ----------------------------------------------------------------------
175 The default rfc3284 instruction table:
176 (see RFC for the explanation)
178 TYPE SIZE MODE TYPE SIZE MODE INDEX
179 --------------------------------------------------------------------
180 1. Run 0 0 Noop 0 0 0
181 2. Add 0, [1,17] 0 Noop 0 0 [1,18]
182 3. Copy 0, [4,18] 0 Noop 0 0 [19,34]
183 4. Copy 0, [4,18] 1 Noop 0 0 [35,50]
184 5. Copy 0, [4,18] 2 Noop 0 0 [51,66]
185 6. Copy 0, [4,18] 3 Noop 0 0 [67,82]
186 7. Copy 0, [4,18] 4 Noop 0 0 [83,98]
187 8. Copy 0, [4,18] 5 Noop 0 0 [99,114]
188 9. Copy 0, [4,18] 6 Noop 0 0 [115,130]
189 10. Copy 0, [4,18] 7 Noop 0 0 [131,146]
190 11. Copy 0, [4,18] 8 Noop 0 0 [147,162]
191 12. Add [1,4] 0 Copy [4,6] 0 [163,174]
192 13. Add [1,4] 0 Copy [4,6] 1 [175,186]
193 14. Add [1,4] 0 Copy [4,6] 2 [187,198]
194 15. Add [1,4] 0 Copy [4,6] 3 [199,210]
195 16. Add [1,4] 0 Copy [4,6] 4 [211,222]
196 17. Add [1,4] 0 Copy [4,6] 5 [223,234]
197 18. Add [1,4] 0 Copy 4 6 [235,238]
198 19. Add [1,4] 0 Copy 4 7 [239,242]
199 20. Add [1,4] 0 Copy 4 8 [243,246]
200 21. Copy 4 [0,8] Add 1 0 [247,255]
201 --------------------------------------------------------------------
203 Reading the source: Overview
205 This file includes itself in several passes to macro-expand certain
206 sections with variable forms. Just read ahead, there's only a
207 little confusion. I know this sounds ugly, but hard-coding some of
208 the string-matching parameters results in a 10-15% increase in
209 string-match performance. The only time this hurts is when you have
210 unbalanced #if/endifs.
212 A single compilation unit tames the Makefile. In short, this is to
213 allow the above-described hack without an explodingMakefile. The
214 single compilation unit includes the core library features,
215 configurable string-match templates, optional main() command-line
216 tool, misc optional features, and a regression test. Features are
217 controled with CPP #defines, see Makefile.am.
219 The initial __XDELTA3_C_HEADER_PASS__ starts first, the _INLINE_ and
220 _TEMPLATE_ sections follow. Easy stuff first, hard stuff last.
222 Optional features include:
224 xdelta3-main.h The command-line interface, external compression
225 support, POSIX-specific, info & VCDIFF-debug tools.
226 xdelta3-second.h The common secondary compression routines.
227 xdelta3-decoder.h All decoding routines.
228 xdelta3-djw.h The semi-adaptive huffman secondary encoder.
229 xdelta3-fgk.h The adaptive huffman secondary encoder.
230 xdelta3-test.h The unit test covers major algorithms,
231 encoding and decoding. There are single-bit
232 error decoding tests. There are 32/64-bit file size
233 boundary tests. There are command-line tests.
234 There are compression tests. There are external
235 compression tests. There are string-matching tests.
236 There should be more tests...
238 Additional headers include:
240 xdelta3.h The public header file.
241 xdelta3-cfgs.h The default settings for default, built-in
242 encoders. These are hard-coded at
243 compile-time. There is also a single
244 soft-coded string matcher for experimenting
245 with arbitrary values.
246 xdelta3-list.h A cyclic list template
248 Misc little debug utilities:
250 badcopy.c Randomly modifies an input file based on two
251 parameters: (1) the probability that a byte in
252 the file is replaced with a pseudo-random value,
253 and (2) the mean change size. Changes are
254 generated using an expoential distribution
255 which approximates the expected error_prob
257 --------------------------------------------------------------------
259 This file itself is unusually large. I hope to defend this layout
260 with lots of comments. Everything in this file is related to
261 encoding and decoding. I like it all together - the template stuff
264 #ifndef __XDELTA3_C_HEADER_PASS__
265 #define __XDELTA3_C_HEADER_PASS__
272 /***********************************************************************
274 ***********************************************************************/
276 #ifndef XD3_MAIN /* the main application */
281 #define VCDIFF_TOOLS XD3_MAIN
284 #ifndef SECONDARY_FGK /* one from the algorithm preservation department: */
285 #define SECONDARY_FGK 0 /* adaptive Huffman routines */
288 #ifndef SECONDARY_DJW /* semi-adaptive/static Huffman for the eventual */
289 #define SECONDARY_DJW 0 /* standardization, off by default until such time. */
292 #ifndef GENERIC_ENCODE_TABLES /* These three are the RFC-spec'd app-specific */
293 #define GENERIC_ENCODE_TABLES 0 /* code features. This is tested but not recommended */
294 #endif /* unless there's a real application. */
295 #ifndef GENERIC_ENCODE_TABLES_COMPUTE
296 #define GENERIC_ENCODE_TABLES_COMPUTE 0
298 #ifndef GENERIC_ENCODE_TABLES_COMPUTE_PRINT
299 #define GENERIC_ENCODE_TABLES_COMPUTE_PRINT 0
303 #define IF_ENCODER(x) x
305 #define IF_ENCODER(x)
308 /***********************************************************************/
310 /* header indicator bits */
311 #define VCD_SECONDARY (1U << 0) /* uses secondary compressor */
312 #define VCD_CODETABLE (1U << 1) /* supplies code table data */
313 #define VCD_APPHEADER (1U << 2) /* supplies application data */
314 #define VCD_INVHDR (~0x7U)
316 /* window indicator bits */
317 #define VCD_SOURCE (1U << 0) /* copy window in source file */
318 #define VCD_TARGET (1U << 1) /* copy window in target file */
319 #define VCD_ADLER32 (1U << 2) /* has adler32 checksum */
320 #define VCD_INVWIN (~0x7U)
322 #define VCD_SRCORTGT (VCD_SOURCE | VCD_TARGET)
324 /* delta indicator bits */
325 #define VCD_DATACOMP (1U << 0)
326 #define VCD_INSTCOMP (1U << 1)
327 #define VCD_ADDRCOMP (1U << 2)
328 #define VCD_INVDEL (~0x7U)
332 VCD_FGK_ID = 16, /* Note: these are not standard IANA-allocated IDs! */
338 /* Note: SEC_COUNT_FREQS Not implemented (to eliminate 1st Huffman pass) */
339 SEC_COUNT_FREQS = (1 << 0),
340 } xd3_secondary_flags;
343 DATA_SECTION, /* These indicate which section to the secondary
345 INST_SECTION, /* The header section is not compressed, therefore not
350 typedef unsigned int xd3_rtype;
352 /***********************************************************************/
354 #include "xdelta3-list.h"
356 XD3_MAKELIST(xd3_rlist, xd3_rinst, link);
358 /***********************************************************************/
360 #define SECONDARY_MIN_SAVINGS 2 /* Secondary compression has to save
361 at least this many bytes. */
362 #define SECONDARY_MIN_INPUT 10 /* Secondary compression needs at
363 least this many bytes. */
365 #define VCDIFF_MAGIC1 0xd6 /* 1st file byte */
366 #define VCDIFF_MAGIC2 0xc3 /* 2nd file byte */
367 #define VCDIFF_MAGIC3 0xc4 /* 3rd file byte */
368 #define VCDIFF_VERSION 0x00 /* 4th file byte */
370 #define VCD_SELF 0 /* 1st address mode */
371 #define VCD_HERE 1 /* 2nd address mode */
373 #define CODE_TABLE_STRING_SIZE (6 * 256) /* Should fit a code table string. */
374 #define CODE_TABLE_VCDIFF_SIZE (6 * 256) /* Should fit a compressed code
377 #define SECONDARY_ANY (SECONDARY_DJW || SECONDARY_FGK)
379 #define ALPHABET_SIZE 256 /* Used in test code--size of the secondary
380 * compressor alphabet. */
382 #define HASH_PERMUTE 1 /* The input is permuted by random nums */
383 #define ADLER_LARGE_CKSUM 1 /* Adler checksum vs. RK checksum */
385 #define HASH_CKOFFSET 1U /* Table entries distinguish "no-entry" from
386 * offset 0 using this offset. */
388 #define MIN_SMALL_LOOK 2U /* Match-optimization stuff. */
389 #define MIN_LARGE_LOOK 2U
390 #define MIN_MATCH_OFFSET 1U
391 #define MAX_MATCH_SPLIT 18U /* VCDIFF code table: 18 is the default limit
392 * for direct-coded ADD sizes */
394 #define LEAST_MATCH_INCR 0 /* The least number of bytes an overlapping
395 * match must beat the preceding match by. This
396 * is a bias for the lazy match optimization. A
397 * non-zero value means that an adjacent match
398 * has to be better by more than the step
399 * between them. 0. */
401 #define MIN_MATCH 4U /* VCDIFF code table: MIN_MATCH=4 */
402 #define MIN_ADD 1U /* 1 */
403 #define MIN_RUN 8U /* The shortest run, if it is shorter than this
404 * an immediate add/copy will be just as good.
405 * ADD1/COPY6 = 1I+1D+1A bytes, RUN18 =
408 #define MAX_MODES 9 /* Maximum number of nodes used for
409 * compression--does not limit decompression. */
411 #define ENC_SECTS 4 /* Number of separate output sections. */
413 #define HDR_TAIL(s) ((s)->enc_tails[0])
414 #define DATA_TAIL(s) ((s)->enc_tails[1])
415 #define INST_TAIL(s) ((s)->enc_tails[2])
416 #define ADDR_TAIL(s) ((s)->enc_tails[3])
418 #define HDR_HEAD(s) ((s)->enc_heads[0])
419 #define DATA_HEAD(s) ((s)->enc_heads[1])
420 #define INST_HEAD(s) ((s)->enc_heads[2])
421 #define ADDR_HEAD(s) ((s)->enc_heads[3])
423 #define TOTAL_MODES(x) (2+(x)->acache.s_same+(x)->acache.s_near)
425 /* Template instances. */
427 #define IF_BUILD_SLOW(x) x
429 #define IF_BUILD_SLOW(x)
432 #define IF_BUILD_FAST(x) x
434 #define IF_BUILD_FAST(x)
437 #define IF_BUILD_FASTER(x) x
439 #define IF_BUILD_FASTER(x)
441 #if XD3_BUILD_FASTEST
442 #define IF_BUILD_FASTEST(x) x
444 #define IF_BUILD_FASTEST(x)
447 #define IF_BUILD_SOFT(x) x
449 #define IF_BUILD_SOFT(x)
451 #if XD3_BUILD_DEFAULT
452 #define IF_BUILD_DEFAULT(x) x
454 #define IF_BUILD_DEFAULT(x)
457 /* Consume N bytes of input, only used by the decoder. */
458 #define DECODE_INPUT(n) \
460 stream->total_in += (xoff_t) (n); \
461 stream->avail_in -= (n); \
462 stream->next_in += (n); \
465 /* Update the run-length state */
466 #define NEXTRUN(c) do { if ((c) == run_c) { run_l += 1; } \
467 else { run_c = (c); run_l = 1; } } while (0)
469 /* This CPP-conditional stuff can be cleaned up... */
471 #define IF_REGRESSION(x) x
473 #define IF_REGRESSION(x)
476 /***********************************************************************/
479 static void* xd3_alloc0 (xd3_stream *stream,
484 static xd3_output* xd3_alloc_output (xd3_stream *stream,
485 xd3_output *old_output);
487 static int xd3_alloc_iopt (xd3_stream *stream, usize_t elts);
489 static void xd3_free_output (xd3_stream *stream,
492 static int xd3_emit_byte (xd3_stream *stream,
493 xd3_output **outputp,
496 static int xd3_emit_bytes (xd3_stream *stream,
497 xd3_output **outputp,
501 static int xd3_emit_double (xd3_stream *stream, xd3_rinst *first,
502 xd3_rinst *second, usize_t code);
503 static int xd3_emit_single (xd3_stream *stream, xd3_rinst *single,
506 static usize_t xd3_sizeof_output (xd3_output *output);
507 static void xd3_encode_reset (xd3_stream *stream);
509 static int xd3_source_match_setup (xd3_stream *stream, xoff_t srcpos);
510 static int xd3_source_extend_match (xd3_stream *stream);
511 static int xd3_srcwin_setup (xd3_stream *stream);
512 static usize_t xd3_iopt_last_matched (xd3_stream *stream);
513 static int xd3_emit_uint32_t (xd3_stream *stream, xd3_output **output,
516 static usize_t xd3_smatch (xd3_stream *stream,
519 usize_t *match_offset);
520 static int xd3_string_match_init (xd3_stream *stream);
521 static uint32_t xd3_scksum (uint32_t *state, const uint8_t *seg,
523 static usize_t xd3_comprun (const uint8_t *seg, usize_t slook, uint8_t *run_cp);
524 static int xd3_srcwin_move_point (xd3_stream *stream,
525 usize_t *next_move_point);
527 static int xd3_emit_run (xd3_stream *stream, usize_t pos,
528 usize_t size, uint8_t *run_c);
529 static usize_t xd3_checksum_hash (const xd3_hash_cfg *cfg,
530 const usize_t cksum);
531 static xoff_t xd3_source_cksum_offset(xd3_stream *stream, usize_t low);
532 static void xd3_scksum_insert (xd3_stream *stream,
539 static void xd3_verify_run_state (xd3_stream *stream,
543 static void xd3_verify_large_state (xd3_stream *stream,
546 static void xd3_verify_small_state (xd3_stream *stream,
550 #endif /* XD3_DEBUG */
551 #endif /* XD3_ENCODER */
553 static int xd3_decode_allocate (xd3_stream *stream, usize_t size,
554 uint8_t **copied1, usize_t *alloc1);
556 static void xd3_compute_code_table_string (const xd3_dinst *code_table,
558 static void* xd3_alloc (xd3_stream *stream, usize_t elts, usize_t size);
559 static void xd3_free (xd3_stream *stream, void *ptr);
561 static int xd3_read_uint32_t (xd3_stream *stream, const uint8_t **inpp,
562 const uint8_t *max, uint32_t *valp);
565 static int xd3_selftest (void);
568 /***********************************************************************/
570 #define UINT32_OFLOW_MASK 0xfe000000U
571 #define UINT64_OFLOW_MASK 0xfe00000000000000ULL
574 #define UINT32_MAX 4294967295U
578 #define UINT64_MAX 18446744073709551615ULL
581 #if SIZEOF_USIZE_T == 4
582 #define USIZE_T_MAX UINT32_MAX
583 #define xd3_decode_size xd3_decode_uint32_t
584 #define xd3_emit_size xd3_emit_uint32_t
585 #define xd3_sizeof_size xd3_sizeof_uint32_t
586 #define xd3_read_size xd3_read_uint32_t
587 #elif SIZEOF_USIZE_T == 8
588 #define USIZE_T_MAX UINT64_MAX
589 #define xd3_decode_size xd3_decode_uint64_t
590 #define xd3_emit_size xd3_emit_uint64_t
591 #define xd3_sizeof_size xd3_sizeof_uint64_t
592 #define xd3_read_size xd3_read_uint64_t
595 #if SIZEOF_XOFF_T == 4
596 #define XOFF_T_MAX UINT32_MAX
597 #define xd3_decode_offset xd3_decode_uint32_t
598 #define xd3_emit_offset xd3_emit_uint32_t
599 #elif SIZEOF_XOFF_T == 8
600 #define XOFF_T_MAX UINT64_MAX
601 #define xd3_decode_offset xd3_decode_uint64_t
602 #define xd3_emit_offset xd3_emit_uint64_t
605 #define USIZE_T_OVERFLOW(a,b) ((USIZE_T_MAX - (usize_t) (a)) < (usize_t) (b))
606 #define XOFF_T_OVERFLOW(a,b) ((XOFF_T_MAX - (xoff_t) (a)) < (xoff_t) (b))
608 const char* xd3_strerror (int ret)
612 case XD3_INPUT: return "XD3_INPUT";
613 case XD3_OUTPUT: return "XD3_OUTPUT";
614 case XD3_GETSRCBLK: return "XD3_GETSRCBLK";
615 case XD3_GOTHEADER: return "XD3_GOTHEADER";
616 case XD3_WINSTART: return "XD3_WINSTART";
617 case XD3_WINFINISH: return "XD3_WINFINISH";
618 case XD3_TOOFARBACK: return "XD3_TOOFARBACK";
619 case XD3_INTERNAL: return "XD3_INTERNAL";
620 case XD3_INVALID: return "XD3_INVALID";
621 case XD3_INVALID_INPUT: return "XD3_INVALID_INPUT";
622 case XD3_NOSECOND: return "XD3_NOSECOND";
623 case XD3_UNIMPLEMENTED: return "XD3_UNIMPLEMENTED";
628 /***********************************************************************/
630 #define xd3_sec_data(s) ((s)->sec_stream_d)
631 #define xd3_sec_inst(s) ((s)->sec_stream_i)
632 #define xd3_sec_addr(s) ((s)->sec_stream_a)
638 xd3_secondary_flags flags;
640 /* xd3_sec_stream is opaque to the generic code */
641 xd3_sec_stream* (*alloc) (xd3_stream *stream);
642 void (*destroy) (xd3_stream *stream,
643 xd3_sec_stream *sec);
644 void (*init) (xd3_sec_stream *sec);
645 int (*decode) (xd3_stream *stream,
646 xd3_sec_stream *sec_stream,
647 const uint8_t **input,
648 const uint8_t *input_end,
650 const uint8_t *output_end);
652 int (*encode) (xd3_stream *stream,
653 xd3_sec_stream *sec_stream,
660 #define BIT_STATE_ENCODE_INIT { 0, 1 }
661 #define BIT_STATE_DECODE_INIT { 0, 0x100 }
663 typedef struct _bit_state bit_state;
670 #if SECONDARY_ANY == 0
677 xd3_decode_secondary (xd3_stream *stream,
679 xd3_sec_stream **sec_streamp);
682 xd3_encode_secondary (xd3_stream *stream,
685 xd3_sec_stream **sec_streamp,
689 #endif /* SECONDARY_ANY */
692 extern const xd3_sec_type fgk_sec_type;
694 #define FGK_CASE(s) \
695 s->sec_type = & fgk_sec_type; \
699 #define FGK_CASE(s) \
700 s->msg = "unavailable secondary compressor: FGK Adaptive Huffman"; \
705 extern const xd3_sec_type djw_sec_type;
707 #define DJW_CASE(s) \
708 s->sec_type = & djw_sec_type; \
712 #define DJW_CASE(s) \
713 s->msg = "unavailable secondary compressor: DJW Static Huffman"; \
717 /***********************************************************************/
719 #include "xdelta3-hash.h"
721 /* Process template passes - this includes xdelta3.c several times. */
722 #define __XDELTA3_C_TEMPLATE_PASS__
723 #include "xdelta3-cfgs.h"
724 #undef __XDELTA3_C_TEMPLATE_PASS__
726 /* Process the inline pass. */
727 #define __XDELTA3_C_INLINE_PASS__
729 #undef __XDELTA3_C_INLINE_PASS__
731 /* Secondary compression */
733 #include "xdelta3-second.h"
737 #include "xdelta3-fgk.h"
738 const xd3_sec_type fgk_sec_type =
741 "FGK Adaptive Huffman",
743 (xd3_sec_stream* (*)(xd3_stream*)) fgk_alloc,
744 (void (*)(xd3_stream*, xd3_sec_stream*)) fgk_destroy,
745 (void (*)(xd3_sec_stream*)) fgk_init,
746 (int (*)(xd3_stream*, xd3_sec_stream*, const uint8_t**, const uint8_t*,
747 uint8_t**, const uint8_t*)) xd3_decode_fgk,
748 IF_ENCODER((int (*)(xd3_stream*, xd3_sec_stream*, xd3_output*,
749 xd3_output*, xd3_sec_cfg*)) xd3_encode_fgk)
754 #include "xdelta3-djw.h"
755 const xd3_sec_type djw_sec_type =
760 (xd3_sec_stream* (*)(xd3_stream*)) djw_alloc,
761 (void (*)(xd3_stream*, xd3_sec_stream*)) djw_destroy,
762 (void (*)(xd3_sec_stream*)) djw_init,
763 (int (*)(xd3_stream*, xd3_sec_stream*, const uint8_t**, const uint8_t*,
764 uint8_t**, const uint8_t*)) xd3_decode_huff,
765 IF_ENCODER((int (*)(xd3_stream*, xd3_sec_stream*, xd3_output*,
766 xd3_output*, xd3_sec_cfg*)) xd3_encode_huff)
770 #if XD3_MAIN || PYTHON_MODULE || SWIG_MODULE || NOT_MAIN
771 #include "xdelta3-main.h"
775 #include "xdelta3-test.h"
779 #include "xdelta3-python.h"
782 #endif /* __XDELTA3_C_HEADER_PASS__ */
783 #ifdef __XDELTA3_C_INLINE_PASS__
785 /****************************************************************
787 *****************************************************************/
789 /* The following code implements a parametrized description of the
790 * code table given above for a few reasons. It is not necessary for
791 * implementing the standard, to support compression with variable
792 * tables, so an implementation is only required to know the default
793 * code table to begin decompression. (If the encoder uses an
794 * alternate table, the table is included in compressed form inside
797 * Before adding variable-table support there were two functions which
798 * were hard-coded to the default table above.
799 * xd3_compute_default_table() would create the default table by
800 * filling a 256-elt array of xd3_dinst values. The corresponding
801 * function, xd3_choose_instruction(), would choose an instruction
802 * based on the hard-coded parameters of the default code table.
804 * Notes: The parametrized code table description here only generates
805 * tables of a certain regularity similar to the default table by
806 * allowing to vary the distribution of single- and
807 * double-instructions and change the number of near and same copy
808 * modes. More exotic tables are only possible by extending this
811 * For performance reasons, both the parametrized and non-parametrized
812 * versions of xd3_choose_instruction remain. The parametrized
813 * version is only needed for testing multi-table decoding support.
814 * If ever multi-table encoding is required, this can be optimized by
815 * compiling static functions for each table.
818 /* The XD3_CHOOSE_INSTRUCTION calls xd3_choose_instruction with the
819 * table description when GENERIC_ENCODE_TABLES are in use. The
820 * IF_GENCODETBL macro enables generic-code-table specific code. */
821 #if GENERIC_ENCODE_TABLES
822 #define XD3_CHOOSE_INSTRUCTION(stream,prev,inst) xd3_choose_instruction (stream->code_table_desc, prev, inst)
823 #define IF_GENCODETBL(x) x
825 #define XD3_CHOOSE_INSTRUCTION(stream,prev,inst) xd3_choose_instruction (prev, inst)
826 #define IF_GENCODETBL(x)
829 /* This structure maintains information needed by
830 * xd3_choose_instruction to compute the code for a double instruction
831 * by first indexing an array of code_table_sizes by copy mode, then
832 * using (offset + (muliplier * X)) */
833 struct _xd3_code_table_sizes {
839 /* This contains a complete description of a code table. */
840 struct _xd3_code_table_desc
842 /* Assumes a single RUN instruction */
843 /* Assumes that MIN_MATCH is 4 */
845 uint8_t add_sizes; /* Number of immediate-size single adds (default 17) */
846 uint8_t near_modes; /* Number of near copy modes (default 4) */
847 uint8_t same_modes; /* Number of same copy modes (default 3) */
848 uint8_t cpy_sizes; /* Number of immediate-size single copies (default 15) */
850 uint8_t addcopy_add_max; /* Maximum add size for an add-copy double instruction,
851 all modes (default 4) */
852 uint8_t addcopy_near_cpy_max; /* Maximum cpy size for an add-copy double instruction,
853 up through VCD_NEAR modes (default 6) */
854 uint8_t addcopy_same_cpy_max; /* Maximum cpy size for an add-copy double instruction,
855 VCD_SAME modes (default 4) */
857 uint8_t copyadd_add_max; /* Maximum add size for a copy-add double instruction,
858 all modes (default 1) */
859 uint8_t copyadd_near_cpy_max; /* Maximum cpy size for a copy-add double instruction,
860 up through VCD_NEAR modes (default 4) */
861 uint8_t copyadd_same_cpy_max; /* Maximum cpy size for a copy-add double instruction,
862 VCD_SAME modes (default 4) */
864 xd3_code_table_sizes addcopy_max_sizes[MAX_MODES];
865 xd3_code_table_sizes copyadd_max_sizes[MAX_MODES];
868 /* The rfc3284 code table is represented: */
869 static const xd3_code_table_desc __rfc3284_code_table_desc = {
875 4, /* add-copy max add */
876 6, /* add-copy max cpy, near */
877 4, /* add-copy max cpy, same */
879 1, /* copy-add max add */
880 4, /* copy-add max cpy, near */
881 4, /* copy-add max cpy, same */
884 { {6,163,3},{6,175,3},{6,187,3},{6,199,3},{6,211,3},{6,223,3},{4,235,1},{4,239,1},{4,243,1} },
886 { {4,247,1},{4,248,1},{4,249,1},{4,250,1},{4,251,1},{4,252,1},{4,253,1},{4,254,1},{4,255,1} },
889 #if GENERIC_ENCODE_TABLES
890 /* An alternate code table for testing (5 near, 0 same):
892 * TYPE SIZE MODE TYPE SIZE MODE INDEX
893 * ---------------------------------------------------------------
894 * 1. Run 0 0 Noop 0 0 0
895 * 2. Add 0, [1,23] 0 Noop 0 0 [1,24]
896 * 3. Copy 0, [4,20] 0 Noop 0 0 [25,42]
897 * 4. Copy 0, [4,20] 1 Noop 0 0 [43,60]
898 * 5. Copy 0, [4,20] 2 Noop 0 0 [61,78]
899 * 6. Copy 0, [4,20] 3 Noop 0 0 [79,96]
900 * 7. Copy 0, [4,20] 4 Noop 0 0 [97,114]
901 * 8. Copy 0, [4,20] 5 Noop 0 0 [115,132]
902 * 9. Copy 0, [4,20] 6 Noop 0 0 [133,150]
903 * 10. Add [1,4] 0 Copy [4,6] 0 [151,162]
904 * 11. Add [1,4] 0 Copy [4,6] 1 [163,174]
905 * 12. Add [1,4] 0 Copy [4,6] 2 [175,186]
906 * 13. Add [1,4] 0 Copy [4,6] 3 [187,198]
907 * 14. Add [1,4] 0 Copy [4,6] 4 [199,210]
908 * 15. Add [1,4] 0 Copy [4,6] 5 [211,222]
909 * 16. Add [1,4] 0 Copy [4,6] 6 [223,234]
910 * 17. Copy 4 [0,6] Add [1,3] 0 [235,255]
911 * --------------------------------------------------------------- */
912 static const xd3_code_table_desc __alternate_code_table_desc = {
918 4, /* add-copy max add */
919 6, /* add-copy max cpy, near */
920 0, /* add-copy max cpy, same */
922 3, /* copy-add max add */
923 4, /* copy-add max cpy, near */
924 0, /* copy-add max cpy, same */
927 { {6,151,3},{6,163,3},{6,175,3},{6,187,3},{6,199,3},{6,211,3},{6,223,3},{0,0,0},{0,0,0} },
929 { {4,235,1},{4,238,1},{4,241,1},{4,244,1},{4,247,1},{4,250,1},{4,253,1},{0,0,0},{0,0,0} },
933 /* Computes code table entries of TBL using the specified description. */
935 xd3_build_code_table (const xd3_code_table_desc *desc, xd3_dinst *tbl)
937 usize_t size1, size2, mode;
938 usize_t cpy_modes = 2 + desc->near_modes + desc->same_modes;
941 (d++)->type1 = XD3_RUN;
942 (d++)->type1 = XD3_ADD;
944 for (size1 = 1; size1 <= desc->add_sizes; size1 += 1, d += 1)
950 for (mode = 0; mode < cpy_modes; mode += 1)
952 (d++)->type1 = XD3_CPY + mode;
954 for (size1 = MIN_MATCH; size1 < MIN_MATCH + desc->cpy_sizes; size1 += 1, d += 1)
956 d->type1 = XD3_CPY + mode;
961 for (mode = 0; mode < cpy_modes; mode += 1)
963 for (size1 = 1; size1 <= desc->addcopy_add_max; size1 += 1)
965 usize_t max = (mode < 2U + desc->near_modes) ?
966 desc->addcopy_near_cpy_max :
967 desc->addcopy_same_cpy_max;
969 for (size2 = MIN_MATCH; size2 <= max; size2 += 1, d += 1)
973 d->type2 = XD3_CPY + mode;
979 for (mode = 0; mode < cpy_modes; mode += 1)
981 usize_t max = (mode < 2U + desc->near_modes) ?
982 desc->copyadd_near_cpy_max :
983 desc->copyadd_same_cpy_max;
985 for (size1 = MIN_MATCH; size1 <= max; size1 += 1)
987 for (size2 = 1; size2 <= desc->copyadd_add_max; size2 += 1, d += 1)
989 d->type1 = XD3_CPY + mode;
997 XD3_ASSERT (d - tbl == 256);
1000 /* This function generates the static default code table. */
1001 static const xd3_dinst*
1002 xd3_rfc3284_code_table (void)
1004 static xd3_dinst __rfc3284_code_table[256];
1006 if (__rfc3284_code_table[0].type1 != XD3_RUN)
1008 xd3_build_code_table (& __rfc3284_code_table_desc, __rfc3284_code_table);
1011 return __rfc3284_code_table;
1015 #if GENERIC_ENCODE_TABLES
1016 /* This function generates the alternate code table. */
1017 static const xd3_dinst*
1018 xd3_alternate_code_table (void)
1020 static xd3_dinst __alternate_code_table[256];
1022 if (__alternate_code_table[0].type1 != XD3_RUN)
1024 xd3_build_code_table (& __alternate_code_table_desc, __alternate_code_table);
1027 return __alternate_code_table;
1030 /* This function computes the ideal second instruction INST based on
1031 * preceding instruction PREV. If it is possible to issue a double
1032 * instruction based on this pair it sets PREV->code2, otherwise it
1033 * sets INST->code1. */
1035 xd3_choose_instruction (const xd3_code_table_desc *desc, xd3_rinst *prev, xd3_rinst *inst)
1040 /* The 0th instruction is RUN */
1046 if (inst->size > desc->add_sizes)
1048 /* The first instruction is non-immediate ADD */
1053 /* The following ADD_SIZES instructions are immediate ADDs */
1054 inst->code1 = 1 + inst->size;
1056 /* Now check for a possible COPY-ADD double instruction */
1059 int prev_mode = prev->type - XD3_CPY;
1061 /* If previous is a copy. Note: as long as the previous
1062 * is not a RUN instruction, it should be a copy because
1063 * it cannot be an add. This check is more clear. */
1064 if (prev_mode >= 0 && inst->size <= desc->copyadd_add_max)
1066 const xd3_code_table_sizes *sizes = & desc->copyadd_max_sizes[prev_mode];
1068 /* This check and the inst->size-<= above are == in
1069 the default table. */
1070 if (prev->size <= sizes->cpy_max)
1072 /* The second and third exprs are 0 in the
1074 prev->code2 = sizes->offset +
1075 (sizes->mult * (prev->size - MIN_MATCH)) +
1076 (inst->size - MIN_ADD);
1085 int mode = inst->type - XD3_CPY;
1087 /* The large copy instruction is offset by the run, large add,
1088 * and immediate adds, then multipled by the number of
1089 * immediate copies plus one (the large copy) (i.e., if there
1090 * are 15 immediate copy instructions then there are 16 copy
1091 * instructions per mode). */
1092 inst->code1 = 2 + desc->add_sizes + (1 + desc->cpy_sizes) * mode;
1094 /* Now if the copy is short enough for an immediate instruction. */
1095 if (inst->size < MIN_MATCH + desc->cpy_sizes &&
1096 /* TODO: there needs to be a more comprehensive test for this
1097 * boundary condition, merge is now exercising code in which
1098 * size < MIN_MATCH is possible and it's unclear if the above
1099 * size < (MIN_MATCH + cpy_sizes) should be a <= from inspection
1100 * of the default table version below. */
1101 inst->size >= MIN_MATCH)
1103 inst->code1 += inst->size + 1 - MIN_MATCH;
1105 /* Now check for a possible ADD-COPY double instruction. */
1106 if ( (prev != NULL) &&
1107 (prev->type == XD3_ADD) &&
1108 (prev->size <= desc->addcopy_add_max) )
1110 const xd3_code_table_sizes *sizes = & desc->addcopy_max_sizes[mode];
1112 if (inst->size <= sizes->cpy_max)
1114 prev->code2 = sizes->offset +
1115 (sizes->mult * (prev->size - MIN_ADD)) +
1116 (inst->size - MIN_MATCH);
1123 #else /* GENERIC_ENCODE_TABLES */
1125 /* This version of xd3_choose_instruction is hard-coded for the default
1128 xd3_choose_instruction (xd3_rinst *prev, xd3_rinst *inst)
1139 if (inst->size <= 17)
1141 inst->code1 += inst->size;
1143 if ( (inst->size == 1) &&
1145 (prev->size == 4) &&
1146 (prev->type >= XD3_CPY) )
1148 prev->code2 = 247 + (prev->type - XD3_CPY);
1156 int mode = inst->type - XD3_CPY;
1158 XD3_ASSERT (inst->type >= XD3_CPY && inst->type < 12);
1160 inst->code1 = 19 + 16 * mode;
1162 if (inst->size <= 18 && inst->size >= 4)
1164 inst->code1 += inst->size - 3;
1166 if ( (prev != NULL) &&
1167 (prev->type == XD3_ADD) &&
1170 if ( (inst->size <= 6) &&
1173 prev->code2 = 163 + (mode * 12) + (3 * (prev->size - 1)) + (inst->size - 4);
1175 XD3_ASSERT (prev->code2 <= 234);
1177 else if ( (inst->size == 4) &&
1180 prev->code2 = 235 + ((mode - 6) * 4) + (prev->size - 1);
1182 XD3_ASSERT (prev->code2 <= 246);
1187 XD3_ASSERT (inst->code1 <= 162);
1192 #endif /* GENERIC_ENCODE_TABLES */
1194 /***********************************************************************
1195 Instruction table encoder/decoder
1196 ***********************************************************************/
1198 #if GENERIC_ENCODE_TABLES
1199 #if GENERIC_ENCODE_TABLES_COMPUTE == 0
1201 /* In this case, we hard-code the result of
1202 * compute_code_table_encoding for each alternate code table,
1203 * presuming that saves time/space. This has been 131 bytes, but
1204 * secondary compression was turned off. */
1205 static const uint8_t __alternate_code_table_compressed[178] =
1206 {0xd6,0xc3,0xc4,0x00,0x00,0x01,0x8a,0x6f,0x40,0x81,0x27,0x8c,0x00,0x00,0x4a,0x4a,0x0d,0x02,0x01,0x03,
1207 0x01,0x03,0x00,0x01,0x00,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,
1208 0x0f,0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x00,0x01,0x01,0x01,0x02,0x02,0x02,0x03,0x03,0x03,0x04,
1209 0x04,0x04,0x04,0x00,0x04,0x05,0x06,0x01,0x02,0x03,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x05,0x05,0x05,
1210 0x06,0x06,0x06,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x00,0x02,0x00,0x18,0x13,0x63,0x00,0x1b,0x00,0x54,
1211 0x00,0x15,0x23,0x6f,0x00,0x28,0x13,0x54,0x00,0x15,0x01,0x1a,0x31,0x23,0x6c,0x0d,0x23,0x48,0x00,0x15,
1212 0x93,0x6f,0x00,0x28,0x04,0x23,0x51,0x04,0x32,0x00,0x2b,0x00,0x12,0x00,0x12,0x00,0x12,0x00,0x12,0x00,
1213 0x12,0x00,0x12,0x53,0x57,0x9c,0x07,0x43,0x6f,0x00,0x34,0x00,0x0c,0x00,0x0c,0x00,0x0c,0x00,0x0c,0x00,
1214 0x0c,0x00,0x0c,0x00,0x15,0x00,0x82,0x6f,0x00,0x15,0x12,0x0c,0x00,0x03,0x03,0x00,0x06,0x00,};
1217 xd3_compute_alternate_table_encoding (xd3_stream *stream, const uint8_t **data, usize_t *size)
1219 (*data) = __alternate_code_table_compressed;
1220 (*size) = sizeof (__alternate_code_table_compressed);
1226 /* The alternate code table will be computed and stored here. */
1227 static uint8_t __alternate_code_table_compressed[CODE_TABLE_VCDIFF_SIZE];
1228 static usize_t __alternate_code_table_compressed_size;
1230 /* This function generates a delta describing the code table for
1231 * encoding within a VCDIFF file. This function is NOT thread safe
1232 * because it is only intended that this function is used to generate
1233 * statically-compiled strings. "comp_string" must be sized
1234 * CODE_TABLE_VCDIFF_SIZE. */
1235 int xd3_compute_code_table_encoding (xd3_stream *in_stream,
1236 const xd3_dinst *code_table,
1237 uint8_t *comp_string,
1238 usize_t *comp_string_size)
1240 /* Use DJW secondary compression if it is on by default. This saves
1241 * about 20 bytes. */
1242 uint8_t dflt_string[CODE_TABLE_STRING_SIZE];
1243 uint8_t code_string[CODE_TABLE_STRING_SIZE];
1245 xd3_compute_code_table_string (xd3_rfc3284_code_table (), dflt_string);
1246 xd3_compute_code_table_string (code_table, code_string);
1248 return xd3_encode_memory (code_string, CODE_TABLE_STRING_SIZE,
1249 dflt_string, CODE_TABLE_STRING_SIZE,
1250 comp_string, comp_string_size,
1251 CODE_TABLE_VCDIFF_SIZE,
1255 /* Compute a delta between alternate and rfc3284 tables. As soon as
1256 * another alternate table is added, this code should become generic.
1257 * For now there is only one alternate table for testing. */
1259 xd3_compute_alternate_table_encoding (xd3_stream *stream, const uint8_t **data, usize_t *size)
1263 if (__alternate_code_table_compressed[0] == 0)
1265 if ((ret = xd3_compute_code_table_encoding (stream, xd3_alternate_code_table (),
1266 __alternate_code_table_compressed,
1267 & __alternate_code_table_compressed_size)))
1272 /* During development of a new code table, enable this variable to print
1273 * the new static contents and determine its size. At run time the
1274 * table will be filled in appropriately, but at least it should have
1275 * the proper size beforehand. */
1276 #if GENERIC_ENCODE_TABLES_COMPUTE_PRINT
1280 DP(RINT, "\nstatic const usize_t __alternate_code_table_compressed_size = %u;\n",
1281 __alternate_code_table_compressed_size);
1283 DP(RINT, "static const uint8_t __alternate_code_table_compressed[%u] =\n{",
1284 __alternate_code_table_compressed_size);
1286 for (i = 0; i < __alternate_code_table_compressed_size; i += 1)
1288 DP(RINT, "0x%02x,", __alternate_code_table_compressed[i]);
1289 if ((i % 20) == 19) { DP(RINT, "\n"); }
1297 (*data) = __alternate_code_table_compressed;
1298 (*size) = __alternate_code_table_compressed_size;
1302 #endif /* GENERIC_ENCODE_TABLES_COMPUTE != 0 */
1303 #endif /* GENERIC_ENCODE_TABLES */
1305 #endif /* XD3_ENCODER */
1307 /* This function generates the 1536-byte string specified in sections 5.4 and
1308 * 7 of rfc3284, which is used to represent a code table within a VCDIFF
1310 void xd3_compute_code_table_string (const xd3_dinst *code_table, uint8_t *str)
1314 XD3_ASSERT (CODE_TABLE_STRING_SIZE == 6 * 256);
1316 for (s = 0; s < 6; s += 1)
1318 for (i = 0; i < 256; i += 1)
1322 case 0: *str++ = (code_table[i].type1 >= XD3_CPY ? XD3_CPY : code_table[i].type1); break;
1323 case 1: *str++ = (code_table[i].type2 >= XD3_CPY ? XD3_CPY : code_table[i].type2); break;
1324 case 2: *str++ = (code_table[i].size1); break;
1325 case 3: *str++ = (code_table[i].size2); break;
1326 case 4: *str++ = (code_table[i].type1 >= XD3_CPY ? code_table[i].type1 - XD3_CPY : 0); break;
1327 case 5: *str++ = (code_table[i].type2 >= XD3_CPY ? code_table[i].type2 - XD3_CPY : 0); break;
1333 /* This function translates the code table string into the internal representation. The
1334 * stream's near and same-modes should already be set. */
1336 xd3_apply_table_string (xd3_stream *stream, const uint8_t *code_string)
1339 int modes = TOTAL_MODES (stream);
1340 xd3_dinst *code_table;
1342 if ((code_table = stream->code_table_alloc =
1343 (xd3_dinst*) xd3_alloc (stream,
1344 (usize_t) sizeof (xd3_dinst),
1350 for (s = 0; s < 6; s += 1)
1352 for (i = 0; i < 256; i += 1)
1357 if (*code_string > XD3_CPY)
1359 stream->msg = "invalid code-table opcode";
1360 return XD3_INTERNAL;
1362 code_table[i].type1 = *code_string++;
1365 if (*code_string > XD3_CPY)
1367 stream->msg = "invalid code-table opcode";
1368 return XD3_INTERNAL;
1370 code_table[i].type2 = *code_string++;
1373 if (*code_string != 0 && code_table[i].type1 == XD3_NOOP)
1375 stream->msg = "invalid code-table size";
1376 return XD3_INTERNAL;
1378 code_table[i].size1 = *code_string++;
1381 if (*code_string != 0 && code_table[i].type2 == XD3_NOOP)
1383 stream->msg = "invalid code-table size";
1384 return XD3_INTERNAL;
1386 code_table[i].size2 = *code_string++;
1389 if (*code_string >= modes)
1391 stream->msg = "invalid code-table mode";
1392 return XD3_INTERNAL;
1394 if (*code_string != 0 && code_table[i].type1 != XD3_CPY)
1396 stream->msg = "invalid code-table mode";
1397 return XD3_INTERNAL;
1399 code_table[i].type1 += *code_string++;
1402 if (*code_string >= modes)
1404 stream->msg = "invalid code-table mode";
1405 return XD3_INTERNAL;
1407 if (*code_string != 0 && code_table[i].type2 != XD3_CPY)
1409 stream->msg = "invalid code-table mode";
1410 return XD3_INTERNAL;
1412 code_table[i].type2 += *code_string++;
1418 stream->code_table = code_table;
1422 /* This function applies a code table delta and returns an actual code table. */
1424 xd3_apply_table_encoding (xd3_stream *in_stream, const uint8_t *data, usize_t size)
1426 uint8_t dflt_string[CODE_TABLE_STRING_SIZE];
1427 uint8_t code_string[CODE_TABLE_STRING_SIZE];
1431 xd3_compute_code_table_string (xd3_rfc3284_code_table (), dflt_string);
1433 if ((ret = xd3_decode_memory (data, size,
1434 dflt_string, CODE_TABLE_STRING_SIZE,
1435 code_string, &code_size,
1436 CODE_TABLE_STRING_SIZE,
1437 0))) { return ret; }
1439 if (code_size != sizeof (code_string))
1441 in_stream->msg = "corrupt code-table encoding";
1442 return XD3_INTERNAL;
1445 return xd3_apply_table_string (in_stream, code_string);
1448 /***********************************************************************/
1451 xd3_swap_uint8p (uint8_t** p1, uint8_t** p2)
1459 xd3_swap_usize_t (usize_t* p1, usize_t* p2)
1466 /* It's not constant time, but it computes the log. */
1468 xd3_check_pow2 (usize_t value, usize_t *logof)
1472 if (logof == NULL) {
1478 for (; x != 0; x <<= 1, *logof += 1)
1486 return XD3_INTERNAL;
1490 xd3_pow2_roundup (usize_t x)
1500 xd3_round_blksize (usize_t sz, usize_t blksz)
1502 usize_t mod = sz & (blksz-1);
1504 XD3_ASSERT (xd3_check_pow2 (blksz, NULL) == 0);
1506 return mod ? (sz + (blksz - mod)) : sz;
1509 /***********************************************************************
1510 Adler32 stream function: code copied from Zlib, defined in RFC1950
1511 ***********************************************************************/
1513 #define A32_BASE 65521L /* Largest prime smaller than 2^16 */
1514 #define A32_NMAX 5552 /* NMAX is the largest n such that 255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1 */
1516 #define A32_DO1(buf,i) {s1 += buf[i]; s2 += s1;}
1517 #define A32_DO2(buf,i) A32_DO1(buf,i); A32_DO1(buf,i+1);
1518 #define A32_DO4(buf,i) A32_DO2(buf,i); A32_DO2(buf,i+2);
1519 #define A32_DO8(buf,i) A32_DO4(buf,i); A32_DO4(buf,i+4);
1520 #define A32_DO16(buf) A32_DO8(buf,0); A32_DO8(buf,8);
1522 static unsigned long adler32 (unsigned long adler, const uint8_t *buf,
1525 unsigned long s1 = adler & 0xffff;
1526 unsigned long s2 = (adler >> 16) & 0xffff;
1531 k = (len < A32_NMAX) ? len : A32_NMAX;
1555 return (s2 << 16) | s1;
1558 /***********************************************************************
1560 ***********************************************************************/
1564 xd3_comprun (const uint8_t *seg, usize_t slook, uint8_t *run_cp)
1570 for (i = 0; i < slook; i += 1)
1581 /***********************************************************************
1582 Basic encoder/decoder functions
1583 ***********************************************************************/
1586 xd3_decode_byte (xd3_stream *stream, usize_t *val)
1588 if (stream->avail_in == 0)
1590 stream->msg = "further input required";
1594 (*val) = stream->next_in[0];
1601 xd3_decode_bytes (xd3_stream *stream, uint8_t *buf, usize_t *pos, usize_t size)
1606 /* Note: The case where (*pos == size) happens when a zero-length
1607 * appheader or code table is transmitted, but there is nothing in
1608 * the standard against that. */
1611 if (stream->avail_in == 0)
1613 stream->msg = "further input required";
1618 take = min (want, stream->avail_in);
1620 memcpy (buf + *pos, stream->next_in, (size_t) take);
1622 DECODE_INPUT (take);
1631 xd3_emit_byte (xd3_stream *stream,
1632 xd3_output **outputp,
1635 xd3_output *output = (*outputp);
1637 if (output->next == output->avail)
1639 xd3_output *aoutput;
1641 if ((aoutput = xd3_alloc_output (stream, output)) == NULL)
1646 output = (*outputp) = aoutput;
1649 output->base[output->next++] = code;
1655 xd3_emit_bytes (xd3_stream *stream,
1656 xd3_output **outputp,
1657 const uint8_t *base,
1660 xd3_output *output = (*outputp);
1666 if (output->next == output->avail)
1668 xd3_output *aoutput;
1670 if ((aoutput = xd3_alloc_output (stream, output)) == NULL)
1675 output = (*outputp) = aoutput;
1678 take = min (output->avail - output->next, size);
1680 memcpy (output->base + output->next, base, (size_t) take);
1682 output->next += take;
1690 #endif /* XD3_ENCODER */
1692 /*********************************************************************
1693 Integer encoder/decoder functions
1694 **********************************************************************/
1696 #define DECODE_INTEGER_TYPE(PART,OFLOW) \
1697 while (stream->avail_in != 0) \
1699 usize_t next = stream->next_in[0]; \
1705 stream->msg = "overflow in decode_integer"; \
1706 return XD3_INVALID_INPUT; \
1709 PART = (PART << 7) | (next & 127); \
1711 if ((next & 128) == 0) \
1719 stream->msg = "further input required"; \
1722 #define READ_INTEGER_TYPE(TYPE, OFLOW) \
1724 const uint8_t *inp = (*inpp); \
1731 stream->msg = "end-of-input in read_integer"; \
1732 return XD3_INVALID_INPUT; \
1737 stream->msg = "overflow in read_intger"; \
1738 return XD3_INVALID_INPUT; \
1742 val = (val << 7) | (next & 127); \
1744 while (next & 128); \
1751 #define EMIT_INTEGER_TYPE() \
1752 /* max 64-bit value in base-7 encoding is 9.1 bytes */ \
1754 usize_t bufi = 10; \
1756 /* This loop performs division and turns on all MSBs. */ \
1759 buf[--bufi] = (num & 127) | 128; \
1764 /* Turn off MSB of the last byte. */ \
1767 return xd3_emit_bytes (stream, output, buf + bufi, 10 - bufi)
1769 #define IF_SIZEOF32(x) if (num < (1U << (7 * (x)))) return (x);
1770 #define IF_SIZEOF64(x) if (num < (1ULL << (7 * (x)))) return (x);
1773 static inline uint32_t
1774 xd3_sizeof_uint32_t (uint32_t num)
1784 xd3_decode_uint32_t (xd3_stream *stream, uint32_t *val)
1785 { DECODE_INTEGER_TYPE (stream->dec_32part, UINT32_OFLOW_MASK); }
1788 xd3_read_uint32_t (xd3_stream *stream, const uint8_t **inpp,
1789 const uint8_t *max, uint32_t *valp)
1790 { READ_INTEGER_TYPE (uint32_t, UINT32_OFLOW_MASK); }
1794 xd3_emit_uint32_t (xd3_stream *stream, xd3_output **output, uint32_t num)
1795 { EMIT_INTEGER_TYPE (); }
1801 xd3_decode_uint64_t (xd3_stream *stream, uint64_t *val)
1802 { DECODE_INTEGER_TYPE (stream->dec_64part, UINT64_OFLOW_MASK); }
1806 xd3_emit_uint64_t (xd3_stream *stream, xd3_output **output, uint64_t num)
1807 { EMIT_INTEGER_TYPE (); }
1810 /* These are tested but not used */
1813 xd3_read_uint64_t (xd3_stream *stream, const uint8_t **inpp,
1814 const uint8_t *max, uint64_t *valp)
1815 { READ_INTEGER_TYPE (uint64_t, UINT64_OFLOW_MASK); }
1818 xd3_sizeof_uint64_t (uint64_t num)
1836 /***********************************************************************
1838 ***********************************************************************/
1841 xd3_alloc_cache (xd3_stream *stream)
1843 if (stream->acache.near_array != NULL)
1845 xd3_free (stream, stream->acache.near_array);
1848 if (stream->acache.same_array != NULL)
1850 xd3_free (stream, stream->acache.same_array);
1853 if (((stream->acache.s_near > 0) &&
1854 (stream->acache.near_array = (usize_t*)
1855 xd3_alloc (stream, stream->acache.s_near,
1856 (usize_t) sizeof (usize_t)))
1858 ((stream->acache.s_same > 0) &&
1859 (stream->acache.same_array = (usize_t*)
1860 xd3_alloc (stream, stream->acache.s_same * 256,
1861 (usize_t) sizeof (usize_t)))
1871 xd3_init_cache (xd3_addr_cache* acache)
1873 if (acache->s_near > 0)
1875 memset (acache->near_array, 0, acache->s_near * sizeof (usize_t));
1876 acache->next_slot = 0;
1879 if (acache->s_same > 0)
1881 memset (acache->same_array, 0, acache->s_same * 256 * sizeof (usize_t));
1886 xd3_update_cache (xd3_addr_cache* acache, usize_t addr)
1888 if (acache->s_near > 0)
1890 acache->near_array[acache->next_slot] = addr;
1891 acache->next_slot = (acache->next_slot + 1) % acache->s_near;
1894 if (acache->s_same > 0)
1896 acache->same_array[addr % (acache->s_same*256)] = addr;
1901 /* OPT: this gets called a lot, can it be optimized? */
1903 xd3_encode_address (xd3_stream *stream,
1909 usize_t i, bestm, ret;
1910 xd3_addr_cache* acache = & stream->acache;
1912 #define SMALLEST_INT(x) do { if (((x) & ~127U) == 0) { goto good; } } while (0)
1914 /* Attempt to find the address mode that yields the smallest integer value
1915 * for "d", the encoded address value, thereby minimizing the encoded size
1916 * of the address. */
1920 XD3_ASSERT (addr < here);
1922 SMALLEST_INT (bestd);
1924 if ((d = here-addr) < bestd)
1929 SMALLEST_INT (bestd);
1932 for (i = 0; i < acache->s_near; i += 1)
1934 /* Note: If we used signed computation here, we'd could compte d
1935 * and then check (d >= 0 && d < bestd). */
1936 if (addr >= acache->near_array[i])
1938 d = addr - acache->near_array[i];
1943 bestm = i+2; /* 2 counts the VCD_SELF, VCD_HERE modes */
1945 SMALLEST_INT (bestd);
1950 if (acache->s_same > 0 &&
1951 acache->same_array[d = addr%(acache->s_same*256)] == addr)
1954 /* 2 + s_near offsets past the VCD_NEAR modes */
1955 bestm = acache->s_near + 2 + d/256;
1957 if ((ret = xd3_emit_byte (stream, & ADDR_TAIL (stream), bestd)))
1966 if ((ret = xd3_emit_size (stream, & ADDR_TAIL (stream), bestd)))
1972 xd3_update_cache (acache, addr);
1981 xd3_decode_address (xd3_stream *stream, usize_t here,
1982 usize_t mode, const uint8_t **inpp,
1983 const uint8_t *max, uint32_t *valp)
1986 usize_t same_start = 2 + stream->acache.s_near;
1988 if (mode < same_start)
1990 if ((ret = xd3_read_size (stream, inpp, max, valp))) { return ret; }
1997 (*valp) = here - (*valp);
2000 (*valp) += stream->acache.near_array[mode - 2];
2008 stream->msg = "address underflow";
2009 return XD3_INVALID_INPUT;
2014 (*valp) = stream->acache.same_array[mode*256 + (**inpp)];
2019 xd3_update_cache (& stream->acache, *valp);
2024 /***********************************************************************
2026 ***********************************************************************/
2029 __xd3_alloc_func (void* opaque, usize_t items, usize_t size)
2031 return malloc ((size_t) items * (size_t) size);
2035 __xd3_free_func (void* opaque, void* address)
2041 xd3_alloc (xd3_stream *stream,
2045 void *a = stream->alloc (stream->opaque, elts, size);
2049 IF_DEBUG (stream->alloc_cnt += 1);
2050 IF_DEBUG2 (DP(RINT "[stream %p malloc] size %u ptr %p\n",
2051 stream, elts * size, a));
2055 stream->msg = "out of memory";
2062 xd3_free (xd3_stream *stream,
2067 IF_DEBUG (stream->free_cnt += 1);
2068 XD3_ASSERT (stream->free_cnt <= stream->alloc_cnt);
2069 IF_DEBUG2 (DP(RINT "[stream %p free] %p\n",
2071 stream->free (stream->opaque, ptr);
2077 xd3_alloc0 (xd3_stream *stream,
2081 void *a = xd3_alloc (stream, elts, size);
2085 memset (a, 0, (size_t) (elts * size));
2092 xd3_alloc_output (xd3_stream *stream,
2093 xd3_output *old_output)
2098 if (stream->enc_free != NULL)
2100 output = stream->enc_free;
2101 stream->enc_free = output->next_page;
2105 if ((output = (xd3_output*) xd3_alloc (stream, 1,
2106 (usize_t) sizeof (xd3_output)))
2112 if ((base = (uint8_t*) xd3_alloc (stream, XD3_ALLOCSIZE,
2113 sizeof (uint8_t))) == NULL)
2115 xd3_free (stream, output);
2119 output->base = base;
2120 output->avail = XD3_ALLOCSIZE;
2127 old_output->next_page = output;
2130 output->next_page = NULL;
2136 xd3_sizeof_output (xd3_output *output)
2140 for (; output; output = output->next_page)
2149 xd3_freelist_output (xd3_stream *stream,
2157 output = output->next_page;
2160 tmp->next_page = stream->enc_free;
2161 stream->enc_free = tmp;
2166 xd3_free_output (xd3_stream *stream,
2177 next = output->next_page;
2179 xd3_free (stream, output->base);
2180 xd3_free (stream, output);
2185 #endif /* XD3_ENCODER */
2188 xd3_free_stream (xd3_stream *stream)
2190 xd3_iopt_buflist *blist = stream->iopt_alloc;
2192 while (blist != NULL)
2194 xd3_iopt_buflist *tmp = blist;
2195 blist = blist->next;
2196 xd3_free (stream, tmp->buffer);
2197 xd3_free (stream, tmp);
2200 xd3_free (stream, stream->large_table);
2201 xd3_free (stream, stream->small_table);
2202 xd3_free (stream, stream->small_prev);
2207 for (i = 0; i < ENC_SECTS; i += 1)
2209 xd3_free_output (stream, stream->enc_heads[i]);
2211 xd3_free_output (stream, stream->enc_free);
2215 xd3_free (stream, stream->acache.near_array);
2216 xd3_free (stream, stream->acache.same_array);
2218 xd3_free (stream, stream->inst_sect.copied1);
2219 xd3_free (stream, stream->addr_sect.copied1);
2220 xd3_free (stream, stream->data_sect.copied1);
2222 xd3_free (stream, stream->dec_buffer);
2223 xd3_free (stream, (uint8_t*) stream->dec_lastwin);
2225 xd3_free (stream, stream->buf_in);
2226 xd3_free (stream, stream->dec_appheader);
2227 xd3_free (stream, stream->dec_codetbl);
2228 xd3_free (stream, stream->code_table_alloc);
2231 xd3_free (stream, stream->inst_sect.copied2);
2232 xd3_free (stream, stream->addr_sect.copied2);
2233 xd3_free (stream, stream->data_sect.copied2);
2235 if (stream->sec_type != NULL)
2237 stream->sec_type->destroy (stream, stream->sec_stream_d);
2238 stream->sec_type->destroy (stream, stream->sec_stream_i);
2239 stream->sec_type->destroy (stream, stream->sec_stream_a);
2243 xd3_free (stream, stream->whole_target.adds);
2244 xd3_free (stream, stream->whole_target.inst);
2245 xd3_free (stream, stream->whole_target.wininfo);
2247 XD3_ASSERT (stream->alloc_cnt == stream->free_cnt);
2249 memset (stream, 0, sizeof (xd3_stream));
2252 #if (XD3_DEBUG > 1 || VCDIFF_TOOLS)
2254 xd3_rtype_to_string (xd3_rtype type, int print_mode)
2272 case XD3_CPY + 0: return "CPY_0";
2273 case XD3_CPY + 1: return "CPY_1";
2274 case XD3_CPY + 2: return "CPY_2";
2275 case XD3_CPY + 3: return "CPY_3";
2276 case XD3_CPY + 4: return "CPY_4";
2277 case XD3_CPY + 5: return "CPY_5";
2278 case XD3_CPY + 6: return "CPY_6";
2279 case XD3_CPY + 7: return "CPY_7";
2280 case XD3_CPY + 8: return "CPY_8";
2281 case XD3_CPY + 9: return "CPY_9";
2282 default: return "CPY>9";
2287 /****************************************************************
2288 Stream configuration
2289 ******************************************************************/
2292 xd3_config_stream(xd3_stream *stream,
2297 xd3_smatcher *smatcher = &stream->smatcher;
2302 memset (config, 0, sizeof (*config));
2305 /* Initial setup: no error checks yet */
2306 memset (stream, 0, sizeof (*stream));
2308 stream->winsize = config->winsize ? config->winsize : XD3_DEFAULT_WINSIZE;
2309 stream->sprevsz = config->sprevsz ? config->sprevsz : XD3_DEFAULT_SPREVSZ;
2310 stream->srcwin_maxsz = config->srcwin_maxsz ?
2311 config->srcwin_maxsz : XD3_DEFAULT_SRCWINSZ;
2313 if (config->iopt_size == 0)
2315 stream->iopt_size = XD3_ALLOCSIZE / sizeof(xd3_rinst);
2316 stream->iopt_unlimited = 1;
2320 stream->iopt_size = config->iopt_size;
2323 stream->getblk = config->getblk;
2324 stream->alloc = config->alloc ? config->alloc : __xd3_alloc_func;
2325 stream->free = config->freef ? config->freef : __xd3_free_func;
2326 stream->opaque = config->opaque;
2327 stream->flags = config->flags;
2329 /* Secondary setup. */
2330 stream->sec_data = config->sec_data;
2331 stream->sec_inst = config->sec_inst;
2332 stream->sec_addr = config->sec_addr;
2334 stream->sec_data.data_type = DATA_SECTION;
2335 stream->sec_inst.data_type = INST_SECTION;
2336 stream->sec_addr.data_type = ADDR_SECTION;
2338 /* Check static sizes. */
2339 if (sizeof (usize_t) != SIZEOF_USIZE_T ||
2340 sizeof (xoff_t) != SIZEOF_XOFF_T ||
2341 (ret = xd3_check_pow2(XD3_ALLOCSIZE, NULL)))
2343 stream->msg = "incorrect compilation: wrong integer sizes";
2344 return XD3_INTERNAL;
2347 /* Check/set secondary compressor. */
2348 switch (stream->flags & XD3_SEC_TYPE)
2351 if (stream->flags & XD3_SEC_NOALL)
2353 stream->msg = "XD3_SEC flags require a secondary compressor type";
2354 return XD3_INTERNAL;
2362 stream->msg = "too many secondary compressor types set";
2363 return XD3_INTERNAL;
2366 /* Check/set encoder code table. */
2367 switch (stream->flags & XD3_ALT_CODE_TABLE) {
2369 stream->code_table_desc = & __rfc3284_code_table_desc;
2370 stream->code_table_func = xd3_rfc3284_code_table;
2372 #if GENERIC_ENCODE_TABLES
2373 case XD3_ALT_CODE_TABLE:
2374 stream->code_table_desc = & __alternate_code_table_desc;
2375 stream->code_table_func = xd3_alternate_code_table;
2376 stream->comp_table_func = xd3_compute_alternate_table_encoding;
2380 stream->msg = "alternate code table support was not compiled";
2381 return XD3_INTERNAL;
2385 if (smatcher->small_chain == 1 &&
2386 smatcher->small_lchain == 1)
2388 stream->sprevsz = 0;
2392 if ((ret = xd3_check_pow2 (stream->sprevsz, NULL)))
2394 stream->msg = "sprevsz is required to be a power of two";
2395 return XD3_INTERNAL;
2398 stream->sprevmask = stream->sprevsz - 1;
2401 /* Default scanner settings. */
2403 switch (config->smatch_cfg)
2405 IF_BUILD_SOFT(case XD3_SMATCH_SOFT:
2407 *smatcher = config->smatcher_soft;
2408 smatcher->string_match = __smatcher_soft.string_match;
2409 smatcher->name = __smatcher_soft.name;
2410 if (smatcher->large_look < MIN_MATCH ||
2411 smatcher->large_step < 1 ||
2412 smatcher->small_look < MIN_MATCH)
2414 stream->msg = "invalid soft string-match config";
2420 IF_BUILD_DEFAULT(case XD3_SMATCH_DEFAULT:
2421 *smatcher = __smatcher_default;
2423 IF_BUILD_SLOW(case XD3_SMATCH_SLOW:
2424 *smatcher = __smatcher_slow;
2426 IF_BUILD_FASTEST(case XD3_SMATCH_FASTEST:
2427 *smatcher = __smatcher_fastest;
2429 IF_BUILD_FASTER(case XD3_SMATCH_FASTER:
2430 *smatcher = __smatcher_faster;
2432 IF_BUILD_FAST(case XD3_SMATCH_FAST:
2433 *smatcher = __smatcher_fast;
2436 stream->msg = "invalid string match config type";
2437 return XD3_INTERNAL;
2440 if (config->smatch_cfg == XD3_SMATCH_DEFAULT &&
2441 (stream->flags & XD3_COMPLEVEL_MASK) != 0)
2443 int level = (stream->flags & XD3_COMPLEVEL_MASK) >> XD3_COMPLEVEL_SHIFT;
2448 IF_BUILD_FASTEST(*smatcher = __smatcher_fastest;
2451 IF_BUILD_FASTER(*smatcher = __smatcher_faster;
2453 case 3: case 4: case 5:
2454 IF_BUILD_FAST(*smatcher = __smatcher_fast;
2457 IF_BUILD_DEFAULT(*smatcher = __smatcher_default;
2460 IF_BUILD_SLOW(*smatcher = __smatcher_slow;
2462 IF_BUILD_DEFAULT(*smatcher = __smatcher_default;
2464 IF_BUILD_FAST(*smatcher = __smatcher_fast;
2466 IF_BUILD_FASTER(*smatcher = __smatcher_faster;
2468 IF_BUILD_FASTEST(*smatcher = __smatcher_fastest;
2477 /***********************************************************
2479 ***********************************************************/
2482 xoff_t xd3_source_eof(const xd3_source *src)
2484 xoff_t r = (src->blksize * src->max_blkno) + (xoff_t)src->onlastblk;
2489 usize_t xd3_bytes_on_srcblk (xd3_source *src, xoff_t blkno)
2491 usize_t r = (blkno == src->max_blkno ?
2497 /* This function interfaces with the client getblk function, checks
2498 * its results, updates frontier_blkno, max_blkno, onlastblk, eof_known. */
2500 xd3_getblk (xd3_stream *stream, xoff_t blkno)
2503 xd3_source *source = stream->src;
2505 if (source->curblk == NULL || blkno != source->curblkno)
2507 source->getblkno = blkno;
2509 if (stream->getblk == NULL)
2511 stream->msg = "getblk source input";
2512 return XD3_GETSRCBLK;
2515 ret = stream->getblk (stream, source, blkno);
2518 IF_DEBUG1 (DP(RINT "[getblk] app error blkno %"Q"u: %s\n",
2519 blkno, xd3_strerror (ret)));
2524 if (blkno >= source->frontier_blkno)
2526 if (blkno > source->max_blkno)
2528 source->max_blkno = blkno;
2529 source->onlastblk = source->onblk;
2532 if (source->onblk == source->blksize)
2534 source->frontier_blkno = blkno + 1;
2536 IF_DEBUG2 (DP(RINT "[getblk] full source blkno %"Q"u: "
2537 "source length unknown %"Q"u\n",
2539 xd3_source_eof (source)));
2543 if (!source->eof_known)
2545 IF_DEBUG2 (DP(RINT "[getblk] eof block has %d bytes; "
2546 "source length known %"Q"u\n",
2547 xd3_bytes_on_srcblk (source, blkno),
2548 xd3_source_eof (source)));
2549 source->eof_known = 1;
2552 source->frontier_blkno = blkno;
2556 XD3_ASSERT (source->curblk != NULL);
2557 IF_DEBUG2 (DP(RINT "[getblk] read source block %"Q"u onblk %u blksize %u\n",
2558 blkno, source->onblk, source->blksize));
2560 if (blkno == source->max_blkno)
2562 /* In case the application sets the source as 1 block w/ a
2564 source->onlastblk = source->onblk;
2566 if (source->onblk == source->blksize)
2568 source->frontier_blkno = blkno + 1;
2574 /***********************************************************
2576 ***************************************************************/
2579 xd3_set_source (xd3_stream *stream,
2588 /* Enforce power-of-two blocksize so that source-block number
2589 * calculations are cheap. */
2590 if (!xd3_check_pow2 (src->blksize, &shiftby) == 0)
2593 src->blksize = xd3_pow2_roundup(src->blksize);
2594 check = xd3_check_pow2 (src->blksize, &shiftby);
2595 XD3_ASSERT (check == 0);
2596 IF_DEBUG1 (DP(RINT "raising srcblksz to %u\n", src->blksize));
2599 src->shiftby = shiftby;
2600 src->maskby = (1 << shiftby) - 1;
2605 xd3_set_source_and_size (xd3_stream *stream,
2606 xd3_source *user_source,
2607 xoff_t source_size) {
2608 int ret = xd3_set_source (stream, user_source);
2611 stream->src->eof_known = 1;
2612 IF_DEBUG2 (DP(RINT "[set source] size known %"Q"u\n",
2615 xd3_blksize_div(source_size,
2617 &stream->src->max_blkno,
2618 &stream->src->onlastblk);
2624 xd3_abort_stream (xd3_stream *stream)
2626 stream->dec_state = DEC_ABORTED;
2627 stream->enc_state = ENC_ABORTED;
2631 xd3_close_stream (xd3_stream *stream)
2633 if (stream->enc_state != 0 && stream->enc_state != ENC_ABORTED)
2635 if (stream->buf_leftover != NULL)
2637 stream->msg = "encoding is incomplete";
2638 return XD3_INTERNAL;
2641 if (stream->enc_state == ENC_POSTWIN)
2644 xd3_encode_reset (stream);
2646 stream->current_window += 1;
2647 stream->enc_state = ENC_INPUT;
2650 /* If encoding, should be ready for more input but not actually
2652 if (stream->enc_state != ENC_INPUT || stream->avail_in != 0)
2654 stream->msg = "encoding is incomplete";
2655 return XD3_INTERNAL;
2660 switch (stream->dec_state)
2664 /* TODO: Address the zero-byte ambiguity. Does the encoder
2665 * emit a window or not? If so, then catch an error here.
2666 * If not, need another routine to say
2667 * decode_at_least_one_if_empty. */
2671 /* If decoding, should be ready for the next window. */
2672 stream->msg = "EOF in decode";
2673 return XD3_INTERNAL;
2680 /**************************************************************
2682 ****************************************************************/
2685 xd3_get_appheader (xd3_stream *stream,
2689 if (stream->dec_state < DEC_WININD)
2691 stream->msg = "application header not available";
2692 return XD3_INTERNAL;
2695 (*data) = stream->dec_appheader;
2696 (*size) = stream->dec_appheadsz;
2700 /**********************************************************
2702 *************************************************/
2704 #include "xdelta3-decode.h"
2706 /****************************************************************
2708 *****************************************************************/
2712 xd3_set_appheader (xd3_stream *stream,
2713 const uint8_t *data,
2716 stream->enc_appheader = data;
2717 stream->enc_appheadsz = size;
2722 xd3_iopt_check (xd3_stream *stream)
2724 usize_t ul = xd3_rlist_length (& stream->iopt_used);
2725 usize_t fl = xd3_rlist_length (& stream->iopt_free);
2727 return (ul + fl + (stream->iout ? 1 : 0)) == stream->iopt_size;
2732 xd3_iopt_free (xd3_stream *stream, xd3_rinst *i)
2734 xd3_rinst *n = xd3_rlist_remove (i);
2735 xd3_rlist_push_back (& stream->iopt_free, i);
2740 xd3_iopt_free_nonadd (xd3_stream *stream, xd3_rinst *i)
2742 if (i->type != XD3_ADD)
2744 xd3_rlist_push_back (& stream->iopt_free, i);
2748 /* When an instruction is ready to flush from the iopt buffer, this
2749 * function is called to produce an encoding. It writes the
2750 * instruction plus size, address, and data to the various encoding
2753 xd3_iopt_finish_encoding (xd3_stream *stream, xd3_rinst *inst)
2757 /* Check for input overflow. */
2758 XD3_ASSERT (inst->pos + inst->size <= stream->avail_in);
2764 /* the address may have an offset if there is a source window. */
2766 xd3_source *src = stream->src;
2770 /* If there is a source copy, the source must have its
2771 * source window decided before we can encode. This can
2772 * be bad -- we have to make this decision even if no
2773 * source matches have been found. */
2774 if (stream->srcwin_decided == 0)
2776 if ((ret = xd3_srcwin_setup (stream))) { return ret; }
2780 stream->srcwin_decided_early = (!stream->src->eof_known ||
2781 (stream->srcwin_cksum_pos <
2782 xd3_source_eof (stream->src)));
2785 /* xtra field indicates the copy is from the source */
2788 XD3_ASSERT (inst->addr >= src->srcbase);
2789 XD3_ASSERT (inst->addr + inst->size <=
2790 src->srcbase + src->srclen);
2791 addr = (usize_t)(inst->addr - src->srcbase);
2792 stream->n_scpy += 1;
2793 stream->l_scpy += (xoff_t) inst->size;
2797 /* with source window: target copy address is offset
2799 addr = stream->taroff + (usize_t) inst->addr;
2800 stream->n_tcpy += 1;
2801 stream->l_tcpy += (xoff_t) inst->size;
2806 addr = (usize_t) inst->addr;
2807 stream->n_tcpy += 1;
2808 stream->l_tcpy += inst->size;
2811 /* Note: used to assert inst->size >= MIN_MATCH, but not true
2812 * for merge operations & identical match heuristics. */
2813 /* the "here" position is always offset by taroff */
2814 if ((ret = xd3_encode_address (stream, addr, inst->pos + stream->taroff,
2822 DP(RINT "[iopt copy:%d] pos %"Q"u-%"Q"u addr %"Q"u-%"Q"u size %u\n",
2824 stream->total_in + inst->pos,
2825 stream->total_in + inst->pos + inst->size,
2826 inst->addr, inst->addr + inst->size, inst->size);
2832 XD3_ASSERT (inst->size >= MIN_MATCH);
2834 if ((ret = xd3_emit_byte (stream, & DATA_TAIL (stream), inst->xtra))) { return ret; }
2837 stream->l_run += inst->size;
2841 DP(RINT "[iopt run:%d] pos %"Q"u size %u\n", cnt++, stream->total_in + inst->pos, inst->size);
2847 if ((ret = xd3_emit_bytes (stream, & DATA_TAIL (stream),
2848 stream->next_in + inst->pos, inst->size))) { return ret; }
2851 stream->l_add += inst->size;
2855 DP(RINT "[iopt add:%d] pos %"Q"u size %u\n", cnt++, stream->total_in + inst->pos, inst->size);
2862 /* This is the only place stream->unencoded_offset is incremented. */
2863 XD3_ASSERT (stream->unencoded_offset == inst->pos);
2864 stream->unencoded_offset += inst->size;
2868 XD3_CHOOSE_INSTRUCTION (stream, stream->iout, inst);
2870 if (stream->iout != NULL)
2872 if (stream->iout->code2 != 0)
2874 if ((ret = xd3_emit_double (stream, stream->iout, inst, stream->iout->code2))) { return ret; }
2876 xd3_iopt_free_nonadd (stream, stream->iout);
2877 xd3_iopt_free_nonadd (stream, inst);
2878 stream->iout = NULL;
2883 if ((ret = xd3_emit_single (stream, stream->iout, stream->iout->code1))) { return ret; }
2885 xd3_iopt_free_nonadd (stream, stream->iout);
2889 stream->iout = inst;
2894 /* This possibly encodes an add instruction, iadd, which must remain
2895 * on the stack until the following call to
2896 * xd3_iopt_finish_encoding. */
2898 xd3_iopt_add (xd3_stream *stream, usize_t pos, xd3_rinst *iadd)
2901 usize_t off = stream->unencoded_offset;
2905 iadd->type = XD3_ADD;
2907 iadd->size = pos - off;
2909 if ((ret = xd3_iopt_finish_encoding (stream, iadd))) { return ret; }
2915 /* This function calls xd3_iopt_finish_encoding to finish encoding an
2916 * instruction, and it may also produce an add instruction for an
2917 * unmatched region. */
2919 xd3_iopt_add_encoding (xd3_stream *stream, xd3_rinst *inst)
2924 if ((ret = xd3_iopt_add (stream, inst->pos, & iadd))) { return ret; }
2926 if ((ret = xd3_iopt_finish_encoding (stream, inst))) { return ret; }
2931 /* Generates a final add instruction to encode the remaining input. */
2933 xd3_iopt_add_finalize (xd3_stream *stream)
2938 if ((ret = xd3_iopt_add (stream, stream->avail_in, & iadd))) { return ret; }
2942 if ((ret = xd3_emit_single (stream, stream->iout, stream->iout->code1))) { return ret; }
2944 xd3_iopt_free_nonadd (stream, stream->iout);
2945 stream->iout = NULL;
2951 /* Compact the instruction buffer by choosing the best non-overlapping
2952 * instructions when lazy string-matching. There are no ADDs in the
2953 * iopt buffer because those are synthesized in xd3_iopt_add_encoding
2954 * and during xd3_iopt_add_finalize. */
2956 xd3_iopt_flush_instructions (xd3_stream *stream, int force)
2958 xd3_rinst *r1 = xd3_rlist_front (& stream->iopt_used);
2969 XD3_ASSERT (xd3_iopt_check (stream));
2971 /* Note: once tried to skip this step if it's possible to assert
2972 * there are no overlapping instructions. Doesn't work because
2973 * xd3_opt_erase leaves overlapping instructions. */
2974 while (! xd3_rlist_end (& stream->iopt_used, r1) &&
2975 ! xd3_rlist_end (& stream->iopt_used, r2 = xd3_rlist_next (r1)))
2977 r1end = r1->pos + r1->size;
2979 /* If the instructions do not overlap, continue. */
2980 if (r1end <= r2->pos)
2986 r2end = r2->pos + r2->size;
2988 /* The min_match adjustments prevent this. */
2989 XD3_ASSERT (r2end > (r1end + LEAST_MATCH_INCR));
2991 /* If r3 is available... */
2992 if (! xd3_rlist_end (& stream->iopt_used, r3 = xd3_rlist_next (r2)))
2994 /* If r3 starts before r1 finishes or just about, r2 is irrelevant */
2995 if (r3->pos <= r1end + 1)
2997 xd3_iopt_free (stream, r2);
3003 /* Unless force, end the loop when r3 is not available. */
3007 r2off = r2->pos - r1->pos;
3008 r2moff = r2end - r1end;
3009 gap = r2end - r1->pos;
3011 /* If the two matches overlap almost entirely, choose the better match
3012 * and discard the other. The else branch can still create inefficient
3013 * copies, e.g., a 4-byte copy that takes 4 bytes to encode, which
3014 * xd3_smatch() wouldn't allow by its crude efficiency check. However,
3015 * in this case there are adjacent copies which mean the add would cost
3016 * one extra byte. Allow the inefficiency here. */
3017 if (gap < 2*MIN_MATCH || r2moff <= 2 || r2off <= 2)
3019 /* Only one match should be used, choose the longer one. */
3020 if (r1->size < r2->size)
3022 xd3_iopt_free (stream, r1);
3027 /* We are guaranteed that r1 does not overlap now, so advance past r2 */
3028 r1 = xd3_iopt_free (stream, r2);
3034 /* Shorten one of the instructions -- could be optimized
3035 * based on the address cache. */
3040 XD3_ASSERT (r1end > r2->pos && r2end > r1->pos);
3042 /* Try to balance the length of both instructions, but avoid
3043 * making both longer than MAX_MATCH_SPLIT . */
3045 newsize = min (MAX_MATCH_SPLIT, gap - average);
3047 /* Should be possible to simplify this code. */
3048 if (newsize > r1->size)
3051 adjust1 = r1end - r2->pos;
3053 else if (newsize > r2->size)
3056 adjust1 = r1end - r2->pos;
3058 XD3_ASSERT (r1->size > adjust1);
3060 r1->size -= adjust1;
3062 /* don't shorten r2 */
3068 adjust1 = r1->size - newsize;
3070 if (r2->pos > r1end - adjust1)
3072 adjust1 -= r2->pos - (r1end - adjust1);
3075 XD3_ASSERT (r1->size > adjust1);
3077 r1->size -= adjust1;
3080 XD3_ASSERT (r1->pos + r1->size >= r2->pos);
3082 adjust1 = r1->pos + r1->size - r2->pos;
3085 /* Fallthrough above if-else, shorten r2 */
3086 XD3_ASSERT (r2->size > adjust1);
3088 r2->size -= adjust1;
3090 r2->addr += adjust1;
3092 XD3_ASSERT (r1->size >= MIN_MATCH);
3093 XD3_ASSERT (r2->size >= MIN_MATCH);
3099 XD3_ASSERT (xd3_iopt_check (stream));
3101 /* If forcing, pick instructions until the list is empty, otherwise
3102 * this empties 50% of the queue. */
3103 for (flushed = 0; ! xd3_rlist_empty (& stream->iopt_used); )
3105 xd3_rinst *renc = xd3_rlist_pop_front (& stream->iopt_used);
3106 if ((ret = xd3_iopt_add_encoding (stream, renc)))
3113 if (++flushed > stream->iopt_size / 2)
3118 /* If there are only two instructions remaining, break,
3119 * because they were not optimized. This means there were
3120 * more than 50% eliminated by the loop above. */
3121 r1 = xd3_rlist_front (& stream->iopt_used);
3122 if (xd3_rlist_end(& stream->iopt_used, r1) ||
3123 xd3_rlist_end(& stream->iopt_used, r2 = xd3_rlist_next (r1)) ||
3124 xd3_rlist_end(& stream->iopt_used, r3 = xd3_rlist_next (r2)))
3131 XD3_ASSERT (xd3_iopt_check (stream));
3133 XD3_ASSERT (!force || xd3_rlist_length (& stream->iopt_used) == 0);
3139 xd3_iopt_get_slot (xd3_stream *stream, xd3_rinst** iptr)
3144 if (xd3_rlist_empty (& stream->iopt_free))
3146 if (stream->iopt_unlimited)
3148 usize_t elts = XD3_ALLOCSIZE / sizeof(xd3_rinst);
3150 if ((ret = xd3_alloc_iopt (stream, elts)))
3155 stream->iopt_size += elts;
3159 if ((ret = xd3_iopt_flush_instructions (stream, 0))) { return ret; }
3161 XD3_ASSERT (! xd3_rlist_empty (& stream->iopt_free));
3165 i = xd3_rlist_pop_back (& stream->iopt_free);
3167 xd3_rlist_push_back (& stream->iopt_used, i);
3171 ++stream->i_slots_used;
3176 /* A copy is about to be emitted that extends backwards to POS,
3177 * therefore it may completely cover some existing instructions in the
3178 * buffer. If an instruction is completely covered by this new match,
3179 * erase it. If the new instruction is covered by the previous one,
3180 * return 1 to skip it. */
3182 xd3_iopt_erase (xd3_stream *stream, usize_t pos, usize_t size)
3184 while (! xd3_rlist_empty (& stream->iopt_used))
3186 xd3_rinst *r = xd3_rlist_back (& stream->iopt_used);
3188 /* Verify that greedy is working. The previous instruction
3189 * should end before the new one begins. */
3190 XD3_ASSERT ((stream->flags & XD3_BEGREEDY) == 0 || (r->pos + r->size <= pos));
3191 /* Verify that min_match is working. The previous instruction
3192 * should end before the new one ends. */
3193 XD3_ASSERT ((stream->flags & XD3_BEGREEDY) != 0 || (r->pos + r->size < pos + size));
3195 /* See if the last instruction starts before the new
3196 * instruction. If so, there is nothing to erase. */
3202 /* Otherwise, the new instruction covers the old one, delete it
3204 xd3_rlist_remove (r);
3205 xd3_rlist_push_back (& stream->iopt_free, r);
3206 --stream->i_slots_used;
3210 /* This function tells the last matched input position. */
3212 xd3_iopt_last_matched (xd3_stream *stream)
3216 if (xd3_rlist_empty (& stream->iopt_used))
3221 r = xd3_rlist_back (& stream->iopt_used);
3223 return r->pos + r->size;
3226 /*********************************************************
3228 ***********************************************************/
3231 xd3_emit_single (xd3_stream *stream, xd3_rinst *single, usize_t code)
3233 int has_size = stream->code_table[code].size1 == 0;
3236 IF_DEBUG2 (DP(RINT "[emit1] %u %s (%u) code %u\n",
3238 xd3_rtype_to_string ((xd3_rtype) single->type, 0),
3242 if ((ret = xd3_emit_byte (stream, & INST_TAIL (stream), code)))
3249 if ((ret = xd3_emit_size (stream, & INST_TAIL (stream), single->size)))
3259 xd3_emit_double (xd3_stream *stream, xd3_rinst *first,
3260 xd3_rinst *second, usize_t code)
3264 /* All double instructions use fixed sizes, so all we need to do is
3265 * output the instruction code, no sizes. */
3266 XD3_ASSERT (stream->code_table[code].size1 != 0 &&
3267 stream->code_table[code].size2 != 0);
3269 if ((ret = xd3_emit_byte (stream, & INST_TAIL (stream), code)))
3274 IF_DEBUG2 (DP(RINT "[emit2]: %u %s (%u) %s (%u) code %u\n",
3276 xd3_rtype_to_string ((xd3_rtype) first->type, 0),
3278 xd3_rtype_to_string ((xd3_rtype) second->type, 0),
3285 /* This enters a potential run instruction into the iopt buffer. The
3286 * position argument is relative to the target window. */
3288 xd3_emit_run (xd3_stream *stream, usize_t pos, usize_t size, uint8_t *run_c)
3293 if ((ret = xd3_iopt_get_slot (stream, & ri))) { return ret; }
3303 /* This enters a potential copy instruction into the iopt buffer. The
3304 * position argument is relative to the target window.. */
3306 xd3_found_match (xd3_stream *stream, usize_t pos,
3307 usize_t size, xoff_t addr, int is_source)
3312 if ((ret = xd3_iopt_get_slot (stream, & ri))) { return ret; }
3315 ri->xtra = is_source;
3324 xd3_emit_hdr (xd3_stream *stream)
3327 int use_secondary = stream->sec_type != NULL;
3328 int use_adler32 = stream->flags & (XD3_ADLER32 | XD3_ADLER32_RECODE);
3329 int vcd_source = xd3_encoder_used_source (stream);
3330 usize_t win_ind = 0;
3331 usize_t del_ind = 0;
3338 if (stream->current_window == 0)
3340 usize_t hdr_ind = 0;
3341 int use_appheader = stream->enc_appheader != NULL;
3342 int use_gencodetbl = GENERIC_ENCODE_TABLES &&
3343 (stream->code_table_desc != & __rfc3284_code_table_desc);
3345 if (use_secondary) { hdr_ind |= VCD_SECONDARY; }
3346 if (use_gencodetbl) { hdr_ind |= VCD_CODETABLE; }
3347 if (use_appheader) { hdr_ind |= VCD_APPHEADER; }
3349 if ((ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3350 VCDIFF_MAGIC1)) != 0 ||
3351 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3352 VCDIFF_MAGIC2)) != 0 ||
3353 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3354 VCDIFF_MAGIC3)) != 0 ||
3355 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3356 VCDIFF_VERSION)) != 0 ||
3357 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), hdr_ind)) != 0)
3362 /* Secondary compressor ID */
3364 if (use_secondary &&
3365 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3366 stream->sec_type->id)))
3372 /* Compressed code table */
3375 usize_t code_table_size;
3376 const uint8_t *code_table_data;
3378 if ((ret = stream->comp_table_func (stream, & code_table_data,
3379 & code_table_size)))
3384 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream),
3385 code_table_size + 2)) ||
3386 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3387 stream->code_table_desc->near_modes)) ||
3388 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3389 stream->code_table_desc->same_modes)) ||
3390 (ret = xd3_emit_bytes (stream, & HDR_TAIL (stream),
3391 code_table_data, code_table_size)))
3397 /* Application header */
3400 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream),
3401 stream->enc_appheadsz)) ||
3402 (ret = xd3_emit_bytes (stream, & HDR_TAIL (stream),
3403 stream->enc_appheader,
3404 stream->enc_appheadsz)))
3411 /* try to compress this window */
3419 # define ENCODE_SECONDARY_SECTION(UPPER,LOWER) \
3420 ((stream->flags & XD3_SEC_NO ## UPPER) == 0 && \
3421 (ret = xd3_encode_secondary (stream, \
3422 & UPPER ## _HEAD (stream), \
3423 & UPPER ## _TAIL (stream), \
3424 & xd3_sec_ ## LOWER (stream), \
3425 & stream->sec_ ## LOWER, \
3428 if (ENCODE_SECONDARY_SECTION (DATA, data) ||
3429 ENCODE_SECONDARY_SECTION (INST, inst) ||
3430 ENCODE_SECONDARY_SECTION (ADDR, addr))
3435 del_ind |= (data_sec ? VCD_DATACOMP : 0);
3436 del_ind |= (inst_sec ? VCD_INSTCOMP : 0);
3437 del_ind |= (addr_sec ? VCD_ADDRCOMP : 0);
3441 /* if (vcd_target) { win_ind |= VCD_TARGET; } */
3442 if (vcd_source) { win_ind |= VCD_SOURCE; }
3443 if (use_adler32) { win_ind |= VCD_ADLER32; }
3445 /* window indicator */
3446 if ((ret = xd3_emit_byte (stream, & HDR_TAIL (stream), win_ind)))
3454 /* or (vcd_target) { ... } */
3455 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream),
3456 stream->src->srclen)) ||
3457 (ret = xd3_emit_offset (stream, & HDR_TAIL (stream),
3458 stream->src->srcbase))) { return ret; }
3461 tgt_len = stream->avail_in;
3462 data_len = xd3_sizeof_output (DATA_HEAD (stream));
3463 inst_len = xd3_sizeof_output (INST_HEAD (stream));
3464 addr_len = xd3_sizeof_output (ADDR_HEAD (stream));
3466 /* The enc_len field is a redundency for future extensions.*/
3467 enc_len = (1 + (xd3_sizeof_size (tgt_len) +
3468 xd3_sizeof_size (data_len) +
3469 xd3_sizeof_size (inst_len) +
3470 xd3_sizeof_size (addr_len)) +
3474 (use_adler32 ? 4 : 0));
3476 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream), enc_len)) ||
3477 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), tgt_len)) ||
3478 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), del_ind)) ||
3479 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), data_len)) ||
3480 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), inst_len)) ||
3481 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), addr_len)))
3491 if (stream->flags & XD3_ADLER32)
3493 a32 = adler32 (1L, stream->next_in, stream->avail_in);
3497 a32 = stream->recode_adler32;
3501 send[0] = (uint8_t) (a32 >> 24);
3502 send[1] = (uint8_t) (a32 >> 16);
3503 send[2] = (uint8_t) (a32 >> 8);
3504 send[3] = (uint8_t) (a32 & 0x000000FFU);
3506 if ((ret = xd3_emit_bytes (stream, & HDR_TAIL (stream), send, 4)))
3515 /****************************************************************
3517 ****************************************************************/
3520 xd3_encode_buffer_leftover (xd3_stream *stream)
3525 /* Allocate the buffer. */
3526 if (stream->buf_in == NULL &&
3527 (stream->buf_in = (uint8_t*) xd3_alloc (stream, stream->winsize, 1)) == NULL)
3532 IF_DEBUG2 (DP(RINT "[leftover] flush?=%s\n", (stream->flags & XD3_FLUSH) ? "yes" : "no"));
3534 /* Take leftover input first. */
3535 if (stream->buf_leftover != NULL)
3537 XD3_ASSERT (stream->buf_avail == 0);
3538 XD3_ASSERT (stream->buf_leftavail < stream->winsize);
3540 IF_DEBUG2 (DP(RINT "[leftover] previous %u avail %u\n", stream->buf_leftavail, stream->avail_in));
3542 memcpy (stream->buf_in, stream->buf_leftover, stream->buf_leftavail);
3544 stream->buf_leftover = NULL;
3545 stream->buf_avail = stream->buf_leftavail;
3548 /* Copy into the buffer. */
3549 room = stream->winsize - stream->buf_avail;
3550 take = min (room, stream->avail_in);
3552 memcpy (stream->buf_in + stream->buf_avail, stream->next_in, take);
3554 stream->buf_avail += take;
3556 if (take < stream->avail_in)
3558 /* Buffer is full */
3559 stream->buf_leftover = stream->next_in + take;
3560 stream->buf_leftavail = stream->avail_in - take;
3562 else if ((stream->buf_avail < stream->winsize) && !(stream->flags & XD3_FLUSH))
3564 /* Buffer has space */
3565 IF_DEBUG2 (DP(RINT "[leftover] emptied %u\n", take));
3569 /* Use the buffer: */
3570 IF_DEBUG2 (DP(RINT "[leftover] take %u remaining %u\n", take, stream->buf_leftavail));
3571 stream->next_in = stream->buf_in;
3572 stream->avail_in = stream->buf_avail;
3573 stream->buf_avail = 0;
3578 /* Allocates one block of xd3_rlist elements */
3580 xd3_alloc_iopt (xd3_stream *stream, usize_t elts)
3583 xd3_iopt_buflist* last =
3584 (xd3_iopt_buflist*) xd3_alloc (stream, sizeof (xd3_iopt_buflist), 1);
3587 (last->buffer = (xd3_rinst*) xd3_alloc (stream, sizeof (xd3_rinst), elts)) == NULL)
3592 last->next = stream->iopt_alloc;
3593 stream->iopt_alloc = last;
3595 for (i = 0; i < elts; i += 1)
3597 xd3_rlist_push_back (& stream->iopt_free, & last->buffer[i]);
3603 /* This function allocates all memory initially used by the encoder. */
3605 xd3_encode_init (xd3_stream *stream, int full_init)
3611 int large_comp = (stream->src != NULL);
3612 int small_comp = ! (stream->flags & XD3_NOCOMPRESS);
3614 /* Memory allocations for checksum tables are delayed until
3615 * xd3_string_match_init in the first call to string_match--that way
3616 * identical or short inputs require no table allocation. */
3619 usize_t hash_values = (stream->srcwin_maxsz /
3620 stream->smatcher.large_step);
3622 xd3_size_hashtable (stream,
3624 & stream->large_hash);
3629 /* TODO: This is under devel: used to have min(sprevsz) here, which sort
3630 * of makes sense, but observed fast performance w/ larger tables, which
3631 * also sort of makes sense. @@@ */
3632 usize_t hash_values = stream->winsize;
3634 xd3_size_hashtable (stream,
3636 & stream->small_hash);
3641 for (i = 0; i < ENC_SECTS; i += 1)
3643 if ((stream->enc_heads[i] =
3644 stream->enc_tails[i] =
3645 xd3_alloc_output (stream, NULL)) == NULL)
3652 xd3_rlist_init (& stream->iopt_used);
3653 xd3_rlist_init (& stream->iopt_free);
3655 if (xd3_alloc_iopt (stream, stream->iopt_size) != 0) { goto fail; }
3657 XD3_ASSERT (xd3_rlist_length (& stream->iopt_free) == stream->iopt_size);
3658 XD3_ASSERT (xd3_rlist_length (& stream->iopt_used) == 0);
3660 /* address cache, code table */
3661 stream->acache.s_near = stream->code_table_desc->near_modes;
3662 stream->acache.s_same = stream->code_table_desc->same_modes;
3663 stream->code_table = stream->code_table_func ();
3665 return xd3_alloc_cache (stream);
3673 xd3_encode_init_full (xd3_stream *stream)
3675 return xd3_encode_init (stream, 1);
3679 xd3_encode_init_partial (xd3_stream *stream)
3681 return xd3_encode_init (stream, 0);
3684 /* Called after the ENC_POSTOUT state, this puts the output buffers
3685 * back into separate lists and re-initializes some variables. (The
3686 * output lists were spliced together during the ENC_FLUSH state.) */
3688 xd3_encode_reset (xd3_stream *stream)
3693 stream->avail_in = 0;
3694 stream->small_reset = 1;
3695 stream->i_slots_used = 0;
3697 if (stream->src != NULL)
3699 stream->src->srcbase = 0;
3700 stream->src->srclen = 0;
3701 stream->srcwin_decided = 0;
3702 stream->srcwin_decided_early = 0;
3703 stream->match_minaddr = 0;
3704 stream->match_maxaddr = 0;
3708 /* Reset output chains. */
3709 olist = stream->enc_heads[0];
3711 for (i = 0; i < ENC_SECTS; i += 1)
3713 XD3_ASSERT (olist != NULL);
3715 stream->enc_heads[i] = olist;
3716 stream->enc_tails[i] = olist;
3717 olist = olist->next_page;
3719 stream->enc_heads[i]->next = 0;
3720 stream->enc_heads[i]->next_page = NULL;
3722 stream->enc_tails[i]->next_page = NULL;
3723 stream->enc_tails[i] = stream->enc_heads[i];
3726 xd3_freelist_output (stream, olist);
3729 /* The main encoding routine. */
3731 xd3_encode_input (xd3_stream *stream)
3735 if (stream->dec_state != 0)
3737 stream->msg = "encoder/decoder transition";
3738 return XD3_INTERNAL;
3741 switch (stream->enc_state)
3744 /* Only reached on first time through: memory setup. */
3745 if ((ret = xd3_encode_init_full (stream))) { return ret; }
3747 stream->enc_state = ENC_INPUT;
3751 /* If there is no input yet, just return. This checks for
3752 * next_in == NULL, not avail_in == 0 since zero bytes is a
3753 * valid input. There is an assertion in xd3_avail_input() that
3754 * next_in != NULL for this reason. By returning right away we
3755 * avoid creating an input buffer before the caller has supplied
3756 * its first data. It is possible for xd3_avail_input to be
3757 * called both before and after the first call to
3758 * xd3_encode_input(). */
3759 if (stream->next_in == NULL)
3765 /* See if we should buffer the input: either if there is already
3766 * a leftover buffer, or if the input is short of winsize
3767 * without flush. The label at this point is reached by a goto
3768 * below, when there is leftover input after postout. */
3769 if ((stream->buf_leftover != NULL) ||
3770 (stream->buf_avail != 0) ||
3771 (stream->avail_in < stream->winsize && ! (stream->flags & XD3_FLUSH)))
3773 if ((ret = xd3_encode_buffer_leftover (stream))) { return ret; }
3776 /* Initalize the address cache before each window. */
3777 xd3_init_cache (& stream->acache);
3779 stream->input_position = 0;
3780 stream->min_match = MIN_MATCH;
3781 stream->unencoded_offset = 0;
3783 stream->enc_state = ENC_SEARCH;
3785 IF_DEBUG2 (DP(RINT "[WINSTART:%"Q"u] input bytes %u offset %"Q"u\n",
3786 stream->current_window, stream->avail_in,
3788 return XD3_WINSTART;
3791 IF_DEBUG2 (DP(RINT "[SEARCH] match_state %d avail_in %u %s\n",
3792 stream->match_state, stream->avail_in,
3793 stream->src ? "source" : "no source"));
3795 /* Reentrant matching. */
3796 if (stream->src != NULL)
3798 switch (stream->match_state)
3801 /* Try matching forward at the start of the target.
3802 * This is entered the first time through, to check for
3803 * a perfect match, and whenever there is a source match
3804 * that extends to the end of the previous window. The
3805 * match_srcpos field is initially zero and later set
3806 * during xd3_source_extend_match. */
3808 if (stream->avail_in > 0)
3810 /* This call can't fail because the source window is
3812 ret = xd3_source_match_setup (stream, stream->match_srcpos);
3813 XD3_ASSERT (ret == 0);
3814 stream->match_state = MATCH_FORWARD;
3818 stream->match_state = MATCH_SEARCHING;
3819 stream->match_fwd = 0;
3821 XD3_ASSERT (stream->match_fwd == 0);
3824 case MATCH_BACKWARD:
3825 if (stream->avail_in != 0)
3827 if ((ret = xd3_source_extend_match (stream)) != 0)
3832 /* The search has to make forward progress here
3833 * or else it can get stuck in a match-backward
3834 * (getsrcblk) then match-forward (getsrcblk),
3835 * find insufficient match length, then repeat
3836 * exactly the same search.
3838 stream->input_position += stream->match_fwd;
3841 case MATCH_SEARCHING:
3842 /* Continue string matching. (It's possible that the
3843 * initial match continued through the entire input, in
3844 * which case we're still in MATCH_FORWARD and should
3845 * remain so for the next input window.) */
3850 /* String matching... */
3851 if (stream->avail_in != 0 &&
3852 (ret = stream->smatcher.string_match (stream)))
3857 stream->enc_state = ENC_INSTR;
3860 /* Note: Jump here to encode VCDIFF deltas w/o using this
3861 * string-matching code. Merging code code enters here. */
3863 /* Flush the instrution buffer, then possibly add one more
3864 * instruction, then emit the header. */
3865 if ((ret = xd3_iopt_flush_instructions (stream, 1)) ||
3866 (ret = xd3_iopt_add_finalize (stream)))
3871 stream->enc_state = ENC_FLUSH;
3874 /* Note: main_recode_func() bypasses string-matching by setting
3876 if ((ret = xd3_emit_hdr (stream)))
3882 stream->enc_current = HDR_HEAD (stream);
3884 /* Chain all the outputs together. After doing this, it looks
3885 * as if there is only one section. The other enc_heads are set
3886 * to NULL to avoid freeing them more than once. */
3887 for (i = 1; i < ENC_SECTS; i += 1)
3889 stream->enc_tails[i-1]->next_page = stream->enc_heads[i];
3890 stream->enc_heads[i] = NULL;
3895 stream->enc_state = ENC_POSTOUT;
3896 stream->next_out = stream->enc_current->base;
3897 stream->avail_out = stream->enc_current->next;
3898 stream->total_out += (xoff_t) stream->avail_out;
3900 /* If there is any output in this buffer, return it, otherwise
3901 * fall through to handle the next buffer or finish the window
3902 * after all buffers have been output. */
3903 if (stream->avail_out > 0)
3905 /* This is the only place xd3_encode returns XD3_OUTPUT */
3911 if (stream->avail_out != 0)
3913 stream->msg = "missed call to consume output";
3914 return XD3_INTERNAL;
3917 /* Continue outputting one buffer at a time, until the next is NULL. */
3918 if ((stream->enc_current = stream->enc_current->next_page) != NULL)
3923 stream->total_in += (xoff_t) stream->avail_in;
3924 stream->enc_state = ENC_POSTWIN;
3926 IF_DEBUG2 (DP(RINT "[WINFINISH:%"Q"u] in=%"Q"u\n",
3927 stream->current_window,
3929 return XD3_WINFINISH;
3933 xd3_encode_reset (stream);
3935 stream->current_window += 1;
3936 stream->enc_state = ENC_INPUT;
3938 /* If there is leftover input to flush, repeat. */
3939 if (stream->buf_leftover != NULL)
3944 /* Ready for more input. */
3948 stream->msg = "invalid state";
3949 return XD3_INTERNAL;
3952 #endif /* XD3_ENCODER */
3954 /*****************************************************************
3955 Client convenience functions
3956 ******************************************************************/
3959 xd3_process_stream (int is_encode,
3961 int (*func) (xd3_stream *),
3963 const uint8_t *input,
3966 usize_t *output_size,
3967 usize_t output_size_max)
3970 usize_t n = min(stream->winsize, input_size);
3974 stream->flags |= XD3_FLUSH;
3976 xd3_avail_input (stream, input + ipos, n);
3982 switch((ret = func (stream)))
3984 case XD3_OUTPUT: { /* memcpy below */ break; }
3986 n = min(stream->winsize, input_size - ipos);
3990 xd3_avail_input (stream, input + ipos, n);
3994 case XD3_GOTHEADER: { /* ignore */ continue; }
3995 case XD3_WINSTART: { /* ignore */ continue; }
3996 case XD3_WINFINISH: { /* ignore */ continue; }
3999 stream->msg = "stream requires source input";
4000 return XD3_INTERNAL;
4004 /* xd3_encode_input/xd3_decode_input never return 0 */
4005 stream->msg = "invalid return: 0";
4006 return XD3_INTERNAL;
4012 if (*output_size + stream->avail_out > output_size_max)
4014 stream->msg = "insufficient output space";
4018 memcpy (output + *output_size, stream->next_out, stream->avail_out);
4020 *output_size += stream->avail_out;
4022 xd3_consume_output (stream);
4025 return (close_stream == 0) ? 0 : xd3_close_stream (stream);
4029 xd3_process_memory (int is_encode,
4030 int (*func) (xd3_stream *),
4032 const uint8_t *input,
4034 const uint8_t *source,
4035 usize_t source_size,
4037 usize_t *output_size,
4038 usize_t output_size_max,
4045 memset (& stream, 0, sizeof (stream));
4046 memset (& config, 0, sizeof (config));
4048 if (input == NULL || output == NULL) {
4049 stream.msg = "invalid input/output buffer";
4054 config.flags = flags;
4058 config.srcwin_maxsz = source_size;
4059 config.winsize = min(input_size, (usize_t) XD3_DEFAULT_WINSIZE);
4060 config.iopt_size = min(input_size / 32, XD3_DEFAULT_IOPT_SIZE);
4061 config.iopt_size = max(config.iopt_size, 128U);
4062 config.sprevsz = xd3_pow2_roundup (config.winsize);
4065 if ((ret = xd3_config_stream (&stream, &config)) != 0)
4072 memset (& src, 0, sizeof (src));
4074 src.blksize = source_size;
4075 src.onblk = source_size;
4076 src.curblk = source;
4079 if ((ret = xd3_set_source_and_size (&stream, &src, source_size)) != 0)
4085 if ((ret = xd3_process_stream (is_encode,
4091 output_size_max)) != 0)
4099 IF_DEBUG2 (DP(RINT "process_memory: %d: %s\n", ret, stream.msg));
4101 xd3_free_stream(&stream);
4106 xd3_decode_stream (xd3_stream *stream,
4107 const uint8_t *input,
4110 usize_t *output_size,
4111 usize_t output_size_max)
4113 return xd3_process_stream (0, stream, & xd3_decode_input, 1,
4115 output, output_size, output_size_max);
4119 xd3_decode_memory (const uint8_t *input,
4121 const uint8_t *source,
4122 usize_t source_size,
4124 usize_t *output_size,
4125 usize_t output_size_max,
4127 return xd3_process_memory (0, & xd3_decode_input, 1,
4129 source, source_size,
4130 output, output_size, output_size_max,
4137 xd3_encode_stream (xd3_stream *stream,
4138 const uint8_t *input,
4141 usize_t *output_size,
4142 usize_t output_size_max)
4144 return xd3_process_stream (1, stream, & xd3_encode_input, 1,
4146 output, output_size, output_size_max);
4150 xd3_encode_memory (const uint8_t *input,
4152 const uint8_t *source,
4153 usize_t source_size,
4155 usize_t *output_size,
4156 usize_t output_size_max,
4158 return xd3_process_memory (1, & xd3_encode_input, 1,
4160 source, source_size,
4161 output, output_size, output_size_max,
4167 /*************************************************************
4168 String matching helpers
4169 *************************************************************/
4172 /* Do the initial xd3_string_match() checksum table setup.
4173 * Allocations are delayed until first use to avoid allocation
4174 * sometimes (e.g., perfect matches, zero-length inputs). */
4176 xd3_string_match_init (xd3_stream *stream)
4178 const int DO_SMALL = ! (stream->flags & XD3_NOCOMPRESS);
4179 const int DO_LARGE = (stream->src != NULL);
4181 if (DO_LARGE && stream->large_table == NULL)
4183 if ((stream->large_table =
4184 (usize_t*) xd3_alloc0 (stream, stream->large_hash.size, sizeof (usize_t))) == NULL)
4192 /* Subsequent calls can return immediately after checking reset. */
4193 if (stream->small_table != NULL)
4195 /* The target hash table is reinitialized once per window. */
4196 /* TODO: This would not have to be reinitialized if absolute
4197 * offsets were being stored. */
4198 if (stream->small_reset)
4200 stream->small_reset = 0;
4201 memset (stream->small_table, 0,
4202 sizeof (usize_t) * stream->small_hash.size);
4208 if ((stream->small_table =
4209 (usize_t*) xd3_alloc0 (stream,
4210 stream->small_hash.size,
4211 sizeof (usize_t))) == NULL)
4216 /* If there is a previous table needed. */
4217 if (stream->smatcher.small_lchain > 1 ||
4218 stream->smatcher.small_chain > 1)
4220 if ((stream->small_prev =
4221 (xd3_slist*) xd3_alloc (stream,
4223 sizeof (xd3_slist))) == NULL)
4233 #if XD3_USE_LARGEFILE64
4234 /* This function handles the 32/64bit ambiguity -- file positions are 64bit
4235 * but the hash table for source-offsets is 32bit. */
4237 xd3_source_cksum_offset(xd3_stream *stream, usize_t low)
4239 xoff_t scp = stream->srcwin_cksum_pos;
4240 xoff_t s0 = scp >> 32;
4242 usize_t sr = (usize_t) scp;
4248 /* This should not be >= because srcwin_cksum_pos is the next
4249 * position to index. */
4251 return (--s0 << 32) | low;
4254 return (s0 << 32) | low;
4258 xd3_source_cksum_offset(xd3_stream *stream, usize_t low)
4260 return (xoff_t) low;
4264 /* This function sets up the stream->src fields srcbase, srclen. The
4265 * call is delayed until these values are needed to encode a copy
4266 * address. At this point the decision has to be made. */
4268 xd3_srcwin_setup (xd3_stream *stream)
4270 xd3_source *src = stream->src;
4273 /* Check the undecided state. */
4274 XD3_ASSERT (src->srclen == 0 && src->srcbase == 0);
4276 /* Avoid repeating this call. */
4277 stream->srcwin_decided = 1;
4279 /* If the stream is flushing, then the iopt buffer was able to
4280 * contain the complete encoding. If no copies were issued no
4281 * source window is actually needed. This prevents the VCDIFF
4282 * header from including source base/len. xd3_emit_hdr checks for
4284 if (stream->enc_state == ENC_INSTR && stream->match_maxaddr == 0)
4289 /* Check for overflow, srclen is usize_t - this can't happen unless
4290 * XD3_DEFAULT_SRCBACK and related parameters are extreme - should
4291 * use smaller windows. */
4292 length = stream->match_maxaddr - stream->match_minaddr;
4294 x = (xoff_t) USIZE_T_MAX;
4297 stream->msg = "source window length overflow (not 64bit)";
4298 return XD3_INTERNAL;
4301 /* If ENC_INSTR, then we know the exact source window to use because
4302 * no more copies can be issued. */
4303 if (stream->enc_state == ENC_INSTR)
4305 src->srcbase = stream->match_minaddr;
4306 src->srclen = (usize_t) length;
4307 XD3_ASSERT (src->srclen);
4311 /* Otherwise, we have to make a guess. More copies may still be
4312 * issued, but we have to decide the source window base and length
4314 src->srcbase = stream->match_minaddr;
4315 src->srclen = max ((usize_t) length,
4316 stream->avail_in + (stream->avail_in >> 2));
4318 /* OPT: If we know the source size, it might be possible to reduce
4320 XD3_ASSERT (src->srclen);
4322 /* Set the taroff. This convenience variable is used even when
4323 stream->src == NULL. */
4324 stream->taroff = src->srclen;
4328 /* Sets the bounding region for a newly discovered source match, prior
4329 * to calling xd3_source_extend_match(). This sets the match_maxfwd,
4330 * match_maxback variables. Note: srcpos is an absolute position
4331 * (xoff_t) but the match_maxfwd, match_maxback variables are usize_t.
4332 * Returns 0 if the setup succeeds, or 1 if the source position lies
4333 * outside an already-decided srcbase/srclen window. */
4335 xd3_source_match_setup (xd3_stream *stream, xoff_t srcpos)
4337 xd3_source *src = stream->src;
4338 usize_t greedy_or_not;
4339 xoff_t frontier_pos;
4341 stream->match_maxback = 0;
4342 stream->match_maxfwd = 0;
4343 stream->match_back = 0;
4344 stream->match_fwd = 0;
4346 /* This avoids a non-blocking endless loop caused by scanning
4347 * backwards across a block boundary, only to find not enough
4348 * matching bytes to beat the current min_match due to a better lazy
4349 * target match: the re-entry to xd3_string_match() repeats the same
4350 * long match because the input position hasn't changed. TODO: if
4351 * ever duplicates are added to the source hash table, this logic
4352 * won't suffice to avoid loops. See testing/regtest.cc's
4353 * TestNonBlockingProgress test! */
4354 if (srcpos != 0 && srcpos == stream->match_last_srcpos)
4356 IF_DEBUG2(DP(RINT "[match_setup] looping failure\n"));
4360 /* Implement srcwin_maxsz, which prevents the encoder from seeking
4361 * back further than the LRU cache maintaining FIFO discipline, (to
4362 * avoid seeking). */
4364 stream->src->frontier_blkno * stream->src->blksize;
4365 IF_DEBUG1(DP(RINT "[match_setup] frontier_pos %"Q"u, srcpos %"Q"u, "
4366 "srcwin_maxsz %u\n",
4367 frontier_pos, srcpos, stream->srcwin_maxsz));
4368 if (srcpos < frontier_pos &&
4369 frontier_pos - srcpos > stream->srcwin_maxsz) {
4370 IF_DEBUG1(DP(RINT "[match_setup] rejected due to srcwin_maxsz "
4371 "distance eof=%"Q"u srcpos=%"Q"u maxsz=%u\n",
4372 xd3_source_eof (stream->src),
4373 srcpos, stream->srcwin_maxsz));
4377 /* Going backwards, the 1.5-pass algorithm allows some
4378 * already-matched input may be covered by a longer source match.
4379 * The greedy algorithm does not allow this. */
4380 if (stream->flags & XD3_BEGREEDY)
4382 /* The greedy algorithm allows backward matching to the last
4383 matched position. */
4384 greedy_or_not = xd3_iopt_last_matched (stream);
4388 /* The 1.5-pass algorithm allows backward matching to go back as
4389 * far as the unencoded offset, which is updated as instructions
4390 * pass out of the iopt buffer. If this (default) is chosen, it
4391 * means xd3_iopt_erase may be called to eliminate instructions
4392 * when a covering source match is found. */
4393 greedy_or_not = stream->unencoded_offset;
4396 /* Backward target match limit. */
4397 XD3_ASSERT (stream->input_position >= greedy_or_not);
4398 stream->match_maxback = stream->input_position - greedy_or_not;
4400 /* Forward target match limit. */
4401 XD3_ASSERT (stream->avail_in > stream->input_position);
4402 stream->match_maxfwd = stream->avail_in - stream->input_position;
4404 /* Now we take the source position into account. It depends whether
4405 * the srclen/srcbase have been decided yet. */
4406 if (stream->srcwin_decided == 0)
4408 /* Unrestricted case: the match can cover the entire source,
4409 * 0--src->size. We compare the usize_t
4410 * match_maxfwd/match_maxback against the xoff_t
4411 * src->size/srcpos values and take the min. */
4412 if (srcpos < (xoff_t) stream->match_maxback)
4414 stream->match_maxback = (usize_t) srcpos;
4417 if (stream->src->eof_known)
4419 xoff_t srcavail = xd3_source_eof (stream->src) - srcpos;
4421 if (srcavail < (xoff_t) stream->match_maxfwd)
4423 stream->match_maxfwd = (usize_t) srcavail;
4428 "[match_setup] srcpos %"Q"u (tgtpos %"Q"u) "
4429 "unrestricted maxback %u maxfwd %u\n",
4431 stream->total_in + stream->input_position,
4432 stream->match_maxback,
4433 stream->match_maxfwd));
4437 /* Decided some source window. */
4438 XD3_ASSERT (src->srclen > 0);
4440 /* Restricted case: fail if the srcpos lies outside the source window */
4441 if ((srcpos < src->srcbase) ||
4442 (srcpos > (src->srcbase + (xoff_t) src->srclen)))
4444 IF_DEBUG1(DP(RINT "[match_setup] restricted source window failure\n"));
4451 srcavail = (usize_t) (srcpos - src->srcbase);
4452 if (srcavail < stream->match_maxback)
4454 stream->match_maxback = srcavail;
4457 srcavail = (usize_t) (src->srcbase + (xoff_t) src->srclen - srcpos);
4458 if (srcavail < stream->match_maxfwd)
4460 stream->match_maxfwd = srcavail;
4464 "[match_setup] srcpos %"Q"u (tgtpos %"Q"u) "
4465 "restricted maxback %u maxfwd %u\n",
4467 stream->total_in + stream->input_position,
4468 stream->match_maxback,
4469 stream->match_maxfwd));
4474 stream->match_state = MATCH_BACKWARD;
4475 stream->match_srcpos = srcpos;
4476 stream->match_last_srcpos = srcpos;
4480 stream->match_state = MATCH_SEARCHING;
4485 xd3_forward_match(const uint8_t *s1c, const uint8_t *s2c, int n)
4489 int nint = n / sizeof(int);
4494 const int *s1 = (const int*)s1c;
4495 const int *s2 = (const int*)s2c;
4496 int nint_8 = nint - 8;
4498 while (i <= nint_8 &&
4499 s1[i++] == s2[j++] &&
4500 s1[i++] == s2[j++] &&
4501 s1[i++] == s2[j++] &&
4502 s1[i++] == s2[j++] &&
4503 s1[i++] == s2[j++] &&
4504 s1[i++] == s2[j++] &&
4505 s1[i++] == s2[j++] &&
4506 s1[i++] == s2[j++]) { }
4508 i = (i - 1) * sizeof(int);
4512 while (i < n && s1c[i] == s2c[i])
4519 /* This function expands the source match backward and forward. It is
4520 * reentrant, since xd3_getblk may return XD3_GETSRCBLK, so most
4521 * variables are kept in xd3_stream. There are two callers of this
4522 * function, the string_matching routine when a checksum match is
4523 * discovered, and xd3_encode_input whenever a continuing (or initial)
4524 * match is suspected. The two callers do different things with the
4525 * input_position, thus this function leaves that variable untouched.
4526 * If a match is taken the resulting stream->match_fwd is left
4529 xd3_source_extend_match (xd3_stream *stream)
4532 xd3_source *src = stream->src;
4533 xoff_t matchoff; /* matchoff is the current right/left-boundary of
4534 the source match being tested. */
4535 usize_t streamoff; /* streamoff is the current right/left-boundary
4536 of the input match being tested. */
4537 xoff_t tryblk; /* tryblk, tryoff are the block, offset position
4540 usize_t tryrem; /* tryrem is the number of matchable bytes */
4543 IF_DEBUG2(DP(RINT "[extend match] srcpos %"Q"u\n",
4544 stream->match_srcpos));
4546 XD3_ASSERT (src != NULL);
4548 /* Does it make sense to compute backward match AFTER forward match? */
4549 if (stream->match_state == MATCH_BACKWARD)
4551 /* Note: this code is practically duplicated below, substituting
4552 * match_fwd/match_back and direction. TODO: Consolidate? */
4553 matchoff = stream->match_srcpos - stream->match_back;
4554 streamoff = stream->input_position - stream->match_back;
4555 xd3_blksize_div (matchoff, src, &tryblk, &tryoff);
4557 /* this loops backward over source blocks */
4558 while (stream->match_back < stream->match_maxback)
4560 /* see if we're backing across a source block boundary */
4563 tryoff = src->blksize;
4567 if ((ret = xd3_getblk (stream, tryblk)))
4569 /* if search went too far back, continue forward. */
4570 if (ret == XD3_TOOFARBACK)
4575 /* could be a XD3_GETSRCBLK failure. */
4579 tryrem = min (tryoff, stream->match_maxback - stream->match_back);
4581 IF_DEBUG2(DP(RINT "[maxback] maxback %u trysrc %"Q"u/%u tgt %u tryrem %u\n",
4582 stream->match_maxback, tryblk, tryoff, streamoff, tryrem));
4584 /* TODO: This code can be optimized similar to xd3_match_forward() */
4585 for (; tryrem != 0; tryrem -= 1, stream->match_back += 1)
4587 if (src->curblk[tryoff-1] != stream->next_in[streamoff-1])
4598 stream->match_state = MATCH_FORWARD;
4601 XD3_ASSERT (stream->match_state == MATCH_FORWARD);
4603 matchoff = stream->match_srcpos + stream->match_fwd;
4604 streamoff = stream->input_position + stream->match_fwd;
4605 xd3_blksize_div (matchoff, src, & tryblk, & tryoff);
4607 /* Note: practically the same code as backwards case above: same comments */
4608 while (stream->match_fwd < stream->match_maxfwd)
4610 if (tryoff == src->blksize)
4616 if ((ret = xd3_getblk (stream, tryblk)))
4618 /* if search went too far back, continue forward. */
4619 if (ret == XD3_TOOFARBACK)
4624 /* could be a XD3_GETSRCBLK failure. */
4628 tryrem = min(stream->match_maxfwd - stream->match_fwd,
4629 src->onblk - tryoff);
4633 /* Generally, this means we have a power-of-two size source
4634 * and we just found the end-of-file, in this case it's an
4636 XD3_ASSERT (src->onblk < src->blksize);
4640 matched = xd3_forward_match(src->curblk + tryoff,
4641 stream->next_in + streamoff,
4644 streamoff += matched;
4645 stream->match_fwd += matched;
4647 if (tryrem != matched)
4653 stream->match_state = MATCH_SEARCHING;
4655 /* If the match ends short of the last instruction end, we probably
4656 * don't want it. There is the possibility that a copy ends short
4657 * of the last copy but also goes further back, in which case we
4658 * might want it. This code does not implement such: if so we would
4659 * need more complicated xd3_iopt_erase logic. */
4660 if (stream->match_fwd < stream->min_match)
4662 stream->match_fwd = 0;
4666 usize_t total = stream->match_fwd + stream->match_back;
4668 /* Correct the variables to remove match_back from the equation. */
4669 usize_t target_position = stream->input_position - stream->match_back;
4670 usize_t match_length = stream->match_back + stream->match_fwd;
4671 xoff_t match_position = stream->match_srcpos - stream->match_back;
4672 xoff_t match_end = stream->match_srcpos + stream->match_fwd;
4674 /* At this point we may have to erase any iopt-buffer
4675 * instructions that are fully covered by a backward-extending
4677 if (stream->match_back > 0)
4679 xd3_iopt_erase (stream, target_position, total);
4682 stream->match_back = 0;
4684 /* Update ranges. The first source match occurs with both
4686 if (stream->match_maxaddr == 0 ||
4687 match_position < stream->match_minaddr)
4689 stream->match_minaddr = match_position;
4692 if (match_end > stream->match_maxaddr)
4694 /* Note: per-window */
4695 stream->match_maxaddr = match_end;
4698 if (match_end > stream->maxsrcaddr)
4700 /* Note: across windows */
4701 stream->maxsrcaddr = match_end;
4706 DP(RINT "[source match:%d] <inp %"Q"u %"Q"u> <src %"Q"u %"Q"u> (%s) [ %u bytes ]\n",
4708 stream->total_in + target_position,
4709 stream->total_in + target_position + match_length,
4711 match_position + match_length,
4712 (stream->total_in + target_position == match_position) ? "same" : "diff",
4716 if ((ret = xd3_found_match (stream,
4717 /* decoder position */ target_position,
4718 /* length */ match_length,
4719 /* address */ match_position,
4720 /* is_source */ 1)))
4725 /* If the match ends with the available input: */
4726 if (target_position + match_length == stream->avail_in)
4728 /* Setup continuing match for the next window. */
4729 stream->match_state = MATCH_TARGET;
4730 stream->match_srcpos = match_end;
4737 /* Update the small hash. Values in the small_table are offset by
4738 * HASH_CKOFFSET (1) to distinguish empty buckets from real offsets. */
4740 xd3_scksum_insert (xd3_stream *stream,
4745 /* If we are maintaining previous duplicates. */
4746 if (stream->small_prev)
4748 usize_t last_pos = stream->small_table[inx];
4749 xd3_slist *pos_list = & stream->small_prev[pos & stream->sprevmask];
4751 /* Note last_pos is offset by HASH_CKOFFSET. */
4752 pos_list->last_pos = last_pos;
4755 /* Enter the new position into the hash bucket. */
4756 stream->small_table[inx] = pos + HASH_CKOFFSET;
4761 xd3_check_smatch (const uint8_t *ref0, const uint8_t *inp0,
4762 const uint8_t *inp_max, usize_t cmp_len)
4766 for (i = 0; i < cmp_len; i += 1)
4768 XD3_ASSERT (ref0[i] == inp0[i]);
4771 if (inp0 + cmp_len < inp_max)
4773 XD3_ASSERT (inp0[i] != ref0[i]);
4778 #endif /* XD3_DEBUG */
4780 /* When the hash table indicates a possible small string match, it
4781 * calls this routine to find the best match. The first matching
4782 * position is taken from the small_table, HASH_CKOFFSET is subtracted
4783 * to get the actual position. After checking that match, if previous
4784 * linked lists are in use (because stream->smatcher.small_chain > 1),
4785 * previous matches are tested searching for the longest match. If
4786 * (stream->min_match > MIN_MATCH) then a lazy match is in effect.
4789 xd3_smatch (xd3_stream *stream,
4792 usize_t *match_offset)
4795 usize_t match_length = 0;
4796 usize_t chain = (stream->min_match == MIN_MATCH ?
4797 stream->smatcher.small_chain :
4798 stream->smatcher.small_lchain);
4799 const uint8_t *inp_max = stream->next_in + stream->avail_in;
4803 SMALL_HASH_DEBUG1 (stream, stream->next_in + stream->input_position);
4805 XD3_ASSERT (stream->min_match + stream->input_position <= stream->avail_in);
4807 base -= HASH_CKOFFSET;
4811 IF_DEBUG2 (DP(RINT "smatch at base=%u inp=%u cksum=%u\n", base,
4812 stream->input_position, scksum));
4814 /* For small matches, we can always go to the end-of-input because
4815 * the matching position must be less than the input position. */
4816 XD3_ASSERT (base < stream->input_position);
4818 ref = stream->next_in + base;
4819 inp = stream->next_in + stream->input_position;
4821 SMALL_HASH_DEBUG2 (stream, ref);
4823 /* Expand potential match forward. */
4824 while (inp < inp_max && *inp == *ref)
4830 cmp_len = (usize_t)(inp - (stream->next_in + stream->input_position));
4832 /* Verify correctness */
4833 XD3_ASSERT (xd3_check_smatch (stream->next_in + base,
4834 stream->next_in + stream->input_position,
4837 /* Update longest match */
4838 if (cmp_len > match_length)
4840 ( match_length) = cmp_len;
4841 (*match_offset) = base;
4843 /* Stop if we match the entire input or have a long_enough match. */
4844 if (inp == inp_max || cmp_len >= stream->smatcher.long_enough)
4850 /* If we have not reached the chain limit, see if there is another
4851 previous position. */
4852 while (--chain != 0)
4854 /* Calculate the previous offset. */
4855 usize_t prev_pos = stream->small_prev[base & stream->sprevmask].last_pos;
4863 prev_pos -= HASH_CKOFFSET;
4865 if (prev_pos > base)
4872 XD3_ASSERT (stream->input_position > base);
4873 diff_pos = stream->input_position - base;
4875 /* Stop searching if we go beyond sprevsz, since those entries
4876 * are for unrelated checksum entries. */
4877 if (diff_pos & ~stream->sprevmask)
4886 /* Crude efficiency test: if the match is very short and very far back, it's
4887 * unlikely to help, but the exact calculation requires knowing the state of
4888 * the address cache and adjacent instructions, which we can't do here.
4889 * Rather than encode a probably inefficient copy here and check it later
4890 * (which complicates the code a lot), do this:
4892 if (match_length == 4 && stream->input_position - (*match_offset) >= 1<<14)
4894 /* It probably takes >2 bytes to encode an address >= 2^14 from here */
4897 if (match_length == 5 && stream->input_position - (*match_offset) >= 1<<21)
4899 /* It probably takes >3 bytes to encode an address >= 2^21 from here */
4903 /* It's unlikely that a window is large enough for the (match_length == 6 &&
4904 * address >= 2^28) check */
4905 return match_length;
4910 xd3_verify_small_state (xd3_stream *stream,
4915 uint32_t cksum = xd3_scksum (&state, inp, stream->smatcher.small_look);
4917 XD3_ASSERT (cksum == x_cksum);
4921 xd3_verify_large_state (xd3_stream *stream,
4925 uint32_t cksum = xd3_lcksum (inp, stream->smatcher.large_look);
4926 XD3_ASSERT (cksum == x_cksum);
4929 xd3_verify_run_state (xd3_stream *stream,
4934 usize_t slook = stream->smatcher.small_look;
4936 usize_t run_l = xd3_comprun (inp, slook, &run_c);
4938 XD3_ASSERT (run_l == 0 || run_c == *x_run_c);
4939 XD3_ASSERT (x_run_l > slook || run_l == x_run_l);
4941 #endif /* XD3_DEBUG */
4943 /* This function computes more source checksums to advance the window.
4944 * Called at every entrance to the string-match loop and each time
4945 * stream->input_position reaches the value returned as
4946 * *next_move_point. NB: this is one of the most expensive functions
4947 * in this code and also the most critical for good compression.
4948 * TODO: optimize the inner loop
4951 xd3_srcwin_move_point (xd3_stream *stream, usize_t *next_move_point)
4953 xoff_t logical_input_cksum_pos;
4956 if (stream->src->eof_known)
4958 source_size = xd3_source_eof (stream->src);
4959 XD3_ASSERT(stream->srcwin_cksum_pos <= source_size);
4961 if (stream->srcwin_cksum_pos == source_size)
4963 *next_move_point = USIZE_T_MAX;
4968 /* Begin by advancing at twice the input rate, up to half the
4969 * maximum window size. */
4970 logical_input_cksum_pos = min((stream->total_in + stream->input_position) * 2,
4971 (stream->total_in + stream->input_position) +
4972 (stream->srcwin_maxsz / 2));
4974 /* If srcwin_cksum_pos is already greater, wait until the difference
4976 if (stream->srcwin_cksum_pos > logical_input_cksum_pos)
4978 *next_move_point = stream->input_position +
4979 (usize_t)(stream->srcwin_cksum_pos - logical_input_cksum_pos);
4983 /* A long match may have extended past srcwin_cksum_pos. Don't
4984 * start checksumming already-matched source data. */
4985 if (stream->maxsrcaddr > stream->srcwin_cksum_pos)
4987 stream->srcwin_cksum_pos = stream->maxsrcaddr;
4990 if (logical_input_cksum_pos < stream->srcwin_cksum_pos)
4992 logical_input_cksum_pos = stream->srcwin_cksum_pos;
4995 /* Advance at least one source block. With the command-line
4996 * defaults this means:
4998 * if (src->size <= srcwin_maxsz), index the entire source at once
4999 * using the position of the first non-match. This is good for
5000 * small inputs, especially when the content may have moved anywhere
5001 * in the file (e.g., tar files).
5003 * if (src->size > srcwin_maxsz), index at least one block (which
5004 * the command-line sets to 1/32 of srcwin_maxsz) ahead of the
5005 * logical position. This is good for different reasons: when a
5006 * long match spanning several source blocks is encountered, this
5007 * avoids computing checksums for those blocks. If the data can
5008 * move anywhere, this is bad.
5010 logical_input_cksum_pos += stream->src->blksize;
5012 while (stream->srcwin_cksum_pos < logical_input_cksum_pos &&
5013 (!stream->src->eof_known ||
5014 stream->srcwin_cksum_pos < xd3_source_eof (stream->src)))
5017 xoff_t blkbaseoffset;
5019 ssize_t oldpos; /* Using ssize_t because of a */
5020 ssize_t blkpos; /* do { blkpos-- }
5021 while (blkpos >= oldpos); */
5023 xd3_blksize_div (stream->srcwin_cksum_pos,
5024 stream->src, &blkno, &blkrem);
5027 if ((ret = xd3_getblk (stream, blkno)))
5029 /* TOOFARBACK should never occur here, since we read forward. */
5030 if (ret == XD3_TOOFARBACK)
5035 "[srcwin_move_point] async getblk return for %"Q"u\n",
5041 "[srcwin_move_point] T=%"Q"u{%"Q"u} S=%"Q"u EOF=%"Q"u %s\n",
5042 stream->total_in + stream->input_position,
5043 logical_input_cksum_pos,
5044 stream->srcwin_cksum_pos,
5045 xd3_source_eof (stream->src),
5046 stream->src->eof_known ? "known" : "unknown"));
5048 blkpos = xd3_bytes_on_srcblk (stream->src, blkno);
5050 if (blkpos < (ssize_t) stream->smatcher.large_look)
5052 stream->srcwin_cksum_pos = (blkno + 1) * stream->src->blksize;
5053 IF_DEBUG1 (DP(RINT "[srcwin_move_point] continue (end-of-block)\n"));
5057 /* This inserts checksums for the entire block, in reverse,
5058 * starting from the end of the block. This logic does not test
5059 * stream->srcwin_cksum_pos because it always advances it to the
5060 * start of the next block.
5062 * oldpos is the srcwin_cksum_pos within this block. blkpos is
5063 * the number of bytes available. Each iteration inspects
5064 * large_look bytes then steps back large_step bytes. The
5065 * if-stmt above ensures at least one large_look of data. */
5066 blkpos -= stream->smatcher.large_look;
5067 blkbaseoffset = stream->src->blksize * blkno;
5071 uint32_t cksum = xd3_lcksum (stream->src->curblk + blkpos,
5072 stream->smatcher.large_look);
5073 usize_t hval = xd3_checksum_hash (& stream->large_hash, cksum);
5075 stream->large_table[hval] =
5076 (usize_t) (blkbaseoffset +
5077 (xoff_t)(blkpos + HASH_CKOFFSET));
5079 IF_DEBUG (stream->large_ckcnt += 1);
5081 blkpos -= stream->smatcher.large_step;
5083 while (blkpos >= oldpos);
5085 stream->srcwin_cksum_pos = (blkno + 1) * stream->src->blksize;
5089 "[srcwin_move_point] exited loop T=%"Q"u{%"Q"u} "
5090 "S=%"Q"u EOF=%"Q"u %s\n",
5091 stream->total_in + stream->input_position,
5092 logical_input_cksum_pos,
5093 stream->srcwin_cksum_pos,
5094 xd3_source_eof (stream->src),
5095 stream->src->eof_known ? "known" : "unknown"));
5097 if (stream->src->eof_known)
5099 source_size = xd3_source_eof (stream->src);
5101 if (stream->srcwin_cksum_pos >= source_size)
5103 /* This invariant is needed for xd3_source_cksum_offset() */
5104 stream->srcwin_cksum_pos = source_size;
5105 *next_move_point = USIZE_T_MAX;
5107 "[srcwin_move_point] finished with source input\n"));
5112 /* How long until this function should be called again. */
5113 XD3_ASSERT(stream->srcwin_cksum_pos >= logical_input_cksum_pos);
5114 *next_move_point = stream->input_position + 1 +
5115 (usize_t)(stream->srcwin_cksum_pos - logical_input_cksum_pos);
5119 #endif /* XD3_ENCODER */
5121 /********************************************************************
5123 *********************************************************************/
5125 #endif /* __XDELTA3_C_INLINE_PASS__ */
5126 #ifdef __XDELTA3_C_TEMPLATE_PASS__
5130 /********************************************************************
5132 *******************************************************************/
5134 /* Template macros */
5135 #define XD3_TEMPLATE(x) XD3_TEMPLATE2(x,TEMPLATE)
5136 #define XD3_TEMPLATE2(x,n) XD3_TEMPLATE3(x,n)
5137 #define XD3_TEMPLATE3(x,n) x ## n
5138 #define XD3_STRINGIFY(x) XD3_STRINGIFY2(x)
5139 #define XD3_STRINGIFY2(x) #x
5141 static int XD3_TEMPLATE(xd3_string_match_) (xd3_stream *stream);
5143 static const xd3_smatcher XD3_TEMPLATE(__smatcher_) =
5145 XD3_STRINGIFY(TEMPLATE),
5146 XD3_TEMPLATE(xd3_string_match_),
5150 LLOOK, LSTEP, SLOOK, SCHAIN, SLCHAIN, MAXLAZY, LONGENOUGH
5155 XD3_TEMPLATE(xd3_string_match_) (xd3_stream *stream)
5157 const int DO_SMALL = ! (stream->flags & XD3_NOCOMPRESS);
5158 const int DO_LARGE = (stream->src != NULL);
5159 const int DO_RUN = (1);
5162 uint32_t scksum = 0;
5163 uint32_t scksum_state = 0;
5164 uint32_t lcksum = 0;
5170 usize_t match_length;
5171 usize_t match_offset = 0;
5172 usize_t next_move_point;
5174 /* If there will be no compression due to settings or short input,
5175 * skip it entirely. */
5176 if (! (DO_SMALL || DO_LARGE || DO_RUN) ||
5177 stream->input_position + SLOOK > stream->avail_in) { goto loopnomore; }
5179 if ((ret = xd3_string_match_init (stream))) { return ret; }
5181 /* The restartloop label is reached when the incremental loop state
5182 * needs to be reset. */
5185 /* If there is not enough input remaining for any kind of match,
5187 if (stream->input_position + SLOOK > stream->avail_in) { goto loopnomore; }
5189 /* Now reset the incremental loop state: */
5191 /* The min_match variable is updated to avoid matching the same lazy
5192 * match over and over again. For example, if you find a (small)
5193 * match of length 9 at one position, you will likely find a match
5194 * of length 8 at the next position. */
5195 if (xd3_iopt_last_matched (stream) > stream->input_position)
5197 stream->min_match = max(MIN_MATCH,
5198 1 + xd3_iopt_last_matched(stream) -
5199 stream->input_position);
5203 stream->min_match = MIN_MATCH;
5206 /* The current input byte. */
5207 inp = stream->next_in + stream->input_position;
5209 /* Small match state. */
5212 scksum = xd3_scksum (&scksum_state, inp, SLOOK);
5218 run_l = xd3_comprun (inp, SLOOK, & run_c);
5221 /* Large match state. We continue the loop even after not enough
5222 * bytes for LLOOK remain, so always check stream->input_position in
5224 if (DO_LARGE && (stream->input_position + LLOOK <= stream->avail_in))
5226 /* Source window: next_move_point is the point that
5227 * stream->input_position must reach before computing more
5228 * source checksum. */
5229 if ((ret = xd3_srcwin_move_point (stream, & next_move_point)))
5234 lcksum = xd3_lcksum (inp, LLOOK);
5237 /* TRYLAZYLEN: True if a certain length match should be followed by
5238 * lazy search. This checks that LEN is shorter than MAXLAZY and
5239 * that there is enough leftover data to consider lazy matching.
5240 * "Enough" is set to 2 since the next match will start at the next
5241 * offset, it must match two extra characters. */
5242 #define TRYLAZYLEN(LEN,POS,MAX) ((MAXLAZY) > 0 && (LEN) < (MAXLAZY) \
5243 && (POS) + (LEN) <= (MAX) - 2)
5245 /* HANDLELAZY: This statement is called each time an instruciton is
5246 * emitted (three cases). If the instruction is large enough, the
5247 * loop is restarted, otherwise lazy matching may ensue. */
5248 #define HANDLELAZY(mlen) \
5249 if (TRYLAZYLEN ((mlen), (stream->input_position), (stream->avail_in))) \
5250 { stream->min_match = (mlen) + LEAST_MATCH_INCR; goto updateone; } \
5252 { stream->input_position += (mlen); goto restartloop; }
5254 /* Now loop over one input byte at a time until a match is found... */
5255 for (;; inp += 1, stream->input_position += 1)
5257 /* Now we try three kinds of string match in order of expense:
5258 * run, large match, small match. */
5260 /* Expand the start of a RUN. The test for (run_l == SLOOK)
5261 * avoids repeating this check when we pass through a run area
5262 * performing lazy matching. The run is only expanded once when
5263 * the min_match is first reached. If lazy matching is
5264 * performed, the run_l variable will remain inconsistent until
5265 * the first non-running input character is reached, at which
5266 * time the run_l may then again grow to SLOOK. */
5267 if (DO_RUN && run_l == SLOOK)
5269 usize_t max_len = stream->avail_in - stream->input_position;
5271 IF_DEBUG (xd3_verify_run_state (stream, inp, run_l, &run_c));
5273 while (run_l < max_len && inp[run_l] == run_c) { run_l += 1; }
5275 /* Output a RUN instruction. */
5276 if (run_l >= stream->min_match && run_l >= MIN_RUN)
5278 if ((ret = xd3_emit_run (stream, stream->input_position,
5279 run_l, &run_c))) { return ret; }
5285 /* If there is enough input remaining. */
5286 if (DO_LARGE && (stream->input_position + LLOOK <= stream->avail_in))
5288 if ((stream->input_position >= next_move_point) &&
5289 (ret = xd3_srcwin_move_point (stream, & next_move_point)))
5294 linx = xd3_checksum_hash (& stream->large_hash, lcksum);
5296 IF_DEBUG (xd3_verify_large_state (stream, inp, lcksum));
5298 if (stream->large_table[linx] != 0)
5300 /* the match_setup will fail if the source window has
5301 * been decided and the match lies outside it.
5302 * OPT: Consider forcing a window at this point to
5303 * permit a new source window. */
5305 xd3_source_cksum_offset(stream,
5306 stream->large_table[linx] -
5308 if (xd3_source_match_setup (stream, adj_offset) == 0)
5310 if ((ret = xd3_source_extend_match (stream)))
5315 /* Update stream position. match_fwd is zero if no
5317 if (stream->match_fwd > 0)
5319 HANDLELAZY (stream->match_fwd);
5325 /* Small matches. */
5328 sinx = xd3_checksum_hash (& stream->small_hash, scksum);
5330 /* Verify incremental state in debugging mode. */
5331 IF_DEBUG (xd3_verify_small_state (stream, inp, scksum));
5333 /* Search for the longest match */
5334 if (stream->small_table[sinx] != 0)
5336 match_length = xd3_smatch (stream,
5337 stream->small_table[sinx],
5346 /* Insert a hash for this string. */
5347 xd3_scksum_insert (stream, sinx, scksum, stream->input_position);
5349 /* Maybe output a COPY instruction */
5350 if (match_length >= stream->min_match)
5354 DP(RINT "[target match:%d] <inp %u %u> <cpy %u %u> "
5355 "(-%d) [ %u bytes ]\n",
5357 stream->input_position,
5358 stream->input_position + match_length,
5360 match_offset + match_length,
5361 stream->input_position - match_offset,
5365 if ((ret = xd3_found_match (stream,
5366 /* decoder position */
5367 stream->input_position,
5368 /* length */ match_length,
5369 /* address */ (xoff_t) match_offset,
5370 /* is_source */ 0)))
5375 /* Copy instruction. */
5376 HANDLELAZY (match_length);
5380 /* The logic above prevents excess work during lazy matching by
5381 * increasing min_match to avoid smaller matches. Each time we
5382 * advance stream->input_position by one, the minimum match
5383 * shortens as well. */
5384 if (stream->min_match > MIN_MATCH)
5386 stream->min_match -= 1;
5391 /* See if there are no more incremental cksums to compute. */
5392 if (stream->input_position + SLOOK == stream->avail_in)
5397 /* Compute next RUN, CKSUM */
5400 NEXTRUN (inp[SLOOK]);
5405 scksum = xd3_small_cksum_update (&scksum_state, inp, SLOOK);
5408 if (DO_LARGE && (stream->input_position + LLOOK < stream->avail_in))
5410 lcksum = xd3_large_cksum_update (lcksum, inp, LLOOK);
5418 #endif /* XD3_ENCODER */
5419 #endif /* __XDELTA3_C_TEMPLATE_PASS__ */