1 <!-- Creator : groff version 1.22.3 -->
2 <!-- CreationDate: Sun Oct 23 20:38:29 2016 -->
3 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
4 "http://www.w3.org/TR/html4/loose.dtd">
7 <meta name="generator" content="groff -Thtml, see www.gnu.org">
8 <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
9 <meta name="Content-Style" content="text/css">
10 <style type="text/css">
11 p { margin-top: 0; margin-bottom: 0; vertical-align: top }
12 pre { margin-top: 0; margin-bottom: 0; vertical-align: top }
13 table { margin-top: 0; margin-bottom: 0; vertical-align: top }
h1 { text-align: center }
</style>
23 <p>TAR(5) BSD File Formats Manual TAR(5)</p>
25 <p style="margin-top: 1em"><b>NAME</b></p>
27 <p style="margin-left:6%;"><b>tar</b> — format of
28 tape archive files</p>
30 <p style="margin-top: 1em"><b>DESCRIPTION</b></p>
32 <p style="margin-left:6%;">The <b>tar</b> archive format
33 collects any number of files, directories, and other file
34 system objects (symbolic links, device nodes, etc.) into a
35 single stream of bytes. The format was originally designed
36 to be used with tape drives that operate with fixed-size
blocks, but is widely used as a general packaging mechanism.</p>
<p style="margin-left:6%; margin-top: 1em"><b>General Format</b> <br>
42 A <b>tar</b> archive consists of a series of 512-byte
43 records. Each file system object requires a header record
44 which stores basic metadata (pathname, owner, permissions,
45 etc.) and zero or more records containing any file data. The
46 end of the archive is indicated by two records consisting
47 entirely of zero bytes.</p>
49 <p style="margin-left:6%; margin-top: 1em">For
50 compatibility with tape drives that use fixed block sizes,
51 programs that read or write tar files always read or write a
52 fixed number of records with each I/O operation. These
53 ’’blocks’’ are always a multiple of
54 the record size. The maximum block size supported by early
55 implementations was 10240 bytes or 20 records. This is still
56 the default for most implementations although block sizes of
57 1MiB (2048 records) or larger are commonly used with modern
58 high-speed tape drives. (Note: the terms
59 ’’block’’ and
60 ’’record’’ here are not entirely
61 standard; this document follows the convention established
62 by John Gilmore in documenting <b>pdtar</b>.)</p>
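<p style="margin-left:6%; margin-top: 1em">The record and block arithmetic described above can be sketched in C as follows. This is only an illustration: the names <i>TAR_RECORD_SIZE</i>, <i>tar_entry_records</i>, and <i>tar_pad_to_block</i> are hypothetical and do not come from any tar implementation.</p>

```c
#include <stdint.h>

#define TAR_RECORD_SIZE 512u /* size of one record, as stated above */

/* Records occupied by one entry: one header record plus the file data
 * rounded up to whole records. */
static uint64_t tar_entry_records(uint64_t data_size)
{
    return 1 + (data_size + TAR_RECORD_SIZE - 1) / TAR_RECORD_SIZE;
}

/* Round a record count up to a whole number of blocks, where a block
 * holds `factor` records (20 by default, i.e. 10240 bytes). */
static uint64_t tar_pad_to_block(uint64_t records, uint64_t factor)
{
    return (records + factor - 1) / factor * factor;
}
```

<p style="margin-left:6%; margin-top: 1em">For example, an archive holding a single 513-byte file needs three records for the entry plus two zero records for the end marker, and a writer using the default blocking would pad those five records to one 20-record block.</p>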
64 <p style="margin-left:6%; margin-top: 1em"><b>Old-Style
65 Archive Format</b> <br>
66 The original tar archive format has been extended many times
67 to include additional information that various implementors
68 found necessary. This section describes the variant
69 implemented by the tar command included in Version 7
AT&amp;T UNIX, which seems to be the earliest widely used
71 version of the tar program.</p>
73 <p style="margin-left:6%; margin-top: 1em">The header
record for an old-style <b>tar</b> archive consists of the following:</p>
77 <p style="margin-left:14%; margin-top: 1em">struct
<table width="100%" border="0" rules="none" frame="void"
cellspacing="0" cellpadding="0">
<tr valign="top" align="left"><td width="24%"></td><td><p>char name[100];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char mode[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char uid[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char gid[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char size[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char mtime[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char checksum[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char linkflag[1];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char linkname[100];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char pad[255];</p></td></tr>
</table>
164 <p style="margin-left:14%;">};</p>
166 <p style="margin-left:6%;">All unused bytes in the header
167 record are filled with nulls.</p>
169 <p style="margin-top: 1em"><i>name</i></p>
171 <p style="margin-left:17%; margin-top: 1em">Pathname,
172 stored as a null-terminated string. Early tar
173 implementations only stored regular files (including
174 hardlinks to those files). One common early convention used
175 a trailing "/" character to indicate a directory
176 name, allowing directory permissions and owner information
177 to be archived and restored.</p>
179 <p style="margin-top: 1em"><i>mode</i></p>
181 <p style="margin-left:17%; margin-top: 1em">File mode,
182 stored as an octal number in ASCII.</p>
184 <p style="margin-top: 1em"><i>uid</i>, <i>gid</i></p>
186 <p style="margin-left:17%;">User id and group id of owner,
187 as octal numbers in ASCII.</p>
189 <p style="margin-top: 1em"><i>size</i></p>
<p style="margin-left:17%; margin-top: 1em">Size of file,
as an octal number in ASCII. For regular files only, this
193 indicates the amount of data that follows the header. In
194 particular, this field was ignored by early tar
195 implementations when extracting hardlinks. Modern writers
196 should always store a zero length for hardlink entries.</p>
198 <p style="margin-top: 1em"><i>mtime</i></p>
200 <p style="margin-left:17%; margin-top: 1em">Modification
201 time of file, as an octal number in ASCII. This indicates
202 the number of seconds since the start of the epoch, 00:00:00
203 UTC January 1, 1970. Note that negative values should be
204 avoided here, as they are handled inconsistently.</p>
206 <p style="margin-top: 1em"><i>checksum</i></p>
208 <p style="margin-left:17%;">Header checksum, stored as an
209 octal number in ASCII. To compute the checksum, set the
210 checksum field to all spaces, then sum all bytes in the
211 header using unsigned arithmetic. This field should be
212 stored as six octal digits followed by a null and a space
213 character. Note that many early implementations of tar used
214 signed arithmetic for the checksum field, which can cause
215 interoperability problems when transferring archives between
216 systems. Modern robust readers compute the checksum both
ways and accept the header if either computation matches.</p>
<p style="margin-top: 1em"><i>linkflag</i>, <i>linkname</i></p>
223 <p style="margin-left:17%;">In order to preserve hardlinks
224 and conserve tape, a file with multiple links is only
225 written to the archive the first time it is encountered. The
226 next time it is encountered, the <i>linkflag</i> is set to
227 an ASCII ’1’ and the <i>linkname</i> field holds
228 the first name under which this file appears. (Note that
regular files have a null value in the <i>linkflag</i> field.)</p>
232 <p style="margin-left:6%; margin-top: 1em">Early tar
233 implementations varied in how they terminated these fields.
The tar command in Version 7 AT&amp;T UNIX used the
235 following conventions (this is also documented in early BSD
236 manpages): the pathname must be null-terminated; the mode,
237 uid, and gid fields must end in a space and a null byte; the
238 size and mtime fields must end in a space; the checksum is
239 terminated by a null and a space. Early implementations
240 filled the numeric fields with leading spaces. This seems to
241 have been common practice until the IEEE Std 1003.1-1988
242 (’’POSIX.1’’) standard was released.
243 For best portability, modern implementations should fill the
244 numeric fields with leading zeros.</p>
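<p style="margin-left:6%; margin-top: 1em">The octal field conventions and the checksum rule described in this section can be combined into a small header check. The following is a minimal sketch, not the code of any particular tar program: the function names are hypothetical, the checksum offset of 148 is derived from the field sizes in the struct above, and the reader is deliberately lenient about leading spaces versus leading zeros.</p>

```c
#include <stddef.h>

#define TAR_RECORD_SIZE 512u
#define TAR_CHECKSUM_OFFSET 148u /* name + mode + uid + gid + size + mtime */
#define TAR_CHECKSUM_SIZE 8u

/* Parse an octal field. Leading spaces or zeros are accepted and parsing
 * stops at the first space or null. Returns -1 on any other byte. */
static long long tar_parse_octal(const unsigned char *field, size_t len)
{
    size_t i = 0;
    long long value = 0;

    while (i < len && field[i] == ' ')
        i++;
    for (; i < len && field[i] != ' ' && field[i] != '\0'; i++) {
        if (field[i] < '0' || field[i] > '7')
            return -1;
        value = value * 8 + (field[i] - '0');
    }
    return value;
}

/* Sum the header with the checksum field counted as eight spaces,
 * once with unsigned bytes and once with signed bytes. */
static void tar_checksums(const unsigned char *header,
                          long *unsigned_sum, long *signed_sum)
{
    long u = 0, s = 0;
    size_t i;

    for (i = 0; i < TAR_RECORD_SIZE; i++) {
        unsigned char byte = header[i];

        if (i >= TAR_CHECKSUM_OFFSET &&
            i < TAR_CHECKSUM_OFFSET + TAR_CHECKSUM_SIZE)
            byte = ' ';
        u += byte;
        /* Early implementations summed the bytes as signed chars. */
        s += byte < 128 ? byte : byte - 256;
    }
    *unsigned_sum = u;
    *signed_sum = s;
}

/* Returns 1 if the stored checksum matches either computation. */
static int tar_header_ok(const unsigned char *header)
{
    long u, s;
    long long stored = tar_parse_octal(header + TAR_CHECKSUM_OFFSET,
                                       TAR_CHECKSUM_SIZE);

    tar_checksums(header, &u, &s);
    return stored >= 0 && (stored == u || stored == s);
}
```

<p style="margin-left:6%; margin-top: 1em">The two sums differ only when the header contains bytes with the high bit set, which is why accepting either value costs little and avoids rejecting archives written by older tools.</p>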
<p style="margin-left:6%; margin-top: 1em"><b>Pre-POSIX Archives</b> <br>
248 An early draft of IEEE Std 1003.1-1988
249 (’’POSIX.1’’) served as the basis
250 for John Gilmore’s <b>pdtar</b> program and many
251 system implementations from the late 1980s and early 1990s.
252 These archives generally follow the POSIX ustar format
253 described below with the following variations:</p>
257 <p style="margin-left:17%;">The magic value consists of the
258 five characters ’’ustar’’ followed
259 by a space. The version field contains a space character
260 followed by a null.</p>
264 <p style="margin-left:17%;">The numeric fields are
265 generally filled with leading spaces (not leading zeros as
266 recommended in the final standard).</p>
270 <p style="margin-left:17%;">The prefix field is often not
used, limiting pathnames to the 100 characters of old-style archives.</p>
<p style="margin-left:6%; margin-top: 1em"><b>POSIX ustar Archives</b> <br>
276 IEEE Std 1003.1-1988 (’’POSIX.1’’)
277 defined a standard tar file format to be read and written by
278 compliant implementations of tar(1). This format is often
279 called the ’’ustar’’ format, after
280 the magic value used in the header. (The name is an acronym
281 for ’’Unix Standard TAR’’.) It
282 extends the historic format with new fields:</p>
284 <p style="margin-left:14%; margin-top: 1em">struct
285 header_posix_ustar {</p>
<table width="100%" border="0" rules="none" frame="void"
cellspacing="0" cellpadding="0">
<tr valign="top" align="left"><td width="24%"></td><td><p>char name[100];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char mode[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char uid[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char gid[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char size[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char mtime[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char checksum[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char typeflag[1];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char linkname[100];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char magic[6];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char version[2];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char uname[32];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char gname[32];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char devmajor[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char devminor[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char prefix[155];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char pad[12];</p></td></tr>
</table>
427 <p style="margin-left:14%;">};</p>
429 <p style="margin-top: 1em"><i>typeflag</i></p>
431 <p style="margin-left:17%;">Type of entry. POSIX extended
the earlier <i>linkflag</i> field with several new type values:</p>
435 <p>’’0’’</p>
437 <p style="margin-left:27%; margin-top: 1em">Regular file.
NUL should be treated as a synonym, for compatibility purposes.</p>
441 <p>’’1’’</p>
443 <p style="margin-left:27%; margin-top: 1em">Hard link.</p>
445 <p>’’2’’</p>
<p style="margin-left:27%; margin-top: 1em">Symbolic link.</p>
450 <p>’’3’’</p>
<p style="margin-left:27%; margin-top: 1em">Character device node.</p>
455 <p>’’4’’</p>
<p style="margin-left:27%; margin-top: 1em">Block device node.</p>
460 <p>’’5’’</p>
462 <p style="margin-left:27%; margin-top: 1em">Directory.</p>
464 <p>’’6’’</p>
466 <p style="margin-left:27%; margin-top: 1em">FIFO node.</p>
468 <p>’’7’’</p>
470 <p style="margin-left:27%; margin-top: 1em">Reserved.</p>
474 <p style="margin-left:27%; margin-top: 1em">A
475 POSIX-compliant implementation must treat any unrecognized
476 typeflag value as a regular file. In particular, writers
477 should ensure that all entries have a valid filename so that
478 they can be restored by readers that do not support the
479 corresponding extension. Uppercase letters "A"
480 through "Z" are reserved for custom extensions.
Note that sockets and whiteout entries are not archivable.</p>
484 <p style="margin-left:17%;">It is worth noting that the
485 <i>size</i> field, in particular, has different meanings
486 depending on the type. For regular files, of course, it
487 indicates the amount of data following the header. For
488 directories, it may be used to indicate the total size of
489 all files in the directory, for use by operating systems
490 that pre-allocate directory space. For all other types, it
491 should be set to zero by writers and ignored by readers.</p>
493 <p style="margin-top: 1em"><i>magic</i></p>
495 <p style="margin-left:17%; margin-top: 1em">Contains the
496 magic value ’’ustar’’ followed by a
497 NUL byte to indicate that this is a POSIX standard archive.
Full compliance requires the uname and gname fields be properly set.</p>
501 <p style="margin-top: 1em"><i>version</i></p>
503 <p style="margin-left:17%;">Version. This should be
504 ’’00’’ (two copies of the ASCII
505 digit zero) for POSIX standard archives.</p>
507 <p style="margin-top: 1em"><i>uname</i>, <i>gname</i></p>
509 <p style="margin-left:17%;">User and group names, as
510 null-terminated ASCII strings. These should be used in
511 preference to the uid/gid values when they are set and the
512 corresponding names exist on the system.</p>
<p style="margin-top: 1em"><i>devmajor</i>, <i>devminor</i></p>
517 <p style="margin-left:17%;">Major and minor numbers for
518 character device or block device entry.</p>
520 <p style="margin-top: 1em"><i>name</i>, <i>prefix</i></p>
522 <p style="margin-left:17%;">If the pathname is too long to
523 fit in the 100 bytes provided by the standard format, it can
524 be split at any <i>/</i> character with the first portion
525 going into the prefix field. If the prefix field is not
526 empty, the reader will prepend the prefix value and a
527 <i>/</i> character to the regular name field to obtain the
528 full pathname. The standard does not require a trailing
529 <i>/</i> character on directory names, though most
implementations still include this for compatibility reasons.</p>
533 <p style="margin-left:6%; margin-top: 1em">Note that all
534 unused bytes must be set to NUL.</p>
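<p style="margin-left:6%; margin-top: 1em">A reader can use the <i>magic</i> and <i>version</i> fields to tell a POSIX ustar header from the pre-POSIX variant described earlier. The sketch below assumes the field offsets 257 and 263, which follow from the field sizes in the struct above. The enum and function names are hypothetical, and treating every other header as old-style is a simplifying assumption.</p>

```c
#include <string.h>

enum tar_flavor { TAR_OLD_STYLE, TAR_PRE_POSIX, TAR_POSIX_USTAR };

/* Classify a 512-byte header by its magic (offset 257, 6 bytes) and
 * version (offset 263, 2 bytes) fields. */
static enum tar_flavor tar_classify(const unsigned char *header)
{
    const unsigned char *magic = header + 257;
    const unsigned char *version = header + 263;

    /* POSIX: "ustar" plus a NUL, then two ASCII zero digits. */
    if (memcmp(magic, "ustar\0", 6) == 0 && memcmp(version, "00", 2) == 0)
        return TAR_POSIX_USTAR;
    /* Pre-POSIX: "ustar" plus a space, then a space and a NUL. */
    if (memcmp(magic, "ustar ", 6) == 0 && memcmp(version, " \0", 2) == 0)
        return TAR_PRE_POSIX;
    /* Assumption: anything else is handled as an old-style header. */
    return TAR_OLD_STYLE;
}
```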
536 <p style="margin-left:6%; margin-top: 1em">Field
537 termination is specified slightly differently by POSIX than
538 by previous implementations. The <i>magic</i>, <i>uname</i>,
539 and <i>gname</i> fields must have a trailing NUL. The
<i>name</i>, <i>linkname</i>, and <i>prefix</i> fields
541 must have a trailing NUL unless they fill the entire field.
542 (In particular, it is possible to store a 256-character
543 pathname if it happens to have a <i>/</i> as the 156th
544 character.) POSIX requires numeric fields to be zero-padded
545 in the front, and requires them to be terminated with either
546 space or NUL characters.</p>
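<p style="margin-left:6%; margin-top: 1em">The way a reader rebuilds the full pathname from <i>prefix</i> and <i>name</i>, including the case where a field has no trailing NUL because it is completely full, can be sketched as follows. The helper names and the <i>TAR_PATH_MAX</i> constant are illustrative assumptions, not part of the standard.</p>

```c
#include <stddef.h>
#include <string.h>

#define TAR_NAME_SIZE 100u
#define TAR_PREFIX_SIZE 155u
#define TAR_PATH_MAX (TAR_PREFIX_SIZE + 1u + TAR_NAME_SIZE) /* 256 */

/* Length of a field that is null-terminated unless it fills the field. */
static size_t tar_field_len(const char *field, size_t size)
{
    const char *end = memchr(field, '\0', size);

    return end ? (size_t)(end - field) : size;
}

/* Join prefix and name into out, which must hold TAR_PATH_MAX + 1 bytes.
 * Returns the length of the resulting pathname. */
static size_t tar_full_path(const char *prefix, const char *name, char *out)
{
    size_t plen = tar_field_len(prefix, TAR_PREFIX_SIZE);
    size_t nlen = tar_field_len(name, TAR_NAME_SIZE);
    size_t pos = 0;

    if (plen > 0) {
        /* A non-empty prefix is followed by a "/" separator. */
        memcpy(out, prefix, plen);
        pos = plen;
        out[pos++] = '/';
    }
    memcpy(out + pos, name, nlen);
    pos += nlen;
    out[pos] = '\0';
    return pos;
}
```

<p style="margin-left:6%; margin-top: 1em">When both fields are completely full, the result is the 256-character pathname mentioned above, with the separator as its 156th character.</p>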
548 <p style="margin-left:6%; margin-top: 1em">Currently, most
549 tar implementations comply with the ustar format,
550 occasionally extending it by adding new fields to the blank
551 area at the end of the header record.</p>
<p style="margin-left:6%; margin-top: 1em"><b>Numeric Extensions</b> <br>
555 There have been several attempts to extend the range of
556 sizes or times supported by modifying how numbers are stored
559 <p style="margin-left:6%; margin-top: 1em">One obvious
560 extension to increase the size of files is to eliminate the
561 terminating characters from the various numeric fields. For
562 example, the standard only allows the size field to contain
563 11 octal digits, reserving the twelfth byte for a trailing
NUL character. Using all 12 bytes for octal digits permits file sizes up to 64 GiB.</p>
567 <p style="margin-left:6%; margin-top: 1em">Another
568 extension, utilized by GNU tar, star, and other newer
569 <b>tar</b> implementations, permits binary numbers in the
570 standard numeric fields. This is flagged by setting the high
571 bit of the first byte. The remainder of the field is treated
as a signed two's-complement value. This permits 95-bit
573 values for the length and time fields and 63-bit values for
574 the uid, gid, and device numbers. In particular, this
575 provides a consistent way to handle negative time values.
576 GNU tar supports this extension for the length, mtime,
577 ctime, and atime fields. Joerg Schilling’s star
578 program and the libarchive library support this extension
579 for all numeric fields. Note that this extension is largely
580 obsoleted by the extended attribute record provided by the
581 pax interchange format.</p>
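<p style="margin-left:6%; margin-top: 1em">Decoding the binary extension can be sketched as follows. This is an illustrative reading of the description above, not the code used by GNU tar, star, or libarchive: the function name is hypothetical, and values that do not fit in 64 bits are rejected rather than supported at their full 95-bit width.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a numeric field that uses the binary extension.
 * Returns 0 on success, or -1 if the high bit of the first byte is
 * not set or the value does not fit in an int64_t. */
static int tar_parse_binary(const unsigned char *field, size_t len,
                            int64_t *out)
{
    if (len == 0 || (field[0] & 0x80) == 0)
        return -1;

    /* With the flag bit removed, bit 6 of the first byte is the sign. */
    int negative = (field[0] & 0x40) != 0;
    unsigned char fill = negative ? 0xFF : 0x00;
    uint64_t value = negative ? UINT64_MAX : 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char byte = field[i];

        if (i == 0) /* replace the flag bit with a copy of the sign */
            byte = negative ? (byte | 0x80) : (byte & 0x7F);
        if (i + 8 < len) {
            /* Bytes above the low 64 bits must be pure sign extension. */
            if (byte != fill)
                return -1;
            continue;
        }
        value = (value << 8) | byte;
    }
    if (((value >> 63) != 0) != (negative != 0))
        return -1;

    /* Convert without relying on implementation-defined casts. */
    *out = negative ? -(int64_t)(~value) - 1 : (int64_t)value;
    return 0;
}
```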
583 <p style="margin-left:6%; margin-top: 1em">Another early
584 GNU extension allowed base-64 values rather than octal. This
extension was short-lived and is no longer supported by any implementation.</p>
588 <p style="margin-left:6%; margin-top: 1em"><b>Pax
589 Interchange Format</b> <br>
590 There are many attributes that cannot be portably stored in
591 a POSIX ustar archive. IEEE Std 1003.1-2001
592 (’’POSIX.1’’) defined a
593 ’’pax interchange format’’ that uses
594 two new types of entries to hold text-formatted metadata
595 that applies to following entries. Note that a pax
596 interchange format archive is a ustar archive in every
597 respect. The new data is stored in ustar-compatible archive
598 entries that use the ’’x’’ or
599 ’’g’’ typeflag. In particular, older
600 implementations that do not fully support these extensions
601 will extract the metadata into regular files, where the
602 metadata can be examined as necessary.</p>
604 <p style="margin-left:6%; margin-top: 1em">An entry in a
605 pax interchange format archive consists of one or two
606 standard ustar entries, each with its own header and data.
607 The first optional entry stores the extended attributes for
608 the following entry. This optional first entry has an
609 "x" typeflag and a size field that indicates the
610 total size of the extended attributes. The extended
611 attributes themselves are stored as a series of text-format
612 lines encoded in the portable UTF-8 encoding. Each line
613 consists of a decimal number, a space, a key string, an
equals sign, a value string, and a newline. The decimal
615 number indicates the length of the entire line, including
616 the initial length field and the trailing newline. An
617 example of such a field is:</p>
619 <p style="margin-left:14%;">25 ctime=1084839148.1212\n</p>
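<p style="margin-left:6%; margin-top: 1em">Because the length prefix counts its own digits, a writer cannot simply add up the other parts of the line. One way to find the length is to grow the digit count until it is consistent, as in the following sketch. The function names are hypothetical, and the sketch assumes the key and value are ordinary C strings without embedded NUL bytes.</p>

```c
#include <stdio.h>
#include <string.h>

/* Number of decimal digits needed to print n. */
static size_t decimal_digits(size_t n)
{
    size_t digits = 1;

    while (n >= 10) {
        n /= 10;
        digits++;
    }
    return digits;
}

/* Write "<length> <key>=<value>\n" into out. The length counts the whole
 * line, including its own digits, so it is found by iteration.
 * Returns the line length, or 0 if out is too small. */
static size_t pax_format_record(const char *key, const char *value,
                                char *out, size_t out_size)
{
    /* Space, key, equals sign, value, newline. */
    size_t body = 1 + strlen(key) + 1 + strlen(value) + 1;
    size_t digits = 1;

    while (decimal_digits(body + digits) > digits)
        digits++;

    size_t total = body + digits;
    if (total + 1 > out_size)
        return 0;
    snprintf(out, out_size, "%zu %s=%s\n", total, key, value);
    return total;
}
```

<p style="margin-left:6%; margin-top: 1em">Called with the key <i>ctime</i> and the value <i>1084839148.1212</i>, this produces the 25-byte line shown above.</p>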
621 <p style="margin-left:6%;">Keys in all lowercase are
622 standard keys. Vendors can add their own keys by prefixing
623 them with an all uppercase vendor name and a period. Note
624 that, unlike the historic header, numeric values are stored
using decimal, not octal. A description of some common keys follows:</p>
<p style="margin-top: 1em"><b>atime</b>, <b>ctime</b>, <b>mtime</b></p>
631 <p style="margin-left:17%;">File access, inode change, and
632 modification times. These fields can be negative or include
633 a decimal point and a fractional value.</p>
635 <p style="margin-top: 1em"><b>hdrcharset</b></p>
637 <p style="margin-left:17%;">The character set used by the
638 pax extension values. By default, all textual values in the
639 pax extended attributes are assumed to be in UTF-8,
640 including pathnames, user names, and group names. In some
641 cases, it is not possible to translate local conventions
642 into UTF-8. If this key is present and the value is the
643 six-character ASCII string
644 ’’BINARY’’, then all textual values
645 are assumed to be in a platform-dependent multi-byte
646 encoding. Note that there are only two valid values for this
647 key: ’’BINARY’’ or
648 ’’ISO-IR 10646 2000 UTF-8’’.
649 No other values are permitted by the standard, and the
650 latter value should generally not be used as it is the
651 default when this key is not specified. In particular, this
652 flag should not be used as a general mechanism to allow
653 filenames to be stored in arbitrary encodings.</p>
655 <p style="margin-top: 1em"><b>uname</b>, <b>uid</b>,
656 <b>gname</b>, <b>gid</b></p>
658 <p style="margin-left:17%;">User name, group name, and
659 numeric UID and GID values. The user name and group name
stored here are encoded in UTF-8 and can thus include
661 non-ASCII characters. The UID and GID fields can be of
662 arbitrary length.</p>
664 <p style="margin-top: 1em"><b>linkpath</b></p>
666 <p style="margin-left:17%;">The full path of the linked-to
file. Note that this is encoded in UTF-8 and can thus include
668 non-ASCII characters.</p>
670 <p style="margin-top: 1em"><b>path</b></p>
672 <p style="margin-left:17%; margin-top: 1em">The full
pathname of the entry. Note that this is encoded in UTF-8 and
674 can thus include non-ASCII characters.</p>
676 <p style="margin-top: 1em"><b>realtime.*</b>,
677 <b>security.*</b></p>
679 <p style="margin-left:17%;">These keys are reserved and may
680 be used for future standardization.</p>
682 <p style="margin-top: 1em"><b>size</b></p>
684 <p style="margin-left:17%; margin-top: 1em">The size of the
685 file. Note that there is no length limit on this field,
686 allowing conforming archives to store files much larger than
687 the historic 8GB limit.</p>
689 <p style="margin-top: 1em"><b>SCHILY.*</b></p>
691 <p style="margin-left:17%;">Vendor-specific attributes used
692 by Joerg Schilling’s <b>star</b> implementation.</p>
694 <p style="margin-top: 1em"><b>SCHILY.acl.access</b>,
695 <b>SCHILY.acl.default</b></p>
697 <p style="margin-left:17%;">Stores the access and default
698 ACLs as textual strings in a format that is an extension of
699 the format specified by POSIX.1e draft 17. In particular,
700 each user or group access specification can include a fourth
701 colon-separated field with the numeric UID or GID. This
702 allows ACLs to be restored on systems that may not have
703 complete user or group information available (such as when
704 NIS/YP or LDAP services are temporarily unavailable).</p>
706 <p style="margin-top: 1em"><b>SCHILY.devminor</b>,
707 <b>SCHILY.devmajor</b></p>
709 <p style="margin-left:17%;">The full minor and major
710 numbers for device nodes.</p>
712 <p style="margin-top: 1em"><b>SCHILY.fflags</b></p>
714 <p style="margin-left:17%;">The file flags.</p>
716 <p style="margin-top: 1em"><b>SCHILY.realsize</b></p>
718 <p style="margin-left:17%;">The full size of the file on
719 disk. XXX explain? XXX</p>
<p style="margin-top: 1em"><b>SCHILY.dev</b>, <b>SCHILY.ino</b>,
722 <b>SCHILY.nlinks</b></p>
724 <p style="margin-left:17%;">The device number, inode
725 number, and link count for the entry. In particular, note
726 that a pax interchange format archive using Joerg
727 Schilling’s <b>SCHILY.*</b> extensions can store all
728 of the data from <i>struct stat</i>.</p>
730 <p style="margin-top: 1em"><b>LIBARCHIVE.*</b></p>
732 <p style="margin-left:17%;">Vendor-specific attributes used
by the <b>libarchive</b> library and programs that use it.</p>
737 <p style="margin-top: 1em"><b>LIBARCHIVE.creationtime</b></p>
739 <p style="margin-left:17%;">The time when the file was
740 created. (This should not be confused with the POSIX
741 ’’ctime’’ attribute, which refers to
742 the time when the file metadata was last changed.)</p>
745 <p style="margin-top: 1em"><b>LIBARCHIVE.xattr.</b><i>namespace</i>.<i>key</i></p>
747 <p style="margin-left:17%;">Libarchive stores
748 POSIX.1e-style extended attributes using keys of this form.
749 The <i>key</i> value is URL-encoded: All non-ASCII
750 characters and the two special characters
751 ’’=’’ and
752 ’’%’’ are encoded as
753 ’’%’’ followed by two uppercase
754 hexadecimal digits. The value of this key is the extended
755 attribute value encoded in base 64. XXX Detail the base-64
758 <p style="margin-top: 1em"><b>VENDOR.*</b></p>
760 <p style="margin-left:17%;">XXX document other
761 vendor-specific extensions XXX</p>
763 <p style="margin-left:6%; margin-top: 1em">Any values
764 stored in an extended attribute override the corresponding
765 values in the regular tar header. Note that compliant
766 readers should ignore the regular fields when they are
767 overridden. This is important, as existing archivers are
768 known to store non-compliant values in the standard header
769 fields in this situation. There are no limits on length for
770 any of these fields. In particular, numeric fields can be
arbitrarily large. All text fields are encoded in UTF-8.
772 Compliant writers should store only portable 7-bit ASCII
773 characters in the standard ustar header and use extended
attributes whenever a text value contains non-ASCII characters.</p>
777 <p style="margin-left:6%; margin-top: 1em">In addition to
778 the <b>x</b> entry described above, the pax interchange
779 format also supports a <b>g</b> entry. The <b>g</b> entry is
780 identical in format, but specifies attributes that serve as
781 defaults for all subsequent archive entries. The <b>g</b>
782 entry is not widely used.</p>
784 <p style="margin-left:6%; margin-top: 1em">Besides the new
785 <b>x</b> and <b>g</b> entries, the pax interchange format
786 has a few other minor variations from the earlier ustar
787 format. The most troubling one is that hardlinks are
788 permitted to have data following them. This allows readers
789 to restore any hardlink to a file without having to rewind
790 the archive to find an earlier entry. However, it creates
791 complications for robust readers, as it is no longer clear
792 whether or not they should ignore the size field for
793 hardlink entries.</p>
<p style="margin-left:6%; margin-top: 1em"><b>GNU Tar Archives</b> <br>
797 The GNU tar program started with a pre-POSIX format similar
798 to that described earlier and has extended it using several
799 different mechanisms: It added new fields to the empty space
800 in the header (some of which was later used by POSIX for
801 conflicting purposes); it allowed the header to be continued
802 over multiple records; and it defined new entries that
803 modify following entries (similar in principle to the
804 <b>x</b> entry described above, but each GNU special entry
805 is single-purpose, unlike the general-purpose <b>x</b>
806 entry). As a result, GNU tar archives are not POSIX
807 compatible, although more lenient POSIX-compliant readers
808 can successfully extract most GNU tar archives.</p>
810 <p style="margin-left:14%; margin-top: 1em">struct
<table width="100%" border="0" rules="none" frame="void"
cellspacing="0" cellpadding="0">
<tr valign="top" align="left"><td width="24%"></td><td><p>char name[100];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char mode[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char uid[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char gid[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char size[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char mtime[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char checksum[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char typeflag[1];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char linkname[100];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char magic[6];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char version[2];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char uname[32];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char gname[32];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char devmajor[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char devminor[8];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char atime[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char ctime[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char offset[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char longnames[4];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char unused[1];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>struct {</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p style="margin-left:10%;">char offset[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p style="margin-left:10%;">char numbytes[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>} sparse[4];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char isextended[1];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char realsize[12];</p></td></tr>
<tr valign="top" align="left"><td width="24%"></td><td><p>char pad[17];</p></td></tr>
</table>
1062 <p style="margin-left:14%;">};</p>
1064 <p style="margin-top: 1em"><i>typeflag</i></p>
1066 <p style="margin-left:17%;">GNU tar uses the following
special entry types, in addition to those defined by POSIX:</p>
1070 <p style="margin-top: 1em">7</p>
1072 <p style="margin-left:27%; margin-top: 1em">GNU tar treats
1073 type "7" records identically to type "0"
1074 records, except on one obscure RTOS where they are used to
indicate the pre-allocation of a contiguous file on disk.</p>
1078 <p style="margin-top: 1em">D</p>
1080 <p style="margin-left:27%; margin-top: 1em">This indicates
1081 a directory entry. Unlike the POSIX-standard "5"
1082 typeflag, the header is followed by data records listing the
1083 names of files in this directory. Each name is preceded by
1084 an ASCII "Y" if the file is stored in this archive
1085 or "N" if the file is not stored in this archive.
1086 Each name is terminated with a null, and an extra null marks
1087 the end of the name list. The purpose of this entry is to
1088 support incremental backups; a program restoring from such
1089 an archive may wish to delete files on disk that did not
1090 exist in the directory when the archive was made.</p>
1092 <p style="margin-left:27%; margin-top: 1em">Note that the
1093 "D" typeflag specifically violates POSIX, which
1094 requires that unrecognized typeflags be restored as normal
1095 files. In this case, restoring the "D" entry as a
1096 file could interfere with subsequent creation of the
1097 like-named directory.</p>
1099 <p style="margin-top: 1em">K</p>
1101 <p style="margin-left:27%; margin-top: 1em">The data for
this entry is a long linkname for the following regular entry.</p>
1105 <p style="margin-top: 1em">L</p>
1107 <p style="margin-left:27%; margin-top: 1em">The data for
this entry is a long pathname for the following regular entry.</p>
1111 <p style="margin-top: 1em">M</p>
1113 <p style="margin-left:27%; margin-top: 1em">This is a
1114 continuation of the last file on the previous volume. GNU
1115 multi-volume archives guarantee that each volume begins with
1116 a valid entry header. To ensure this, a file may be split,
1117 with part stored at the end of one volume, and part stored
1118 at the beginning of the next volume. The "M"
1119 typeflag indicates that this entry continues an existing
1120 file. Such entries can only occur as the first or second
1121 entry in an archive (the latter only if the first entry is a
1122 volume label). The <i>size</i> field specifies the size of
1123 this entry. The <i>offset</i> field at bytes 369-380
1124 specifies the offset where this file fragment begins. The
1125 <i>realsize</i> field specifies the total size of the file
1126 (which must equal <i>size</i> plus <i>offset</i>). When
1127 extracting, GNU tar checks that the header file name is the
1128 one it is expecting, that the header offset is in the
1129 correct sequence, and that the sum of offset and size is
1130 equal to realsize.</p>
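<p style="margin-left:27%; margin-top: 1em">The consistency check on <i>size</i>, <i>offset</i>, and <i>realsize</i> can be sketched in Python as follows. This is a minimal illustration, not the GNU tar implementation: the helper names <b>parse_octal</b> and <b>continuation_is_consistent</b> are hypothetical, and the sketch assumes that the three fields have already been sliced out of the header and that all of them use the octal text encoding.</p>

<pre style="margin-left:27%; margin-top: 1em">
```python
def parse_octal(field):
    """Parse a tar numeric field: ASCII octal digits padded with NULs or spaces."""
    text = field.rstrip(b"\0 ").strip(b" ")
    return int(text.decode("ascii"), 8) if text else 0


def continuation_is_consistent(size_field, offset_field, realsize_field):
    """Return True when size + offset == realsize for an "M" entry."""
    size = parse_octal(size_field)
    offset = parse_octal(offset_field)
    realsize = parse_octal(realsize_field)
    return size + offset == realsize
```
</pre>

<p style="margin-left:27%; margin-top: 1em">A reader could warn about or reject a continuation entry for which the second function returns False. The file name and sequence checks mentioned above are not covered by this sketch.</p>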
1132 <p style="margin-top: 1em">N</p>
1134 <p style="margin-left:27%; margin-top: 1em">Type
1135 "N" records are no longer generated by GNU tar.
1136 They contained a list of files to be renamed or symlinked
1137 after extraction; this was originally used to support long
1138 names. The contents of this record are a text description of
1139 the operations to be done, in the form ’’Rename
1140 %s to %s\n’’ or ’’Symlink %s to
1141 %s\n’’; in either case, both filenames are
1142 escaped using K&amp;R C syntax. Due to security concerns,
1143 "N" records are now generally ignored when reading archives.</p>
1146 <p style="margin-top: 1em">S</p>
1148 <p style="margin-left:27%; margin-top: 1em">This is a
1149 ’’sparse’’ regular file. Sparse
1150 files are stored as a series of fragments. The header
1151 contains a list of fragment offset/length pairs. If more
1152 than four such entries are required, the header is extended
1153 as necessary with ’’extra’’ header
1154 extensions (an older format that is no longer used), or
1155 ’’sparse’’ extensions.</p>
1157 <p style="margin-top: 1em">V</p>
1159 <p style="margin-left:27%; margin-top: 1em">The <i>name</i>
1160 field should be interpreted as a tape/volume header name.
1161 This entry should generally be ignored on extraction.</p>
1163 <p style="margin-top: 1em"><i>magic</i></p>
1165 <p style="margin-left:17%; margin-top: 1em">The magic field
1166 holds the five characters ’’ustar’’
1167 followed by a space. Note that POSIX ustar archives have a trailing null.</p>
1170 <p style="margin-top: 1em"><i>version</i></p>
1172 <p style="margin-left:17%;">The version field holds a space
1173 character followed by a null. Note that POSIX ustar archives
1174 use two copies of the ASCII digit
1175 ’’0’’.</p>
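<p style="margin-left:17%; margin-top: 1em">The difference between the GNU and POSIX values of <i>magic</i> and <i>version</i> gives a simple way to tell the two header flavors apart. The Python sketch below is an illustration only. The function name <b>classify_magic</b> is hypothetical, and the byte positions (magic at bytes 257-262, version at bytes 263-264) are an assumption taken from the standard ustar header layout.</p>

<pre style="margin-left:17%; margin-top: 1em">
```python
def classify_magic(header):
    """Classify a 512-byte tar header by its magic and version bytes."""
    magic = header[257:263]    # six bytes (assumed standard ustar position)
    version = header[263:265]  # two bytes
    if magic == b"ustar " and version == b" \0":
        return "gnu"      # "ustar" plus a space, then a space and a null
    if magic == b"ustar\0" and version == b"00":
        return "posix"    # "ustar" plus a null, then two ASCII zero digits
    return "unknown"
```
</pre>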
1177 <p style="margin-top: 1em"><i>atime</i>, <i>ctime</i></p>
1179 <p style="margin-left:17%;">The time the file was last
1180 accessed and the time of last change of file information,
1181 stored in octal as with <i>mtime</i>.</p>
1183 <p style="margin-top: 1em"><i>longnames</i></p>
1185 <p style="margin-left:17%;">This field is apparently no longer used.</p>
1188 <p style="margin-top: 1em">Sparse <i>offset / numbytes</i></p>
1191 <p style="margin-left:17%;">Each such structure specifies a
1192 single fragment of a sparse file. The two fields store
1193 values as octal numbers. The fragments are each padded to a
1194 multiple of 512 bytes in the archive. On extraction, the
1195 list of fragments is collected from the header (including
1196 any extension headers), and the data is then read and
1197 written to the file at appropriate offsets.</p>
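<p style="margin-left:17%; margin-top: 1em">The extraction procedure described above might look like the following Python sketch. It assumes that the fragment list has already been decoded into (offset, numbytes) integer pairs and that <b>archive_data</b> is a file-like object positioned at the start of the entry data. Both function names are hypothetical.</p>

<pre style="margin-left:17%; margin-top: 1em">
```python
BLOCK = 512


def padded_length(numbytes):
    """Round a fragment length up to the next multiple of 512 bytes."""
    return -(-numbytes // BLOCK) * BLOCK


def restore_sparse(archive_data, fragments, out):
    """Copy each (offset, numbytes) fragment from archive_data into out."""
    for offset, numbytes in fragments:
        # Each fragment occupies a whole number of 512-byte records.
        chunk = archive_data.read(padded_length(numbytes))
        out.seek(offset)
        out.write(chunk[:numbytes])  # drop the padding bytes
```
</pre>

<p style="margin-left:17%; margin-top: 1em">Seeking past the current end of the output leaves a hole between fragments. A file that ends in a hole would additionally have to be extended to its full size, which this sketch does not do.</p>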
1199 <p style="margin-top: 1em"><i>isextended</i></p>
1201 <p style="margin-left:17%;">If this is set to non-zero, the
1202 header will be followed by additional ’’sparse
1203 header’’ records. Each such record contains
1204 information about as many as 21 additional sparse blocks as shown here:</p>
1207 <p style="margin-left:24%; margin-top: 1em">struct
1208 gnu_sparse_header {</p>
1217 <p style="margin-left:35%;">struct {</p>
1228 <p style="margin-left:43%;">char offset[12];</p>
1238 <p style="margin-left:43%;">char numbytes[12];</p>
1246 <p style="margin-left:35%;">} sparse[21];</p>
1255 <p style="margin-left:35%;">char isextended[1];</p>
1264 <p style="margin-left:35%;">char padding[7];</p>
1270 <p style="margin-left:24%;">};</p>
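<p style="margin-left:17%; margin-top: 1em">The <b>gnu_sparse_header</b> layout adds up to exactly one 512-byte record (21 pairs of 12-byte fields, one flag byte, seven bytes of padding). A minimal Python sketch of decoding such a record follows. The function name <b>parse_sparse_record</b> is hypothetical, and treating an empty <i>numbytes</i> field as the end of the used slots is an assumption of this sketch.</p>

<pre style="margin-left:17%; margin-top: 1em">
```python
import struct

# 21 pairs of 12-byte fields, one isextended byte, seven bytes of padding.
SPARSE_RECORD = struct.Struct("=" + "12s12s" * 21 + "c7x")


def parse_sparse_record(record):
    """Return (fragments, isextended) from one 512-byte sparse header record."""
    fields = SPARSE_RECORD.unpack(record)
    fragments = []
    for i in range(0, 42, 2):
        offset_text = fields[i].strip(b"\0 ")
        length_text = fields[i + 1].strip(b"\0 ")
        if not length_text:
            break  # assumption: an empty numbytes field marks an unused slot
        offset = int((offset_text or b"0").decode("ascii"), 8)
        fragments.append((offset, int(length_text.decode("ascii"), 8)))
    # A non-zero flag byte means another sparse header record follows.
    return fragments, fields[42] != b"\0"
```
</pre>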
1272 <p style="margin-top: 1em"><i>realsize</i></p>
1274 <p style="margin-left:17%;">A binary representation of the
1275 file’s complete size, with a much larger range than
1276 the POSIX file size. In particular, with <b>M</b> type
1277 files, the current entry is only a portion of the file. In
1278 that case, the POSIX size field will indicate the size of
1279 this entry; the <i>realsize</i> field will indicate the
1280 total size of the file.</p>
1282 <p style="margin-left:6%; margin-top: 1em"><b>GNU tar pax archives</b> <br>
1284 GNU tar 1.14 (XXX check this XXX) and later will write pax
1285 interchange format archives when you specify the
1286 <b>--posix</b> flag. This format follows the pax
1287 interchange format closely, using some <b>SCHILY</b> tags
1288 and introducing new keywords to store sparse file
1289 information. There have been three iterations of the sparse
1290 file support, referred to as
1291 ’’0.0’’,
1292 ’’0.1’’, and
1293 ’’1.0’’.</p>
1295 <p style="margin-top: 1em"><b>GNU.sparse.numblocks</b>,
1296 <b>GNU.sparse.offset</b>, <b>GNU.sparse.numbytes</b>,
1297 <b>GNU.sparse.size</b></p>
1299 <p style="margin-left:17%;">The
1300 ’’0.0’’ format used an initial
1301 <b>GNU.sparse.numblocks</b> attribute to indicate the number
1302 of blocks in the file, a pair of <b>GNU.sparse.offset</b>
1303 and <b>GNU.sparse.numbytes</b> to indicate the offset and
1304 size of each block, and a single <b>GNU.sparse.size</b> to
1305 indicate the full size of the file. This is not the same as
1306 the size in the tar header because the latter value does not
1307 include the size of any holes. This format required that the
1308 order of attributes be preserved and relied on readers
1309 accepting multiple appearances of the same attribute names,
1310 which is not officially permitted by the standards.</p>
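<p style="margin-left:17%; margin-top: 1em">Because the 0.0 format depends on attribute order and on repeated keywords, a reader has to process the pax records as an ordered sequence rather than as a dictionary. The Python sketch below illustrates this. The function name <b>sparse_map_v00</b> is hypothetical, and it assumes the pax header has already been split into (keyword, value) string pairs with decimal values.</p>

<pre style="margin-left:17%; margin-top: 1em">
```python
def sparse_map_v00(records):
    """Build a sparse map from ordered (keyword, value) pax records."""
    fragments = []
    pending_offset = None
    realsize = None
    for keyword, value in records:
        if keyword == "GNU.sparse.offset":
            pending_offset = int(value)
        elif keyword == "GNU.sparse.numbytes" and pending_offset is not None:
            # Each numbytes record completes the preceding offset record.
            fragments.append((pending_offset, int(value)))
            pending_offset = None
        elif keyword == "GNU.sparse.size":
            realsize = int(value)
    return fragments, realsize
```
</pre>

<p style="margin-left:17%; margin-top: 1em">Storing the same records in a dictionary would keep only the last offset and numbytes pair, which is why readers that do not accept repeated attribute names cannot handle this format.</p>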
1312 <p style="margin-top: 1em"><b>GNU.sparse.map</b></p>
1314 <p style="margin-left:17%;">The
1315 ’’0.1’’ format used a single
1316 attribute that stored a comma-separated list of decimal
1317 numbers. Each pair of numbers indicated the offset and size,
1318 respectively, of a block of data. This does not work well if
1319 the archive is extracted by an archiver that does not
1320 recognize this extension, since many pax implementations
1321 simply discard unrecognized attributes.</p>
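<p style="margin-left:17%; margin-top: 1em">Decoding a 0.1 map value is a matter of splitting the list and pairing the numbers. A minimal Python sketch, with the hypothetical function name <b>parse_sparse_map_v01</b>:</p>

<pre style="margin-left:17%; margin-top: 1em">
```python
def parse_sparse_map_v01(value):
    """Turn "offset,size,offset,size,..." into a list of (offset, size) pairs."""
    numbers = [int(item) for item in value.split(",") if item]
    if len(numbers) % 2 != 0:
        raise ValueError("sparse map needs an even count of numbers")
    return list(zip(numbers[0::2], numbers[1::2]))
```
</pre>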
1323 <p style="margin-top: 1em"><b>GNU.sparse.major</b>,
1324 <b>GNU.sparse.minor</b>, <b>GNU.sparse.name</b>,
1325 <b>GNU.sparse.realsize</b></p>
1327 <p style="margin-left:17%;">The
1328 ’’1.0’’ format stores the sparse
1329 block map in one or more 512-byte blocks prepended to the
1330 file data in the entry body. The pax attributes indicate the
1331 existence of this map (via the <b>GNU.sparse.major</b> and
1332 <b>GNU.sparse.minor</b> fields) and the full size of the
1333 file. The <b>GNU.sparse.name</b> holds the true name of the
1334 file. To avoid confusion, the name stored in the regular tar
1335 header is a modified name so that extraction errors will be
1336 apparent to users.</p>
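<p style="margin-left:17%; margin-top: 1em">The text above does not spell out how the 1.0 map is encoded. According to the GNU tar documentation, it is a series of newline-terminated decimal numbers: the number of map entries, followed by an offset and a size for each entry, with the whole map padded with NUL bytes to a 512-byte boundary. The Python sketch below relies on that layout as an assumption; the function name <b>parse_sparse_map_v10</b> is hypothetical.</p>

<pre style="margin-left:17%; margin-top: 1em">
```python
BLOCK = 512


def parse_sparse_map_v10(body):
    """Return (fragments, data_start) for an entry body in 1.0 sparse format."""
    lines = body.split(b"\n")
    count = int(lines[0].decode("ascii"))
    map_lines = lines[:1 + 2 * count]
    numbers = [int(text.decode("ascii")) for text in map_lines[1:]]
    fragments = list(zip(numbers[0::2], numbers[1::2]))
    # Each map line is followed by one newline byte.
    map_length = sum(len(line) + 1 for line in map_lines)
    # The file data begins at the next 512-byte boundary after the map.
    data_start = -(-map_length // BLOCK) * BLOCK
    return fragments, data_start
```
</pre>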
1338 <p style="margin-left:6%; margin-top: 1em"><b>Solaris Tar</b> <br>
1340 XXX More Details Needed XXX</p>
1342 <p style="margin-left:6%; margin-top: 1em">Solaris tar
1343 (beginning with SunOS XXX 5.7 ?? XXX) supports an
1344 ’’extended’’ format that is
1345 fundamentally similar to pax interchange format, with the
1346 following differences:</p>
1348 <p><b>•</b></p>
1350 <p style="margin-left:17%;">Extended attributes are stored
1351 in an entry whose type is <b>X</b>, not <b>x</b>, as used by
1352 pax interchange format. The detailed format of this entry
1353 appears to be the same as detailed above for the <b>x</b> entry.</p>
1356 <p><b>•</b></p>
1358 <p style="margin-left:17%;">An additional <b>A</b> header
1359 is used to store an ACL for the following regular entry. The
1360 body of this entry contains a seven-digit octal number
1361 followed by a zero byte, followed by the textual ACL
1362 description. The octal value is the number of ACL entries
1363 plus a constant that indicates the ACL type: 01000000 for
1364 POSIX.1e ACLs and 03000000 for NFSv4 ACLs.</p>
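<p style="margin-left:17%; margin-top: 1em">The body of a Solaris <b>A</b> entry, as described in the last item, can be split into its parts with a few lines of Python. This is a sketch under the assumptions stated in that description only. The names <b>ACL_TYPES</b> and <b>parse_solaris_acl</b> are hypothetical, and the ACL text is returned without being interpreted.</p>

<pre style="margin-left:17%; margin-top: 1em">
```python
ACL_TYPES = {0o1000000: "POSIX.1e", 0o3000000: "NFSv4"}


def parse_solaris_acl(body):
    """Split a Solaris "A" entry body into (acl_type, entry_count, acl_text)."""
    number, _, text = body.partition(b"\0")
    value = int(number.decode("ascii"), 8)
    # The type constant sits above the low six octal digits; the rest is the count.
    type_part, count = divmod(value, 0o1000000)
    acl_type = ACL_TYPES.get(type_part * 0o1000000, "unknown")
    return acl_type, count, text.rstrip(b"\0").decode("utf-8", "replace")
```
</pre>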
1366 <p style="margin-left:6%; margin-top: 1em"><b>AIX Tar</b>
1368 XXX More details needed XXX</p>
1370 <p style="margin-left:6%; margin-top: 1em">AIX Tar uses a
1371 ustar-formatted header with the type <b>A</b> for storing
1372 coded ACL information. Unlike the Solaris format, AIX tar
1373 writes this header after the regular file body to which it
1374 applies. The pathname in this header is either <b>NFS4</b>
1375 or <b>AIXC</b> to indicate the type of ACL stored. The
1376 actual ACL is stored in platform-specific binary format.</p>
1378 <p style="margin-left:6%; margin-top: 1em"><b>Mac OS X Tar</b> <br>
1380 The tar distributed with Apple’s Mac OS X stores most
1381 regular files as two separate files in the tar archive. The
1382 two files have the same name except that the first one has
1383 ’’._’’ prepended to the last path
1384 element. This special file stores an AppleDouble-encoded
1385 binary blob with additional metadata about the second file,
1386 including ACL, extended attributes, and resources. To
1387 recreate the original file on disk, each separate file can
1388 be extracted and the Mac OS X <b>copyfile</b>() function can
1389 be used to unpack the separate metadata file and apply it to
1390 the regular file. Conversely, the same function provides a
1391 ’’pack’’ option to encode the
1392 extended metadata from a file into a separate file whose
1393 contents can then be put into a tar archive.</p>
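<p style="margin-left:6%; margin-top: 1em">The naming rule for the metadata file can be illustrated with a short Python sketch. The function names <b>appledouble_name</b> and <b>is_appledouble</b> are hypothetical, and the sketch only handles the names; it does not decode the AppleDouble contents.</p>

<pre style="margin-left:6%; margin-top: 1em">
```python
import posixpath


def appledouble_name(path):
    """Return the name of the metadata entry that accompanies path."""
    head, tail = posixpath.split(path.rstrip("/"))
    return posixpath.join(head, "._" + tail)


def is_appledouble(path):
    """Return True when the last path element starts with "._"."""
    return posixpath.basename(path.rstrip("/")).startswith("._")
```
</pre>

<p style="margin-left:6%; margin-top: 1em">A reader on another platform could use a check like <b>is_appledouble</b> to skip or set aside the metadata entries while extracting the regular files.</p>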
1395 <p style="margin-left:6%; margin-top: 1em">Note that the
1396 Apple extended attributes interact badly with long
1397 filenames. Since each file is stored with the full name, a
1398 separate set of extensions needs to be included in the
1399 archive for each one, doubling the overhead required for
1400 files with long names.</p>
1402 <p style="margin-left:6%; margin-top: 1em"><b>Summary of
1403 tar type codes</b> <br>
1404 The following list is a condensed summary of the type codes
1405 used in tar header records generated by different tar
1406 implementations. More details about specific implementations
1407 can be found above:</p>
1409 <p>NUL</p>
1411 <p style="margin-left:13%; margin-top: 1em">Early tar
1412 programs stored a zero byte for regular files.</p>
1414 <p>0</p>
1416 <p style="margin-left:13%; margin-top: 1em">POSIX standard
1417 type code for a regular file.</p>
1419 <p>1</p>
1421 <p style="margin-left:13%; margin-top: 1em">POSIX standard
1422 type code for a hard link description.</p>
1424 <p>2</p>
1426 <p style="margin-left:13%; margin-top: 1em">POSIX standard
1427 type code for a symbolic link description.</p>
1429 <p>3</p>
1431 <p style="margin-left:13%; margin-top: 1em">POSIX standard
1432 type code for a character device node.</p>
1434 <p>4</p>
1436 <p style="margin-left:13%; margin-top: 1em">POSIX standard
1437 type code for a block device node.</p>
1439 <p>5</p>
1441 <p style="margin-left:13%; margin-top: 1em">POSIX standard
1442 type code for a directory.</p>
1444 <p>6</p>
1446 <p style="margin-left:13%; margin-top: 1em">POSIX standard
1447 type code for a FIFO.</p>
1449 <p>7</p>
1451 <p style="margin-left:13%; margin-top: 1em">POSIX reserved.</p>
1454 <p>7</p>
1456 <p style="margin-left:13%; margin-top: 1em">GNU tar used
1457 for pre-allocated files on some systems.</p>
1459 <p>A</p>
1461 <p style="margin-left:13%; margin-top: 1em">Solaris tar ACL
1462 description stored prior to a regular file header.</p>
1464 <p>A</p>
1466 <p style="margin-left:13%; margin-top: 1em">AIX tar ACL
1467 description stored after the file body.</p>
1469 <p>D</p>
1471 <p style="margin-left:13%; margin-top: 1em">GNU tar directory dump.</p>
1474 <p>K</p>
1476 <p style="margin-left:13%; margin-top: 1em">GNU tar long
1477 linkname for the following header.</p>
1479 <p>L</p>
1481 <p style="margin-left:13%; margin-top: 1em">GNU tar long
1482 pathname for the following header.</p>
1484 <p>M</p>
1486 <p style="margin-left:13%; margin-top: 1em">GNU tar
1487 multivolume marker, indicating the file is a continuation of
1488 a file from the previous volume.</p>
1490 <p>N</p>
1492 <p style="margin-left:13%; margin-top: 1em">GNU tar long
1493 filename support. Deprecated.</p>
1495 <p>S</p>
1497 <p style="margin-left:13%; margin-top: 1em">GNU tar sparse regular file.</p>
1500 <p>V</p>
1502 <p style="margin-left:13%; margin-top: 1em">GNU tar
1503 tape/volume header name.</p>
1505 <p>X</p>
1507 <p style="margin-left:13%; margin-top: 1em">Solaris tar
1508 general-purpose extension header.</p>
1510 <p>g</p>
1512 <p style="margin-left:13%; margin-top: 1em">POSIX pax
1513 interchange format global extensions.</p>
1515 <p>x</p>
1517 <p style="margin-left:13%; margin-top: 1em">POSIX pax
1518 interchange format per-file extensions.</p>
1520 <p style="margin-top: 1em"><b>SEE ALSO</b></p>
1522 <p style="margin-left:6%;">ar(1), pax(1), tar(1)</p>
1524 <p style="margin-top: 1em"><b>STANDARDS</b></p>
1526 <p style="margin-left:6%;">The <b>tar</b> utility is no
1527 longer a part of POSIX or the Single UNIX Specification. It last
1528 appeared in Version 2 of the Single UNIX Specification
1529 (’’SUSv2’’). It has been supplanted
1530 in subsequent standards by pax(1). The ustar format is
1531 currently part of the specification for the pax(1) utility.
1532 The pax interchange file format is new with IEEE Std
1533 1003.1-2001 (’’POSIX.1’’).</p>
1535 <p style="margin-top: 1em"><b>HISTORY</b></p>
1537 <p style="margin-left:6%;">A <b>tar</b> command appeared in
1538 Seventh Edition Unix, which was released in January, 1979.
1539 It replaced the <b>tp</b> program from Fourth Edition Unix,
1540 which in turn replaced the <b>tap</b> program from First
1541 Edition Unix. John Gilmore’s <b>pdtar</b>
1542 public-domain implementation (circa 1987) was highly
1543 influential and formed the basis of <b>GNU tar</b> (circa
1544 1988). Joerg Schilling’s <b>star</b> archiver is
1545 another open-source (CDDL) archiver (originally developed
1546 circa 1985) which features complete support for pax
1547 interchange format.</p>
1549 <p style="margin-left:6%; margin-top: 1em">This
1550 documentation was written as part of the <b>libarchive</b>
1551 and <b>bsdtar</b> project by Tim Kientzle
1552 &lt;kientzle@FreeBSD.org&gt;.</p>
1554 <p style="margin-left:6%; margin-top: 1em">BSD
1555 December 23, 2011 BSD</p>