doc: describe how kernel inotify support affects tail -f

diff --git a/doc/coreutils.texi b/doc/coreutils.texi

index 8a1b3b6..dfaf4c9 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -217,6 +217,7 @@ Common Options
  * Exit status::                  Indicating program success or failure
  * Backup options::               Backup options
  * Block size::                   Block size
+* Floating point::               Floating point number representation
  * Signal specifications::        Specifying signals
  * Disambiguating names and IDs:: chgrp and chown owner and group syntax
  * Random sources::               Sources of random data
@@ -624,7 +625,7 @@ Remove any trailing slashes from each @var{source} argument.
  @macro mayConflictWithShellBuiltIn{cmd}
  @cindex conflicts with shell built-ins
  @cindex built-in shell commands, conflicts with
-Due to shell aliases and built-in @command{\cmd\} command, using an
+Due to shell aliases and built-in @command{\cmd\} functions, using an
  unadorned @command{\cmd\} interactively or in a script may get you
  different functionality than that described here.  Invoke it via
  @command{env} (i.e., @code{env \cmd\ @dots{}}) to avoid interference
@@ -729,6 +730,7 @@ name.
  * Exit status::                 Indicating program success or failure.
  * Backup options::              -b -S, in some programs.
  * Block size::                  BLOCK_SIZE and --block-size, in some programs.
+* Floating point::              Floating point number representation.
  * Signal specifications::       Specifying signals using the --signal option.
  * Disambiguating names and IDs:: chgrp and chown owner and group syntax
  * Random sources::              --random-source, in some programs.
@@ -1011,6 +1013,34 @@ set.  The @option{-h} or @option{--human-readable} option is equivalent to
  @option{--block-size=human-readable}.  The @option{--si} option is
  equivalent to @option{--block-size=si}.
  
+@node Floating point
+@section Floating point numbers
+@cindex floating point
+@cindex IEEE floating point
+
+Commands that accept or produce floating point numbers employ the
+floating point representation of the underlying system, and suffer
+from rounding error, overflow, and similar floating-point issues.
+Almost all modern systems use IEEE-754 floating point, and it is
+typically portable to assume IEEE-754 behavior these days.  IEEE-754
+has positive and negative infinity, distinguishes positive from
+negative zero, and uses special values called NaNs to represent
+invalid computations such as dividing zero by itself.  For more
+information, please see David Goldberg's paper
+@uref{http://@/www.validlab.com/@/goldberg/@/paper.pdf, What Every
+Computer Scientist Should Know About Floating-Point Arithmetic}.
+
+@vindex LC_NUMERIC
+Commands that accept floating point numbers as options, operands or
+input use the standard C functions @code{strtod} and @code{strtold} to
+convert from text to floating point numbers.  These floating point
+numbers therefore can use scientific notation like @code{1.0e-34} and
+@code{-10e100}.  Modern C implementations also accept hexadecimal
+floating point numbers such as @code{-0x.ep-3}, which stands for
+@minus{}14/16 times @math{2^-3}, which equals @minus{}0.109375.  The
+@env{LC_NUMERIC} locale determines the decimal-point character.
+@xref{Parsing of Floats,,, libc, The GNU C Library Reference Manual}.
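
The hexadecimal example in the paragraph above can be checked from a shell. This is only a sketch: it assumes that `env printf` resolves to GNU printf (so that no shell built-in interferes) and it pins the locale to C so that the decimal point is a period. The variable name `hex_value` is purely illustrative.

```shell
# Convert the manual's hexadecimal example to decimal.
# Assumption: `env printf` runs GNU printf, which parses numeric
# arguments with the C library; LC_ALL=C fixes the decimal point to '.'.
hex_value=$(LC_ALL=C env printf '%g' -0x.ep-3)
echo "$hex_value"
```

Going through `env` follows the manual's own advice for bypassing shell built-ins, which may parse numbers differently.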
+
  @node Signal specifications
  @section Signal specifications
  @cindex signals, specifying
@@ -1425,10 +1455,11 @@ The @sc{gnu} utilities normally conform to the version of @acronym{POSIX}
  that is standard for your system.  To cause them to conform to a
  different version of @acronym{POSIX}, define the @env{_POSIX2_VERSION}
  environment variable to a value of the form @var{yyyymm} specifying
-the year and month the standard was adopted.  Two values are currently
+the year and month the standard was adopted.  Three values are currently
  supported for @env{_POSIX2_VERSION}: @samp{199209} stands for
-@acronym{POSIX} 1003.2-1992, and @samp{200112} stands for @acronym{POSIX}
-1003.1-2001.  For example, if you have a newer system but are running software
+@acronym{POSIX} 1003.2-1992, @samp{200112} stands for @acronym{POSIX}
+1003.1-2001, and @samp{200809} stands for @acronym{POSIX} 1003.1-2008.
+For example, if you have a newer system but are running software
  that assumes an older version of @acronym{POSIX} and uses @samp{sort +1}
  or @samp{tail +10}, you can work around any compatibility problems by setting
  @samp{_POSIX2_VERSION=199209} in your environment.
@@ -1852,7 +1883,7 @@ Output at most @var{bytes} bytes of the input.  Prefixes and suffixes on
  Instead of the normal output, output only @dfn{string constants}: at
  least @var{bytes} consecutive @acronym{ASCII} graphic characters,
  followed by a zero byte (@acronym{ASCII} @sc{nul}).
-Prefixes and suffixes on @code{bytes} are interpreted as for the
+Prefixes and suffixes on @var{bytes} are interpreted as for the
  @option{-j} option.
  
  If @var{n} is omitted with @option{--strings}, the default is 3.
@@ -1880,7 +1911,7 @@ named character, ignoring high-order bit
  @item d
  signed decimal
  @item f
-floating point
+floating point (@pxref{Floating point})
  @item o
  octal
  @item u
@@ -2799,6 +2830,11 @@ no @var{file} operand is specified and standard input is a FIFO or a pipe.
  Likewise, the @option{-f} option has no effect for any
  operand specified as @samp{-}, when standard input is a FIFO or a pipe.
  
+With kernel inotify support, output is asynchronous and generally very prompt.
+Otherwise, @command{tail} sleeps for one second between checks---
+use @option{--sleep-interval=@var{N}} to change that default---which can
+make the output appear slightly less responsive or bursty.
+
  @item -F
  @opindex -F
  This option is the same as @option{--follow=name --retry}.  That is, tail
@@ -2820,9 +2856,11 @@ During one iteration, every specified file is checked to see if it has
  changed size.
  Historical implementations of @command{tail} have required that
  @var{number} be an integer.  However, GNU @command{tail} accepts
-an arbitrary floating point number (using a period before any
-fractional digits).
-When @command{tail} uses inotify, this polling-related option is ignored.
+an arbitrary floating point number.  @xref{Floating point}.
+When @command{tail} uses inotify, this polling-related option
+is usually ignored.  However, if you also specify @option{--pid=@var{p}},
+@command{tail} checks whether process @var{p} is alive at least
+every @var{number} seconds.
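
A small sketch of how `--pid` and `--sleep-interval` (`-s`) combine, assuming GNU tail and a `sleep` that accepts fractional seconds. The writer loop and the names `log`, `writer` and `followed` are hypothetical and exist only for illustration.

```shell
log=$(mktemp)
# Hypothetical writer: appends three lines to the log, then exits.
( for i in 1 2 3; do echo "line $i"; sleep 0.2; done >> "$log" ) &
writer=$!
# Follow the log.  --pid makes tail exit once the writer has died, and
# -s 0.5 bounds how long tail may wait between liveness checks, even
# when inotify is delivering the file changes.
followed=$(tail -f --pid="$writer" -s 0.5 "$log")
rm -f "$log"
echo "$followed"
```

Without `--pid`, the `tail -f` above would keep running after the writer exits and would have to be interrupted by hand.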
  
  @itemx --pid=@var{pid}
  @opindex --pid
@@ -2959,8 +2997,8 @@ The program accepts the following options.  Also see @ref{Common options}.
  Put @var{lines} lines of @var{input} into each output file.
  
  For compatibility @command{split} also supports an obsolete
-option syntax @option{-@var{lines}}.  New scripts should use @option{-l
-@var{lines}} instead.
+option syntax @option{-@var{lines}}.  New scripts should use
+@option{-l @var{lines}} instead.
  
  @item -b @var{size}
  @itemx --bytes=@var{size}
@@ -2978,6 +3016,25 @@ possible without exceeding @var{size} bytes.  Individual lines longer than
  @var{size} bytes are broken into multiple files.
  @var{size} has the same format as for the @option{--bytes} option.
  
+@item --filter=@var{command}
+@opindex --filter
+With this option, rather than simply writing to each output file,
+write through a pipe to the specified shell @var{command} for each output file.
+@var{command} should use the $FILE environment variable, which is set
+to a different output file name for each invocation of the command.
+For example, imagine that you have a 1TiB compressed file
+that, if uncompressed, would be too large to reside on disk,
+yet you must split it into individually-compressed pieces
+of a more manageable size.
+To do that, you might run this command:
+
+@example
+xz -dc BIG.xz | split -b200G --filter='xz > $FILE.xz' - big-
+@end example
+
+Assuming a 10:1 compression ratio, that would create about fifty 20GiB files
+with names @file{big-aa.xz}, @file{big-ab.xz}, @file{big-ac.xz}, etc.
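
The same mechanism can be tried at a much smaller scale. The following sketch assumes a GNU split recent enough to support `--filter` and uses `gzip` in place of `xz`. The `part-` prefix and the variable names are illustrative only.

```shell
workdir=$(mktemp -d)
# Split six input lines into 2-line pieces.  Each piece is piped through
# gzip, and $FILE (set by split for each invocation) names that piece.
# The single quotes keep the invoking shell from expanding $FILE itself.
seq 6 | (cd "$workdir" && split -l 2 --filter='gzip > $FILE.gz' - part-)
pieces=$(cd "$workdir" && ls)
second=$(gzip -dc "$workdir/part-ab.gz")
rm -rf "$workdir"
echo "$pieces"
```

Each output name is the prefix followed by a generated suffix, so the second piece here is read back from `part-ab.gz`.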
+
  @item -n @var{chunks}
  @itemx --number=@var{chunks}
  @opindex -n
@@ -3883,11 +3940,8 @@ the final result, after the throwing away.))
  @opindex --sort
  @cindex general numeric sort
  @vindex LC_NUMERIC
-Sort numerically, using the standard C function @code{strtold} to convert
-a prefix of each line to a long double-precision floating point number.
-This allows floating point numbers to be specified in scientific notation,
-like @code{1.0e-34} and @code{10e100}.
-The @env{LC_NUMERIC} locale determines the decimal-point character.
+Sort numerically, converting a prefix of each line to a long
+double-precision floating point number.  @xref{Floating point}.
  Do not report overflow, underflow, or conversion errors.
  Use the following collating sequence:
  
@@ -4761,11 +4815,17 @@ If there is an error it exits with nonzero status.
  @macro checkOrderOption{cmd}
  If the @option{--check-order} option is given, unsorted inputs will
  cause a fatal error message.  If the option @option{--nocheck-order}
-is given, unsorted inputs will never cause an error message.  If
-neither of these options is given, wrongly sorted inputs are diagnosed
-only if an input file is found to contain unpairable lines.  If an
-input file is diagnosed as being unsorted, the @command{\cmd\} command
-will exit with a nonzero status (and the output should not be used).
+is given, unsorted inputs will never cause an error message.  If neither
+of these options is given, wrongly sorted inputs are diagnosed
+only if an input file is found to contain unpairable
+@ifset JOIN_COMMAND
+lines, and when both input files are nonempty.
+@end ifset
+@ifclear JOIN_COMMAND
+lines.
+@end ifclear
+If an input file is diagnosed as being unsorted, the @command{\cmd\}
+command will exit with a nonzero status (and the output should not be used).
  
  Forcing @command{\cmd\} to process wrongly sorted input files
  containing unpairable lines by specifying @option{--nocheck-order} is
@@ -5474,11 +5534,26 @@ Select for printing only the fields listed in @var{field-list}.
  Fields are separated by a TAB character by default.  Also print any
  line that contains no delimiter character, unless the
  @option{--only-delimited} (@option{-s}) option is specified.
-Note @command{cut} does not support specifying runs of whitespace as a
-delimiter, so to achieve that common functionality one can pre-process
-with @command{tr} like:
+
+Note @command{awk} supports more sophisticated field processing,
+and by default will use (and discard) runs of blank characters to
+separate fields, and ignore leading and trailing blanks.
  @example
-tr -s '[:blank:]' '\t' | cut -f@dots{}
+@verbatim
+awk '{print $2}'    # print the second field
+awk '{print $(NF-1)}' # print the penultimate field
+awk '{print $2,$1}' # reorder the first two fields
+@end verbatim
+@end example
+
+In the unlikely event that @command{awk} is unavailable,
+one can use the @command{join} command, to process blank
+characters as @command{awk} does above.
+@example
+@verbatim
+join -a1 -o 1.2     - /dev/null # print the second field
+join -a1 -o 1.2,1.1 - /dev/null # reorder the first two fields
+@end verbatim
  @end example
  
  @item -d @var{input_delim_byte}
@@ -5646,7 +5721,9 @@ c c1 c2
  b b1 b2
  @end example
  
+@set JOIN_COMMAND
  @checkOrderOption{join}
+@clear JOIN_COMMAND
  
  The defaults are:
  @itemize
@@ -5675,8 +5752,8 @@ Do not check that both input files are in sorted order.  This is the default.
  
  @item -e @var{string}
  @opindex -e
-Replace those output fields that are missing in the input with
-@var{string}.
+Replace those output fields that are missing in the input with @var{string}.
+I.e., missing fields specified with the @option{-12jo} options.
  
  @item --header
  @opindex --header
@@ -5707,10 +5784,17 @@ Join on field @var{field} (a positive integer) of file 2.
  Equivalent to @option{-1 @var{field} -2 @var{field}}.
  
  @item -o @var{field-list}
-Construct each output line according to the format in @var{field-list}.
-Each element in @var{field-list} is either the single character @samp{0} or
-has the form @var{m.n} where the file number, @var{m}, is @samp{1} or
-@samp{2} and @var{n} is a positive field number.
+@itemx -o auto
+If the keyword @samp{auto} is specified, infer the output format from
+the first line in each file.  This is the same as the default output format
+but also ensures the same number of fields are output for each line.
+Missing fields are replaced with the string given by the @option{-e} option,
+and extra fields are discarded.
+
+Otherwise, construct each output line according to the format in
+@var{field-list}.  Each element in @var{field-list} is either the single
+character @samp{0} or has the form @var{m.n} where the file number, @var{m},
+is @samp{1} or @samp{2} and @var{n} is a positive field number.
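
To illustrate the `auto` keyword, here is a minimal sketch assuming a GNU join that supports `-o auto`. The two input files and their contents are made up for the example. The second line of each file deliberately has a different field count from the first.

```shell
workdir=$(mktemp -d)
# First lines fix the format: 3 fields from the left file, 2 from the right.
printf '%s\n' 'a 1 x' 'b 2'        > "$workdir/left"
printf '%s\n' 'a 10'  'b 20 extra' > "$workdir/right"
# -e - supplies the filler string for fields missing from a line;
# fields beyond the inferred count (here "extra") are dropped.
joined=$(LC_ALL=C join -o auto -e - "$workdir/left" "$workdir/right")
rm -rf "$workdir"
echo "$joined"
```

Every output line then carries the same number of fields, which makes the result easier to feed to column-oriented tools.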
  
  A field specification of @samp{0} denotes the join field.
  In most cases, the functionality of the @samp{0} field spec
@@ -6963,6 +7047,23 @@ Piping a colorized listing through a pager like @command{more} or
  @command{less} usually produces unreadable results.  However, using
  @code{more -f} does seem to work.
  
+@vindex LS_COLORS
+@vindex SHELL @r{environment variable, and color}
+Note that using the @option{--color} option may incur a noticeable
+performance penalty when run in a directory with very many entries,
+because the default settings require that @command{ls} @code{stat} every
+single file it lists.
+However, if you would like most of the file-type coloring
+but can live without the other coloring options (e.g.,
+executable, orphan, sticky, other-writable, capability), use
+@command{dircolors} to set the @env{LS_COLORS} environment variable like this,
+@example
+eval $(dircolors -p | perl -pe \
+  's/^((CAP|S[ET]|O[TR]|M|E)\w+).*/$1 00/' | dircolors -)
+@end example
+and on a @code{dirent.d_type}-capable file system, @command{ls}
+will perform only one @code{stat} call per command line argument.
+
  @item -F
  @itemx --classify
  @itemx --indicator-style=classify
@@ -7915,8 +8016,8 @@ Set both input and output block sizes to @var{bytes}.
  This makes @command{dd} read and write @var{bytes} per block,
  overriding any @samp{ibs} and @samp{obs} settings.
  In addition, if no data-transforming @option{conv} option is specified,
-each input block is copied to the output as a single block,
-without aggregating short reads.
+input is copied to the output as soon as it's read,
+even if it is smaller than the block size.
  
  @item cbs=@var{bytes}
  @opindex cbs
@@ -8006,22 +8107,29 @@ Swap every pair of input bytes.  @sc{gnu} @command{dd}, unlike others, works
  when an odd number of bytes are read---the last byte is simply copied
  (since there is nothing to swap it with).
  
-@item noerror
-@opindex noerror
-@cindex read errors, ignoring
-Continue after read errors.
+@item sync
+@opindex sync @r{(padding with @acronym{ASCII} @sc{nul}s)}
+Pad every input block to size of @samp{ibs} with trailing zero bytes.
+When used with @samp{block} or @samp{unblock}, pad with spaces instead of
+zero bytes.
  
-@item nocreat
-@opindex nocreat
-@cindex creating output file, avoiding
-Do not create the output file; the output file must already exist.
+@end table
  
+The following ``conversions'' are really file flags
+and don't affect internal processing:
+
+@table @samp
  @item excl
  @opindex excl
  @cindex creating output file, requiring
  Fail if the output file already exists; @command{dd} must create the
  output file itself.
  
+@item nocreat
+@opindex nocreat
+@cindex creating output file, avoiding
+Do not create the output file; the output file must already exist.
+
  The @samp{excl} and @samp{nocreat} conversions are mutually exclusive.
  
  @item notrunc
@@ -8029,11 +8137,10 @@ The @samp{excl} and @samp{nocreat} conversions are mutually exclusive.
  @cindex truncating output file, avoiding
  Do not truncate the output file.
  
-@item sync
-@opindex sync @r{(padding with @acronym{ASCII} @sc{nul}s)}
-Pad every input block to size of @samp{ibs} with trailing zero bytes.
-When used with @samp{block} or @samp{unblock}, pad with spaces instead of
-zero bytes.
+@item noerror
+@opindex noerror
+@cindex read errors, ignoring
+Continue after read errors.
  
  @item fdatasync
  @opindex fdatasync
@@ -8112,6 +8219,31 @@ last-access and last-modified time) is not necessarily synchronized.
  @cindex synchronized data and metadata I/O
  Use synchronized I/O for both data and metadata.
  
+@item nocache
+@opindex nocache
+@cindex discarding file cache
+Discard the data cache for a file.
+When count=0 all cache is discarded,
+otherwise the cache is dropped for the processed
+portion of the file.  Also when count=0
+failure to discard the cache is diagnosed
+and reflected in the exit status.
+Here are some usage examples:
+
+@example
+# Advise to drop cache for whole file
+dd if=ifile iflag=nocache count=0
+
+# Ensure drop cache for the whole file
+dd of=ofile oflag=nocache conv=notrunc,fdatasync count=0
+
+# Drop cache for part of file
+dd if=ifile iflag=nocache skip=10 count=10 of=/dev/null
+
+# Stream data using just the read-ahead cache
+dd if=ifile of=ofile iflag=nocache oflag=nocache
+@end example
+
  @item nonblock
  @opindex nonblock
  @cindex nonblocking I/O
@@ -11209,6 +11341,7 @@ digits, but is printed according to the @env{LC_NUMERIC} category of the
  current locale.  For example, in a locale whose radix character is a
  comma, the command @samp{printf %g 3.14} outputs @samp{3,14} whereas
  the command @samp{printf %g 3,14} is an error.
+@xref{Floating point}.
  
  @kindex \@var{ooo}
  @kindex \x@var{hh}
@@ -11447,7 +11580,7 @@ Exit status:
  * File type tests::             -[bcdfhLpSt]
  * Access permission tests::     -[gkruwxOG]
  * File characteristic tests::   -e -s -nt -ot -ef
-* String tests::                -z -n = !=
+* String tests::                -z -n = == !=
  * Numeric tests::               -eq -ne -lt -le -gt -ge
  * Connectives for test::        ! -a -o
  @end menu
@@ -11638,6 +11771,11 @@ True if the length of @var{string} is nonzero.
  @cindex equal string check
  True if the strings are equal.
  
+@item @var{string1} == @var{string2}
+@opindex ==
+@cindex equal string check
+True if the strings are equal (synonym for =).
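
A quick check of the `==` synonym, sketched under the assumption that `env test` runs the coreutils `test` program rather than a shell built-in (some built-ins do not accept `==`). The variable names are illustrative.

```shell
# Compare strings with == through the external test program.
if env test abc == abc; then eq_result=equal; else eq_result=different; fi
if env test abc == abd; then ne_result=equal; else ne_result=different; fi
echo "$eq_result $ne_result"
```

Portable scripts may still prefer `=`, which has long been the standard string comparison operator for `test`.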
+
  @item @var{string1} != @var{string2}
  @opindex !=
  @cindex not-equal string check
@@ -15664,8 +15802,7 @@ days
  Historical implementations of @command{sleep} have required that
  @var{number} be an integer, and only accepted a single argument
  without a suffix.  However, GNU @command{sleep} accepts
-arbitrary floating point numbers (using a period before any fractional
-digits).
+arbitrary floating point numbers.  @xref{Floating point}.
  
  The only options are @option{--help} and @option{--version}.  @xref{Common
  options}.
@@ -15765,8 +15902,7 @@ When @var{increment} is not specified, it defaults to @samp{1},
  even when @var{first} is larger than @var{last}.
  @var{first} also defaults to @samp{1}.  So @code{seq 1} prints
  @samp{1}, but @code{seq 0} and @code{seq 10 5} produce no output.
-Floating-point numbers
-may be specified (using a period before any fractional digits).
+Floating-point numbers may be specified.  @xref{Floating point}.
  
  The program accepts the following options.  Also see @ref{Common options}.
  Options must precede operands.
@@ -15843,7 +15979,8 @@ of @code{%x}.
  
  On most systems, seq can produce whole-number output for values up to
  at least @math{2^{53}}.  Larger integers are approximated.  The details
-differ depending on your floating-point implementation, but a common
+differ depending on your floating-point implementation.
+@xref{Floating point}.  A common
  case is that @command{seq} works with integers through @math{2^{64}},
  and larger integers may not be numerically correct: