doc: document split's new --filter=CMD option

author Jim Meyering <meyering@redhat.com>

Sat, 30 Apr 2011 07:52:20 +0000 (09:52 +0200)

committer Jim Meyering <meyering@redhat.com>

Fri, 6 May 2011 20:54:51 +0000 (22:54 +0200)
diff --git a/NEWS b/NEWS
index c90e02f..82ce53c 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,16 @@ GNU coreutils NEWS                                    -*- outline -*-
  
  * Noteworthy changes in release ?.? (????-??-??) [?]
  
+** New features
+
+  split accepts a new --filter=CMD option.  With it, split filters output
+  through CMD.  CMD may use the $FILE environment variable, which is set to
+  the nominal output file name for each invocation of CMD.  For example, to
+  split a file into 3 approximately equal parts, which are then compressed:
+    split -n3 --filter='xz > $FILE.xz' big
+  That creates files named xaa.xz, xab.xz and xac.xz.
+  Note the use of single quotes, not double quotes.
+
  
  * Noteworthy changes in release 8.12 (2011-04-26) [stable]
  
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index d2377f4..457ecab 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -2992,8 +2992,8 @@ The program accepts the following options.  Also see @ref{Common options}.
  Put @var{lines} lines of @var{input} into each output file.
  
  For compatibility @command{split} also supports an obsolete
-option syntax @option{-@var{lines}}.  New scripts should use @option{-l
-@var{lines}} instead.
+option syntax @option{-@var{lines}}.  New scripts should use
+@option{-l @var{lines}} instead.
  
  @item -b @var{size}
  @itemx --bytes=@var{size}
@@ -3011,6 +3011,25 @@ possible without exceeding @var{size} bytes.  Individual lines longer than
  @var{size} bytes are broken into multiple files.
  @var{size} has the same format as for the @option{--bytes} option.
  
+@item --filter=@var{command}
+@opindex --filter
+With this option, rather than simply writing to each output file,
+write through a pipe to the specified shell @var{command} for each output file.
+@var{command} should use the $FILE environment variable, which is set
+to a different output file name for each invocation of the command.
+For example, imagine that you have a 1TiB compressed file
+that, if uncompressed, would be too large to reside on disk,
+yet you must split it into individually-compressed pieces
+of a more manageable size.
+To do that, you might run this command:
+
+@example
+xz -dc BIG.xz | split -b200G --filter='xz > $FILE.xz' - big-
+@end example
+
+Assuming a 10:1 compression ratio, that would create about fifty 20GiB files
+with names @file{big-aa.xz}, @file{big-ab.xz}, @file{big-ac.xz}, etc.
+
  @item -n @var{chunks}
  @itemx --number=@var{chunks}
  @opindex -n
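
The NEWS example can be tried on a small scale with a sketch like the following. It is an illustration only, not part of the patch: it assumes a GNU split that already includes this change, swaps gzip in for xz purely because gzip is more widely installed, and uses a throwaway input file named big generated with seq.

```shell
#!/bin/sh
# Sketch: exercise split --filter on a small throwaway file.
# Assumptions: GNU split with --filter support, plus gzip, seq and cmp on PATH.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample input: 9000 short lines.
seq 1 9000 > big

# Single quotes keep $FILE literal here, so the shell that split starts
# for each output file expands it to that file's nominal name.
split -n3 --filter='gzip > $FILE.gz' big

# No prefix was given, so the names start with the default "x".
ls x*.gz

# Round-trip check: the decompressed pieces, concatenated in name order,
# must be byte-for-byte identical to the input.
gzip -dc xaa.gz xab.gz xac.gz | cmp - big && echo "round trip OK"
```

With double quotes, the invoking shell would expand $FILE before split ever runs; under set -u that aborts the script, and without it every piece would be written to the same file named .gz.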