\input texinfo @c -*- texinfo -*-
@comment ========================================================
@comment %**start of header
@settitle GNU M4 @value{VERSION} macro processor
@setcontentsaftertitlepage

@c The testsuite expects literal tab output in some examples, but
@c literal tabs in texinfo lead to formatting issues.

@c -------------------
@c The ARG is an optional argument. To be used for macro arguments in
@c their documentation (@defmac).
@macro ovar{varname}
@r{[}@var{\varname\}@r{]}@c
@end macro

@c @dvar{ARG, DEFAULT}
@c -------------------
@c The ARG is an optional argument, defaulting to DEFAULT. To be used
@c for macro arguments in their documentation (@defmac).
@macro dvar{varname, default}
@r{[}@var{\varname\} = @samp{\default\}@r{]}@c
@end macro

@comment %**end of header
@comment ========================================================

This manual (@value{UPDATED}) is for GNU M4 (version
@value{VERSION}), a package containing an implementation of the m4 macro
language.

Copyright @copyright{} 1989-1994, 2004-2013 Free Software Foundation,
Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.3 or any later version published by the Free Software
Foundation; with no Invariant Sections, no Front-Cover Texts, and no
Back-Cover Texts. A copy of the license is included in the section
entitled ``GNU Free Documentation License.''

@dircategory Text creation and manipulation
* M4: (m4). A powerful macro processor.

@title GNU M4, version @value{VERSION}
@subtitle A powerful macro processor
@subtitle Edition @value{EDITION}, @value{UPDATED}
@author by Ren@'e Seindal, Fran@,{c}ois Pinard,
@author Gary V. Vaughan, and Eric Blake
@author (@email{bug-m4@@gnu.org})

@vskip 0pt plus 1filll

GNU @code{m4} is an implementation of the traditional UNIX macro
processor. It is mostly SVR4 compatible, although it has some
extensions (for example, handling more than 9 positional parameters
to macros). @code{m4} also has builtin functions for including
files, running shell commands, doing arithmetic, etc. Autoconf needs
GNU @code{m4} for generating @file{configure} scripts, but not for
running them.

GNU @code{m4} was originally written by Ren@'e Seindal, with
subsequent changes by Fran@,{c}ois Pinard and other volunteers
on the Internet. All names and email addresses can be found in the
files @file{m4-@value{VERSION}/@/AUTHORS} and
@file{m4-@value{VERSION}/@/THANKS} from the GNU M4
distribution.

This is release @value{VERSION}. It is now considered stable: future
releases in the 1.4.x series are only meant to fix bugs, increase speed,
or improve documentation. However@dots{}

An experimental feature, which would improve the usefulness of @code{m4},
allows for changing the syntax for what is a @dfn{word} in @code{m4}.
You should use:

@example
./configure --enable-changeword
@end example

@noindent
if you want this feature compiled in. The current implementation
slows down @code{m4} considerably and is hardly acceptable. In the
future, @code{m4} 2.0 will come with a different set of new features
that provide similar capabilities, but without the inefficiencies, so
changeword will go away and @emph{you should not count on it}.

* Preliminaries:: Introduction and preliminaries
* Invoking m4:: Invoking @code{m4}
* Syntax:: Lexical and syntactic conventions

* Macros:: How to invoke macros
* Definitions:: How to define new macros
* Conditionals:: Conditionals, loops, and recursion

* Debugging:: How to debug macros and input

* Input Control:: Input control
* File Inclusion:: File inclusion
* Diversions:: Diverting and undiverting output

* Text handling:: Macros for text handling
* Arithmetic:: Macros for doing arithmetic
* Shell commands:: Macros for running shell commands
* Miscellaneous:: Miscellaneous builtin macros
* Frozen files:: Fast loading of frozen state

* Compatibility:: Compatibility with other versions of @code{m4}
* Answers:: Correct version of some examples

* Copying This Package:: How to make copies of the overall M4 package
* Copying This Manual:: How to make copies of this manual
* Indices:: Indices of concepts and macros

--- The Detailed Node Listing ---

Introduction and preliminaries

* Intro:: Introduction to @code{m4}
* History:: Historical references
* Bugs:: Problems and bugs
* Manual:: Using this manual

Invoking @code{m4}

* Operation modes:: Command line options for operation modes
* Preprocessor features:: Command line options for preprocessor features
* Limits control:: Command line options for limits control
* Frozen state:: Command line options for frozen state
* Debugging options:: Command line options for debugging
* Command line files:: Specifying input files on the command line

Lexical and syntactic conventions

* Names:: Macro names
* Quoted strings:: Quoting input to @code{m4}
* Comments:: Comments in @code{m4} input
* Other tokens:: Other kinds of input tokens
* Input processing:: How @code{m4} copies input to output

How to invoke macros

* Invocation:: Macro invocation
* Inhibiting Invocation:: Preventing macro invocation
* Macro Arguments:: Macro arguments
* Quoting Arguments:: On Quoting Arguments to macros
* Macro expansion:: Expanding macros

How to define new macros

* Define:: Defining a new macro
* Arguments:: Arguments to macros
* Pseudo Arguments:: Special arguments to macros
* Undefine:: Deleting a macro
* Defn:: Renaming macros
* Pushdef:: Temporarily redefining macros

* Indir:: Indirect call of macros
* Builtin:: Indirect call of builtins

Conditionals, loops, and recursion

* Ifdef:: Testing if a macro is defined
* Ifelse:: If-else construct, or multibranch
* Shift:: Recursion in @code{m4}
* Forloop:: Iteration by counting
* Foreach:: Iteration by list contents
* Stacks:: Working with definition stacks
* Composition:: Building macros with macros

How to debug macros and input

* Dumpdef:: Displaying macro definitions
* Trace:: Tracing macro calls
* Debug Levels:: Controlling debugging output
* Debug Output:: Saving debugging output

Input control

* Dnl:: Deleting whitespace in input
* Changequote:: Changing the quote characters
* Changecom:: Changing the comment delimiters
* Changeword:: Changing the lexical structure of words
* M4wrap:: Saving text until end of input

File inclusion

* Include:: Including named files
* Search Path:: Searching for include files

Diverting and undiverting output

* Divert:: Diverting output
* Undivert:: Undiverting output
* Divnum:: Diversion numbers
* Cleardivert:: Discarding diverted text

Macros for text handling

* Len:: Calculating length of strings
* Index macro:: Searching for substrings
* Regexp:: Searching for regular expressions
* Substr:: Extracting substrings
* Translit:: Translating characters
* Patsubst:: Substituting text by regular expression
* Format:: Formatting strings (printf-like)

Macros for doing arithmetic

* Incr:: Decrement and increment operators
* Eval:: Evaluating integer expressions

Macros for running shell commands

* Platform macros:: Determining the platform
* Syscmd:: Executing simple commands
* Esyscmd:: Reading the output of commands
* Sysval:: Exit status
* Mkstemp:: Making temporary files

Miscellaneous builtin macros

* Errprint:: Printing error messages
* Location:: Printing current location
* M4exit:: Exiting from @code{m4}

Fast loading of frozen state

* Using frozen files:: Using frozen files
* Frozen file format:: Frozen file format

Compatibility with other versions of @code{m4}

* Extensions:: Extensions in GNU M4
* Incompatibilities:: Facilities in System V m4 not in GNU M4
* Other Incompatibilities:: Other incompatibilities

Correct version of some examples

* Improved exch:: Solution for @code{exch}
* Improved forloop:: Solution for @code{forloop}
* Improved foreach:: Solution for @code{foreach}
* Improved copy:: Solution for @code{copy}
* Improved m4wrap:: Solution for @code{m4wrap}
* Improved cleardivert:: Solution for @code{cleardivert}
* Improved capitalize:: Solution for @code{capitalize}
* Improved fatal_error:: Solution for @code{fatal_error}

How to make copies of the overall M4 package

* GNU General Public License:: License for copying the M4 package

How to make copies of this manual

* GNU Free Documentation License:: License for copying this manual

Indices of concepts and macros

* Macro index:: Index for all @code{m4} macros
* Concept index:: Index for many concepts

@chapter Introduction and preliminaries

This first chapter explains what GNU @code{m4} is, where @code{m4}
comes from, how to read and use this documentation, how to call the
@code{m4} program, and how to report bugs about it. It concludes by
giving tips for reading the remainder of the manual.

The following chapters then detail all the features of the @code{m4}
language.

* Intro:: Introduction to @code{m4}
* History:: Historical references
* Bugs:: Problems and bugs
* Manual:: Using this manual

@section Introduction to @code{m4}

@cindex overview of @code{m4}
@code{m4} is a macro processor, in the sense that it copies its
input to the output, expanding macros as it goes. Macros are either
builtin or user-defined, and can take any number of arguments.
Besides just doing macro expansion, @code{m4} has builtin functions
for including named files, running shell commands, doing integer
arithmetic, manipulating text in various ways, performing recursion,
etc. @code{m4} can be used either as a front-end to a compiler,
or as a macro processor in its own right.

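
As a minimal sketch of this behavior (the macro name @code{greeting} is
invented for illustration and is not a builtin), the following input
defines a macro taking one argument and then uses it. Text that contains
no macro names is copied to the output unchanged. The
@samp{@result{}} prefix marks output lines, a notation explained later
in this chapter.

@example
define(`greeting', `Hello, $1!')
@result{}
greeting(`world')
@result{}Hello, world!
Plain text passes through untouched.
@result{}Plain text passes through untouched.
@end example

The empty output line after the @code{define} call is the newline that
followed the call in the input; the call itself expands to nothing.
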
The @code{m4} macro processor is widely available on all UNIXes, and has
been standardized by POSIX.
Usually, only a small percentage of users are aware of its existence.
However, those who find it often become committed users. The
popularity of GNU Autoconf, which requires GNU
@code{m4} for @emph{generating} @file{configure} scripts, is an incentive
for many to install it, even though these people will not themselves
program in @code{m4}. GNU @code{m4} is mostly compatible with the
System V, Release 4 version, except for some minor differences.
@xref{Compatibility}, for more details.

Some people find @code{m4} to be fairly addictive. They first use
@code{m4} for simple problems, then take bigger and bigger challenges,
learning how to write complex sets of @code{m4} macros along the way.
Once really addicted, users go on to write sophisticated @code{m4}
applications even to solve simple problems, devoting more time to
debugging their @code{m4} scripts than to doing real work. Beware that
@code{m4} may be dangerous for the health of compulsive programmers.

@section Historical references

@cindex history of @code{m4}
@cindex GNU M4, history of
Macro languages were invented early in the history of computing. In the
1950s Alan Perlis suggested that the macro language be independent of the
language being processed. Techniques such as conditional and recursive
macros, and using macros to define other macros, were described by Doug
McIlroy of Bell Labs in ``Macro Instruction Extensions of Compiler
Languages'', @emph{Communications of the ACM} 3, 4 (1960), 214--20,
@url{http://dx.doi.org/10.1145/367177.367223}.

An important precursor of @code{m4} was GPM; see C. Strachey,
@c The title uses lower case and has no space between "macro" and "generator".
``A general purpose macrogenerator'', @emph{Computer Journal} 8, 3
(1965), 225--41, @url{http://dx.doi.org/10.1093/comjnl/8.3.225}. GPM is
also succinctly described in David Gries's book @emph{Compiler
Construction for Digital Computers}, Wiley (1971). Strachey was a
brilliant programmer: GPM fit into 250 machine instructions!

Inspired by GPM while visiting Strachey's Lab in 1968, McIlroy wrote a
model preprocessor that fit into a page of Snobol 3 code, and McIlroy
and Robert Morris developed a series of further models at Bell Labs.
Andrew D. Hall followed up with M6, a general purpose macro processor
used to port the Fortran source code of the Altran computer algebra
system; see Hall's ``The M6 Macro Processor'', Computing Science
Technical Report #2, Bell Labs (1972),
@url{http://cm.bell-labs.com/cm/cs/cstr/2.pdf}. M6's source code
consisted of about 600 Fortran statements. Its name was the first of
the series.

The Brian Kernighan and P.J. Plauger book @emph{Software Tools},
Addison-Wesley (1976), describes and implements a Unix
macro-processor language, which inspired Dennis Ritchie to write
@code{m3}, a macro processor for the AP-3 minicomputer.

Kernighan and Ritchie then joined forces to develop the original
@code{m4}, described in ``The M4 Macro Processor'', Bell Laboratories
(1977), @url{http://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf}.
It had only 21 builtin macros.

While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
the true intricacies of real life: macros can be recognized without
being pre-announced, skipping whitespace or end-of-lines is easier,
more constructs are builtin instead of derived, etc.

Originally, the Kernighan and Plauger macro-processor, and then
@code{m3}, formed the engine for the Rational FORTRAN preprocessor,
that is, the @code{Ratfor} equivalent of @code{cpp}. Later, @code{m4}
was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.

Ren@'e Seindal released his implementation of @code{m4}, GNU
@code{m4},
in 1990, with the aim of removing the artificial limitations in many
of the traditional @code{m4} implementations, such as maximum line
length, macro size, or number of macros.

The late Professor A. Dain Samples described and implemented a further
evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
Language: 2nd edition'', Electronic Announcement on comp.compilers

Fran@,{c}ois Pinard took over maintenance of GNU @code{m4} in
1992, until 1994 when he released GNU @code{m4} 1.4, which was
the stable release for 10 years. It was at this time that GNU
Autoconf decided to require GNU @code{m4} as its underlying
engine, since all other implementations of @code{m4} had too many
limitations.

More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2, which
addressed some long-standing bugs in the venerable 1.4 release. Then in
2005, Gary V. Vaughan collected together the many patches to
GNU @code{m4} 1.4 that were floating around the net and
released 1.4.3 and 1.4.4. And in 2006, Eric Blake joined the team and
prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8.
More bug fixes were incorporated in 2007, with releases 1.4.9 and
1.4.10. Eric continued with some portability fixes for 1.4.11 and
1.4.12 in 2008, 1.4.13 in 2009, 1.4.14 and 1.4.15 in 2010, and 1.4.16 in
2011.

Meanwhile, development has continued on new features for @code{m4}, such
as dynamic module loading and additional builtins. When complete,
GNU @code{m4} 2.0 will start a new series of releases.

@section Problems and bugs

@cindex reporting bugs
@cindex suggestions, reporting
If you have problems with GNU M4 or think you've found a bug,
please report it. Before reporting a bug, make sure you've actually
found a real bug. Carefully reread the documentation and see if it
really says you can do what you're trying to do. If it's not clear
whether you should be able to do something or not, report that too; it's
a bug in the documentation!

Before reporting a bug or trying to fix it yourself, try to isolate it
to the smallest possible input file that reproduces the problem. Then
send us the input file and the exact results @code{m4} gave you. Also
say what you expected to occur; this will help us decide whether the
problem was really in the documentation.

Once you've got a precise problem, send e-mail to
@email{bug-m4@@gnu.org}. Please include the version number of @code{m4}
you are using. You can get this information with the command
@kbd{m4 --version}. Also provide details about the platform you are
executing on.

Non-bug suggestions are always welcome as well. If you have questions
about things that are unclear in the documentation or are just obscure
features, please report them too.

@section Using this manual

@cindex examples, understanding
This manual contains a number of examples of @code{m4} input and output,
and a simple notation is used to distinguish input, output and error
messages from @code{m4}. Examples are set out from the normal text, and
shown in a fixed width font, like this:

@example
This is an example of an example!
@end example

To distinguish input from output, all output from @code{m4} is prefixed
by the string @samp{@result{}}, and all error messages by the string
@samp{@error{}}. When showing how command line options affect matters,
the command line is shown with a prompt @samp{$ @kbd{like this}};
otherwise, you can assume that a simple @kbd{m4} invocation will work.

@example
$ @kbd{command line to invoke m4}
Example of input line
@result{}Output line from m4
@error{}and an error message
@end example

The sequence @samp{^D} in an example indicates the end of the input
file. The sequence @samp{@key{NL}} refers to the newline character.
The majority of these examples are self-contained, and you can run them
with similar results by invoking @kbd{m4 -d}. In fact, the testsuite
that is bundled in the GNU M4 package consists of the examples
in this document! Some of the examples assume that your current
directory is located where you unpacked the installation, so if you plan
on following along, you may find it helpful to do this now:

@example
$ @kbd{cd m4-@value{VERSION}}
@end example

As each of the predefined macros in @code{m4} is described, a prototype
call of the macro will be shown, giving descriptive names to the
arguments, e.g.,

@deffn Composite example (@var{string}, @dvar{count, 1}, @
  @ovar{argument}@dots{})
This is a sample prototype. There is not really a macro named
@code{example}, but this documents that if there were, it would be a
Composite macro, rather than a Builtin. It requires at least one
argument, @var{string}. Remember that in @code{m4}, there must not be a
space between the macro name and the opening parenthesis, unless it was
intended to call the macro without any arguments. The brackets around
@var{count} and @var{argument} show that these arguments are optional.
If @var{count} is omitted, the macro behaves as if @var{count} were
@samp{1}, whereas if @var{argument} is omitted, the macro behaves as if
it were the empty string. A blank argument is not the same as an omitted
argument. For example, @samp{example(`a')}, @samp{example(`a',`1')},
and @samp{example(`a',`1',)} would behave identically with @var{count}
set to @samp{1}; while @samp{example(`a',)} and @samp{example(`a',`')}
would explicitly pass the empty string for @var{count}. The ellipses
(@samp{@dots{}}) show that the macro processes additional arguments
after @var{argument}, rather than ignoring them.
@end deffn

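
The distinction between omitted and blank arguments can be observed
directly with the argument count @samp{$#}. The sketch below uses a
hypothetical macro @code{count_args}, defined only for this
illustration.

@example
define(`count_args', `$#')
@result{}
count_args(`a')
@result{}1
count_args(`a',)
@result{}2
count_args (`a')
@result{}0 (a)
@end example

In the last line, the space before the parenthesis means that
@code{count_args} is called with no arguments; the parenthesized text is
then copied to the output as ordinary input, with its quotes removed.
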
All macro arguments in @code{m4} are strings, but some are given
special interpretation, e.g., as numbers, file names, regular
expressions, etc. The documentation for each macro will state how the
parameters are interpreted, and what happens if the argument cannot be
parsed according to the desired interpretation. Unless specified
otherwise, a parameter specified to be a number is parsed as a decimal,
even if the argument has leading zeros; and parsing the empty string as
a number results in 0 rather than an error, although a warning will be
issued.

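
For instance, assuming the builtin @code{incr} (described in the chapter
on arithmetic), which interprets its argument as a decimal number, a
session might look like this:

@example
incr(`010')
@result{}11
incr()
@result{}1
@end example

The leading zero does not select octal, so the first call yields 11
rather than 9. The second call treats the empty string as 0 and also
prints a warning on standard error, which is omitted here.
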
This document consistently writes and uses @dfn{builtin}, without a
hyphen, as if it were an English word. This is how the @code{builtin}
primitive is spelled within @code{m4}.

@chapter Invoking @code{m4}

@cindex invoking @code{m4}
The format of the @code{m4} command is:

@example
@code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
@end example

@cindex command line, options
@cindex options, command line
@cindex @env{POSIXLY_CORRECT}
All options begin with @samp{-}, or if long option names are used, with
@samp{--}. A long option name need not be written completely; any
unambiguous prefix is sufficient. POSIX requires @code{m4} to
recognize arguments intermixed with files, even when
@env{POSIXLY_CORRECT} is set in the environment. Most options take
effect at startup regardless of their position, but some are documented
below as taking effect after any files that occurred earlier in the
command line. The argument @option{--} is a marker to denote the end of
options.

With short options, options that do not take arguments may be combined
into a single command line argument with subsequent options; options
with mandatory arguments may be provided either as a single command line
argument or as two arguments; and options with optional arguments must
be provided as a single argument. In other words,
@kbd{m4 -QPDfoo -d a -df} is equivalent to
@kbd{m4 -Q -P -D foo -d -df -- ./a}, although the latter form is
considered canonical.

With long options, options with mandatory arguments may be provided with
an equal sign (@samp{=}) in a single argument, or as two arguments; and
options with optional arguments must be provided as a single argument.
In other words, @kbd{m4 --def foo --debug a} is equivalent to
@kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
considered canonical (not to mention more robust, in case a future
version of @code{m4} introduces an option named @option{--default}).

@code{m4} understands the following options, grouped by functionality.

* Operation modes:: Command line options for operation modes
* Preprocessor features:: Command line options for preprocessor features
* Limits control:: Command line options for limits control
* Frozen state:: Command line options for frozen state
* Debugging options:: Command line options for debugging
* Command line files:: Specifying input files on the command line

@node Operation modes
@section Command line options for operation modes

Several options control the overall operation of @code{m4}:

Print a help summary on standard output, then immediately exit
@code{m4} without reading any input files or performing any other
actions.

Print the version number of the program on standard output, then
immediately exit @code{m4} without reading any input files or
performing any other actions.

@itemx --fatal-warnings
@cindex errors, fatal
Controls the effect of warnings. If unspecified, then execution
continues and exit status is unaffected when a warning is printed. If
specified exactly once, warnings become fatal; when one is issued,
execution continues, but the exit status will be non-zero. If specified
multiple times, then execution halts with non-zero status the first time
a warning is issued. The introduction of behavior levels is new to M4
1.4.9; for behavior consistent with earlier versions, you should specify
@option{--fatal-warnings} twice.

Makes this invocation of @code{m4} interactive. This means that all
output will be unbuffered, and interrupts will be ignored. The
spelling @option{-e} exists for compatibility with other @code{m4}
implementations, and issues a warning because it may be withdrawn in a
future version of GNU M4.

@itemx --prefix-builtins
Internally modify @emph{all} builtin macro names so they all start with
the prefix @samp{m4_}. For example, using this option, one should write
@samp{m4_define} instead of @samp{define}, and @samp{m4___file__}
instead of @samp{__file__}. This option has no effect if @option{-R}
is also specified.

Suppress warnings, such as missing or superfluous arguments in macro
calls, or treating the empty string as zero.

@item --warn-macro-sequence@r{[}=@var{regexp}@r{]}
Issue a warning if the regular expression @var{regexp} has a non-empty
match in any macro definition (either by @code{define} or
@code{pushdef}). Empty matches are ignored; therefore, supplying the
empty string as @var{regexp} disables any warning. If the optional
@var{regexp} is not supplied, then the default regular expression is
@samp{\$\(@{[^@}]*@}\|[0-9][0-9]+\)} (a literal @samp{$} followed by
multiple digits or by an open brace), since these sequences will
change semantics in the default operation of GNU M4 2.0 (due
to a change in how more than 9 arguments in a macro definition will be
handled, @pxref{Arguments}). Supplying an alternate regular
expression provides a useful reverse lookup: finding
where a macro is defined to have a given definition.

@item -W @var{regexp}
@itemx --word-regexp=@var{regexp}
Use @var{regexp} as an alternative syntax for macro names. This
experimental option will not be present in all GNU @code{m4}
implementations (@pxref{Changeword}).

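
As an illustration of @option{--prefix-builtins}, a session along the
following lines is expected; the macro name @code{foo} is arbitrary.
Without the prefix, @code{define} is no longer recognized as a macro, so
the first line is copied through with only its quotes removed.

@example
$ @kbd{m4 --prefix-builtins}
define(`foo', `bar')
@result{}define(foo, bar)
m4_define(`foo', `bar')
@result{}
foo
@result{}bar
@end example
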
@node Preprocessor features
@section Command line options for preprocessor features

@cindex macro definitions, on the command line
@cindex command line, macro definitions on the
@cindex preprocessor features
Several options allow @code{m4} to behave more like a preprocessor.
Macro definitions and deletions can be made on the command line, the
search path can be altered, and the output file can track where the
input came from. These features occur with the following options:

@item -D @var{name}@r{[}=@var{value}@r{]}
@itemx --define=@var{name}@r{[}=@var{value}@r{]}
This enters @var{name} into the symbol table. If @samp{=@var{value}} is
missing, the value is taken to be the empty string. The @var{value} can
be any string, and the macro can be defined to take arguments, just as
if it were defined from within the input. This option may be given more
than once; order with respect to file names is significant, and
redefining the same @var{name} loses the previous value.

@item -I @var{directory}
@itemx --include=@var{directory}
Make @code{m4} search @var{directory} for included files that are not
found in the current working directory. @xref{Search Path}, for more
details. This option may be given more than once.

@cindex synchronization lines
@cindex location, input
@cindex input location
Generate synchronization lines, for use by the C preprocessor or other
similar tools. Order is significant with respect to file names. This
option is useful, for example, when @code{m4} is used as a
front end to a compiler. Source file name and line number information
is conveyed by directives of the form @samp{#line @var{linenum}
"@var{file}"}, which are inserted as needed into the middle of the
output. Such directives mean that the following line originated or was
expanded from the contents of input file @var{file} at line
@var{linenum}. The @samp{"@var{file}"} part is often omitted when
the file name did not change from the previous directive.

Synchronization directives are always given on complete lines by
themselves. When a synchronization discrepancy occurs in the middle of
an output line, the associated synchronization directive is delayed
until the next newline that does not occur in the middle of a quoted
string or comment.

@result{}#line 2 "stdin"
changecom(`/*', `*/')
define(`comment', `/*1

@itemx --undefine=@var{name}
This deletes any predefined meaning @var{name} might have. Obviously,
only predefined macros can be deleted in this way. This option may be
given more than once; undefining a @var{name} that does not have a
definition is silently ignored. Order is significant with respect to
file names.

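
A brief sketch of defining and deleting macros from the command line
follows; the name @code{name} and its value are chosen only for
illustration.

@example
$ @kbd{m4 --define=name=world}
Hello, name!
@result{}Hello, world!
@end example

@example
$ @kbd{m4 --undefine=len}
len(`abc')
@result{}len(abc)
@end example

In the second session the builtin @code{len} has been removed before
any input is read, so the text is treated as an ordinary word followed
by a parenthesized, quoted string.
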
@section Command line options for limits control

There are some limits within @code{m4} that can be tuned. For
compatibility, @code{m4} also accepts some options that control limits
in other implementations, but which are automatically unbounded (limited
only by your hardware and operating system constraints) in GNU
@code{m4}.

Enable all the extensions in this implementation. In this release of
M4, this option is always on by default; it is currently only useful
when overriding a prior use of @option{--traditional}. However, having
GNU behavior as default makes it impossible to write a
strictly POSIX-compliant client that avoids all incompatible
GNU M4 extensions, since such a client would have to use the
non-POSIX command-line option to force full POSIX
behavior. Thus, a future version of M4 will be changed to implicitly
use the option @option{--traditional} if the environment variable
@env{POSIXLY_CORRECT} is set. Projects that intentionally use
GNU extensions should consider using @option{--gnu} to state
their intentions, so that the project will not mysteriously break if the
user upgrades to a newer M4 and has @env{POSIXLY_CORRECT} set in their
environment.

Suppress all the extensions made in this implementation, compared to the
System V version. @xref{Compatibility}, for a list of these.

@itemx --hashsize=@var{num}
Make the internal hash table for symbol lookup have @var{num} entries.
For better performance, the number should be prime, but this is not
checked. The default is 509 entries. It should not be necessary to
increase this value, unless you define an excessive number of macros.

@itemx --nesting-limit=@var{num}
@cindex nesting limit
@cindex limit, nesting
Artificially limit the nesting of macro calls to @var{num} levels,
stopping program execution if this limit is ever exceeded. When not
specified, nesting defaults to unlimited on platforms that can detect
stack overflow, and to 1024 levels otherwise. A value of zero means
unlimited; but then heavily nested code could potentially cause a stack
overflow.

The precise effect of this option is more correctly associated
with textual nesting than dynamic recursion. It has been useful
when some complex @code{m4} input was generated by mechanical means, and
also in diagnosing recursive algorithms that do not scale well.
Most users never need to change this option from its default.

This option does @emph{not} have the ability to break endless
rescanning loops, since these do not necessarily consume much memory
or stack space. Through clever usage of rescanning loops, one can
request complex, time-consuming computations from @code{m4} with useful
results. Putting limitations in this area would undermine the power of
@code{m4}.
There are many pathological cases: @w{@samp{define(`a', `a')a}} is
only the simplest example (but @pxref{Compatibility}). Expecting GNU
@code{m4} to detect these would be a little like expecting a compiler
system to detect and diagnose endless loops: it is quite a @emph{hard}
problem in general, if not undecidable!

These options are present for compatibility with System V @code{m4}, but
do nothing in this implementation. They may disappear in future
releases, and issue a warning to that effect.

@itemx --diversions=@var{num}
These options are present only for compatibility with previous
versions of GNU @code{m4}, where they controlled the number of
possible diversions which could be used at the same time. They do nothing,
because there is no fixed limit anymore. They may disappear in future
releases, and issue a warning to that effect.

846 @section Command line options for frozen state
848 GNU @code{m4} comes with a feature of freezing internal state
849 (@pxref{Frozen files}). This can be used to speed up @code{m4}
850 execution when reusing a common initialization script.
854 @itemx --freeze-state=@var{file}
855 Once execution is finished, write out the frozen state on the specified
856 @var{file}. It is conventional, but not required, for @var{file} to end
860 @itemx --reload-state=@var{file}
861 Before execution starts, recover the internal state from the specified
862 frozen @var{file}. The options @option{-D}, @option{-U}, and
863 @option{-t} take effect after state is reloaded, but before the input
867 @node Debugging options
868 @section Command line options for debugging
870 Finally, there are several options for aiding in debugging @code{m4}
874 @item -d@r{[}@var{flags}@r{]}
875 @itemx --debug@r{[}=@var{flags}@r{]}
876 Set the debug-level according to the flags @var{flags}. The debug-level
877 controls the format and amount of information presented by the debugging
878 functions. @xref{Debug Levels}, for more details on the format and
879 meaning of @var{flags}. If omitted, @var{flags} defaults to @samp{aeq}.
881 @item --debugfile@r{[}=@var{file}@r{]}
883 @itemx --error-output=@var{file}
884 Redirect @code{dumpdef} output, debug messages, and trace output to the
885 named @var{file}. Warnings, error messages, and @code{errprint} output
886 are still printed to standard error. If these options are not used, or
887 if @var{file} is unspecified (only possible for @option{--debugfile}),
888 debug output goes to standard error; if @var{file} is the empty string,
889 debug output is discarded. @xref{Debug Output}, for more details. The
890 option @option{--debugfile} may be given more than once, and order is
891 significant with respect to file names. The spellings @option{-o} and
892 @option{--error-output} are misleading and inconsistent with other
893 GNU tools; for now they are silently accepted as synonyms of
894 @option{--debugfile} and only recognized once, but in a future version
895 of M4, using them will cause a warning to be issued.
898 @comment not worth including in the manual, but provides a good test
901 @comment options: -Dbar=hello -tbar --debugfile= foo --debugfile -
903 $ @kbd{m4 -d -Iexamples -Dbar=hello -tbar --debugfile= foo --debugfile -
909 @error{}m4trace: -1- bar -> `hello'
915 @itemx --arglength=@var{num}
916 Restrict the size of the output generated by macro tracing to @var{num}
917 characters per trace line. If unspecified or zero, output is
918 unlimited. @xref{Debug Levels}, for more details.
921 @itemx --trace=@var{name}
922 This enables tracing for the macro @var{name}, at any point where it is
923 defined. @var{name} need not be defined when this option is given.
924 This option may be given more than once, and order is significant with
925 respect to file names. @xref{Trace}, for more details.
928 @node Command line files
929 @section Specifying input files on the command line
931 @cindex command line, file names on the
932 @cindex file names, on the command line
933 The remaining arguments on the command line are taken to be input file
934 names. If no names are present, standard input is read. A file
935 name of @file{-} is taken to mean standard input. It is
936 conventional, but not required, for input files to end in @samp{.m4}.
938 The input files are read in the sequence given. Standard input can be
939 read more than once, so the file name @file{-} may appear multiple times
940 on the command line; this makes a difference when input is from a
941 terminal or other special file type. It is an error if an input file
942 ends in the middle of argument collection, a comment, or a quoted
945 The options @option{--define} (@option{-D}), @option{--undefine}
946 (@option{-U}), @option{--synclines} (@option{-s}), and @option{--trace}
947 (@option{-t}) only take effect after processing input from any file
948 names that occur earlier on the command line. For example, assume the
949 file @file{foo} contains:
957 The text @samp{bar} can then be redefined over multiple uses of
960 @comment options: -Dbar=hello foo -Dbar=world foo
962 $ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
967 If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
968 exit status of @code{m4} will be 0 for success, 1 for general failure
969 (such as problems with reading an input file), and 63 for version
970 mismatch (@pxref{Using frozen files}).
972 If you need to read a file whose name starts with a @file{-}, you can
973 specify it as @samp{./-file}, or use @option{--} to mark the end of
977 @comment Test that 'm4 file/' detects that file is not a directory; we
978 @comment can assume that the current directory contains a Makefile.
979 @comment mingw fails with EINVAL rather than ENOTDIR.
982 @comment xerr: ignore
983 @comment options: Makefile/
985 @error{}m4: cannot open `Makefile/': Not a directory
988 @comment Test that closed stderr does not cause a crash. Not all
989 @comment systems have the same message for EBADF.
991 @comment xerr: ignore
994 `errprint(` skipping: syscmd does not have unix semantics
996 syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0',
997 `errprint(` skipping: system does not allow closing stdout
999 changequote(`[', `]')dnl
1000 syscmd([echo | ']__program__[' >&-])dnl
1001 @error{}m4: write error: Bad file descriptor
1008 `errprint(` skipping: syscmd does not have unix semantics
1010 syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0',
1011 `errprint(` skipping: system does not allow closing stdout
1013 changequote(`[', `]')dnl
1014 syscmd([echo 'esyscmd(echo hi >&2 && echo err"print(bye
1015 )d"nl)dnl' > tmp.m4 \
1016 && ']__program__[' tmp.m4 <&- >&- \
1017 && rm tmp.m4])sysval
1023 @comment Test that we obey POSIX semantics with -D interspersed with
1024 @comment files, even with POSIXLY_CORRECT (BSD getopt gets it wrong).
1029 `errprint(` skipping: syscmd does not have unix semantics
1031 changequote(`[', `]')dnl
1032 syscmd([POSIXLY_CORRECT=1 ']__program__[' -Dbar=hello foo -Dbar=world foo])dnl
1041 @chapter Lexical and syntactic conventions
1043 @cindex input tokens
1045 As @code{m4} reads its input, it separates it into @dfn{tokens}. A
1046 token is either a name, a quoted string, or any single character that
1047 is not a part of either a name or a string. Input to @code{m4} can also
1048 contain comments. GNU @code{m4} does not yet understand
1049 multibyte locales; all operations are byte-oriented rather than
1050 character-oriented (although if your locale uses a single byte
1051 encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
1052 However, @code{m4} is eight-bit clean, so you can
1053 use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
1054 comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
1055 exception of the @sc{nul} character (the zero byte @samp{'\0'}).
1058 * Names:: Macro names
1059 * Quoted strings:: Quoting input to @code{m4}
1060 * Comments:: Comments in @code{m4} input
1061 * Other tokens:: Other kinds of input tokens
1062 * Input processing:: How @code{m4} copies input to output
1066 @section Macro names
1070 A name is any sequence of letters, digits, and the character @samp{_}
1071 (underscore), where the first character is not a digit. @code{m4} will
1072 use the longest such sequence found in the input. If a name has a
1073 macro definition, it will be subject to macro expansion
1074 (@pxref{Macros}). Names are case-sensitive.
1076 Examples of valid names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
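
The longest-match rule can be seen in a short session. This is only an
illustrative sketch, assuming a fresh @code{m4} session; the macro name
@samp{foo} and the surrounding text are arbitrary choices.

@example
define(`foo', `FOO')
@result{}
foo foo_bar Foo 1foo foo.bar
@result{}FOO foo_bar Foo 1FOO FOO.bar
@end example

Here @samp{foo_bar} and @samp{Foo} are distinct names with no
definition, so they pass through unchanged. The digit in @samp{1foo}
and the period in @samp{foo.bar} cannot be part of a name, which leaves
@samp{foo} standing alone as a macro name in both cases.
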
1078 @node Quoted strings
1079 @section Quoting input to @code{m4}
1081 @cindex quoted string
1082 @cindex string, quoted
1083 A quoted string is a sequence of characters surrounded by quote
1084 strings, defaulting to
1085 @samp{`} and @samp{'}, where the nested begin and end quotes within the
1086 string are balanced. The value of a string token is the text, with one
1087 level of quotes stripped off. Thus
1096 is the empty string, and double-quoting turns into single-quoting.
1104 The quote characters can be changed at any time, using the builtin macro
1105 @code{changequote}. @xref{Changequote}, for more information.
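
As a further sketch of how one level of quotes is stripped, the
following session assumes the default quote strings and a hypothetical
macro @samp{foo} defined only for this illustration.

@example
define(`foo', `FOO')
@result{}
foo `foo' ``foo''
@result{}FOO foo `foo'
@end example

The unquoted name is expanded, the single-quoted string loses its
quotes and is output without expansion, and the double-quoted string
keeps its inner pair of quotes.
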
1108 @section Comments in @code{m4} input
1111 Comments in @code{m4} are normally delimited by the characters @samp{#}
1112 and newline. All characters between the comment delimiters are ignored,
1113 but the entire comment (including the delimiters) is passed through to
1114 the output---comments are @emph{not} discarded by @code{m4}.
1116 Comments cannot be nested, so the first newline after a @samp{#} ends
1117 the comment. The commenting effect of the begin-comment string
1118 can be inhibited by quoting it.
1122 `quoted text' # `commented text'
1123 @result{}quoted text # `commented text'
1124 `quoting inhibits' `#' `comments'
1125 @result{}quoting inhibits # comments
1128 The comment delimiters can be changed to any string at any time, using
1129 the builtin macro @code{changecom}. @xref{Changecom}, for more
1133 @comment Detect regression in 1.4.10b in regards to reparsing comments.
1134 @comment Not worth including in the manual.
1136 define(`e', `$@@')define(`q', ``$@@'')define(`foo', `bar')
1142 @result{}',`#two bar
1144 changecom(`<', `>')define(`n', `$#')
1154 @section Other kinds of input tokens
1156 @cindex tokens, special
1157 Any character that is neither part of a name, nor of a quoted string,
1158 nor a comment, is a token by itself. When not in the context of macro
1159 expansion, all of these tokens are just copied to output. However,
1160 during macro expansion, whitespace characters (space, tab, newline,
1161 formfeed, carriage return, vertical tab), parentheses (@samp{(} and
1162 @samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
1163 roles, explained later.
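
The following sketch shows such tokens being copied through when they
are not part of a macro call. It assumes a fresh session in which
only the hypothetical macro @samp{foo} has been defined.

@example
define(`foo', `FOO')
@result{}
(foo), $foo; foo (a, b)
@result{}(FOO), $FOO; FOO (a, b)
@end example

Because no opening parenthesis immediately follows any use of
@samp{foo}, every parenthesis, comma, and dollar sign on the second
line is an ordinary token and is output literally.
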
1165 @node Input processing
1166 @section How @code{m4} copies input to output
1168 As @code{m4} reads the input token by token, it will copy each token
1169 directly to the output immediately.
1171 The exception is when it finds a word with a macro definition. In that
1172 case @code{m4} will calculate the macro's expansion, possibly reading
1173 more input to get the arguments. It then inserts the expansion in front
1174 of the remaining input. In other words, the resulting text from a macro
1175 call will be read and parsed into tokens again.
1177 @code{m4} expands a macro as soon as possible. If it finds a macro call
1178 when collecting the arguments to another, it will expand the second call
1179 first. This process continues until there are no more macro calls to
1180 expand and all the input has been consumed.
1182 For a running example, examine how @code{m4} handles this input:
1186 format(`Result is %d', eval(`2**15'))
1190 First, @code{m4} sees that the token @samp{format} is a macro name, so
1191 it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
1192 and @samp{@w{ }}, before encountering another potential macro. Sure
1193 enough, @samp{eval} is a macro name, so the nested argument collection
1194 picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
1195 with the lone argument of @samp{2**15}. The expansion of
1196 @samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
1197 tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
1198 combined with the next @samp{)}, the format macro now has all its
1199 arguments, as if the user had typed:
1203 format(`Result is %d', 32768)
1207 The format macro expands to @samp{Result is 32768}, and we have another
1208 round of scanning for the tokens @samp{Result}, @samp{@w{ }},
1209 @samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
1210 @samp{8}. None of these are macros, so the final output is
1214 @result{}Result is 32768
1217 As a more complicated example, we will contrast an actual code
1218 example from the Gnulib project@footnote{Derived from a patch in
1219 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
1220 and a followup patch in
1221 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
1222 showing both a buggy approach and the desired results. The user desires
1223 to output a shell assignment statement that takes its argument and turns
1224 it into a shell variable by converting it to uppercase and prepending a
1225 prefix. The original attempt looks like this:
1229 define([gl_STRING_MODULE_INDICATOR],
1232 GNULIB_]translit([$1],[a-z],[A-Z])[=1
1234 gl_STRING_MODULE_INDICATOR([strcase])
1236 @result{} GNULIB_strcase=1
1240 Oops -- the argument did not get capitalized. And although the manual
1241 is not able to easily show it, both lines that appear empty actually
1242 contain two trailing spaces. By stepping through the parse, it is easy
1243 to see what happened. First, @code{m4} sees the token
1244 @samp{changequote}, which it recognizes as a macro, followed by
1245 @samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
1246 argument list. The macro expands to the empty string, but changes the
1247 quoting characters to something more useful for generating shell code
1248 (unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
1249 but unbalanced @samp{[]} tend to be rare). Also in the first line,
1250 @code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
1251 macro that consumes the rest of the line, resulting in no output for
1254 The second line starts a macro definition. @code{m4} sees the token
1255 @samp{define}, which it recognizes as a macro, followed by a @samp{(},
1256 @samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}. Because an unquoted
1257 comma was encountered, the first argument is known to be the expansion
1258 of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
1259 Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
1260 whitespace is discarded as part of argument collection. Then comes a
1261 rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
1262 comment@key{NL}@ @ @ @ GNULIB_]}. This is followed by the token
1263 @samp{translit}, which @code{m4} recognizes as a macro name, so a nested
1264 macro expansion has started.
1266 The arguments to @code{translit} are given by the tokens @samp{(},
1267 @samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
1268 @samp{)}. All three string arguments are expanded (or in other words,
1269 the quotes are stripped), and since neither @samp{$} nor @samp{1} needs
1270 capitalization, the result of the macro is @samp{$1}. This expansion is
1271 rescanned, resulting in the two literal characters @samp{$} and
1274 Scanning of the outer macro resumes, and picks up with
1275 @samp{[=1@key{NL}@ @ ]}, and finally @samp{)}. The collected pieces of
1276 expanded text are concatenated, with the end result that the macro
1277 @samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
1278 @samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
1279 Once again, @samp{dnl} is recognized and avoids a newline in the output.
1281 The final line is then parsed, beginning with @samp{ } and @samp{ }
1282 that are output literally. Then @samp{gl_STRING_MODULE_INDICATOR} is
1283 recognized as a macro name, with an argument list of @samp{(},
1284 @samp{[strcase]}, and @samp{)}. Since the definition of the macro
1285 contains the sequence @samp{$1}, that sequence is replaced with the
1286 argument @samp{strcase} prior to starting the rescan. The rescan sees
1287 @samp{@key{NL}} and four spaces, which are output literally, then
1288 @samp{dnl}, which discards the text @samp{ comment@key{NL}}. Next
1289 come four more spaces, also output literally, and the token
1290 @samp{GNULIB_strcase}, which resulted from the earlier parameter
1291 substitution. Since that is not a macro name, it is output literally,
1292 followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
1293 two more spaces. Finally, the original @samp{@key{NL}} seen after the
1294 macro invocation is scanned and output literally.
1296 Now for a corrected approach. This rearranges the use of newlines and
1297 whitespace so that less whitespace is output (which, although harmless
1298 to shell scripts, can be visually unappealing), and fixes the quoting
1299 issues so that the capitalization occurs when the macro
1300 @samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is
1301 defined. It also adds another layer of quoting to the first argument of
1302 @code{translit}, to ensure that the output will be rescanned as a string
1303 rather than a potential uppercase macro name needing further expansion.
1307 define([gl_STRING_MODULE_INDICATOR],
1309 GNULIB_[]translit([[$1]], [a-z], [A-Z])=1dnl
1311 gl_STRING_MODULE_INDICATOR([strcase])
1312 @result{} GNULIB_STRCASE=1
1315 The parsing of the first line is unchanged. The second line sees the
1316 name of the macro to define, then sees the discarded @samp{@key{NL}}
1317 and two spaces, as before. But this time, the next token is
1318 @samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([[$1]], [a-z],
1319 [A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
1320 @samp{)} to end the macro definition and @samp{dnl} to skip the
1321 newline. No early expansion of @code{translit} occurs, so the entire
1322 string becomes the definition of the macro.
1324 The final line is then parsed, beginning with two spaces that are
1325 output literally, and an invocation of
1326 @code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
1327 Again, the @samp{$1} in the macro definition is substituted prior to
1328 rescanning. Rescanning first encounters @samp{dnl}, and discards
1329 @samp{ comment@key{NL}}. Then two spaces are output literally. Next
1330 comes the token @samp{GNULIB_}, but that is not a macro, so it is
1331 output literally. The token @samp{[]} is an empty string, so it does
1332 not affect output. Then the token @samp{translit} is encountered.
1334 This time, the arguments to @code{translit} are parsed as @samp{(},
1335 @samp{[[strcase]]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
1336 @samp{[A-Z]}, and @samp{)}. The two spaces are discarded, and the
1337 @code{translit} call produces the desired result @samp{[STRCASE]}. This is
1338 rescanned, but since it is a string, the quotes are stripped and the
1339 only output is a literal @samp{STRCASE}.
1340 Then the scanner sees @samp{=} and @samp{1}, which are output
1341 literally, followed by @samp{dnl} which discards the rest of the
1342 definition of @code{gl_STRING_MODULE_INDICATOR}. The newline at the
1343 end of output is the literal @samp{@key{NL}} that appeared after the
1344 invocation of the macro.
1346 The order in which @code{m4} expands the macros can be further explored
1347 using the trace facilities of GNU @code{m4} (@pxref{Trace}).
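
For instance, the earlier @samp{format} example could be traced roughly
as follows. This is a sketch that assumes @code{m4} is started with
@option{-d} so that the default debug flags @samp{aeq} are in effect;
the exact trace format is described in the referenced section.

@example
$ @kbd{m4 -d}
traceon(`format', `eval')
@result{}
format(`Result is %d', eval(`2**15'))
@error{}m4trace: -2- eval(`2**15') -> `32768'
@error{}m4trace: -1- format(`Result is %d', `32768') -> `Result is 32768'
@result{}Result is 32768
@end example

The nested @code{eval} call is reported first, at depth 2, because it
is expanded while the arguments of @code{format} are still being
collected.
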
1350 @chapter How to invoke macros
1352 This chapter covers macro invocation, macro arguments and how macro
1353 expansion is treated.
1356 * Invocation:: Macro invocation
1357 * Inhibiting Invocation:: Preventing macro invocation
1358 * Macro Arguments:: Macro arguments
1359 * Quoting Arguments:: On Quoting Arguments to macros
1360 * Macro expansion:: Expanding macros
1364 @section Macro invocation
1366 @cindex macro invocation
1367 @cindex invoking macros
1368 A macro invocation has one of the forms
1376 which is a macro invocation without any arguments, or
1380 name(arg1, arg2, @dots{}, arg@var{n})
1384 which is a macro invocation with @var{n} arguments. Macros can have any
1385 number of arguments. All arguments are strings, but different macros
1386 might interpret the arguments in different ways.
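
The two forms can be compared with a small helper. This is a sketch
only: @samp{count} is a hypothetical macro, and it relies on
@samp{$#}, which expands to the number of arguments in a call and is
covered later in this manual (@pxref{Pseudo Arguments}).

@example
define(`count', `$#')
@result{}
count
@result{}0
count(`a')
@result{}1
count(`a', `b', `c')
@result{}3
@end example
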
1388 The opening parenthesis @emph{must} follow the @var{name} directly, with
1389 no spaces in between. If it does not, the macro is called with no
1392 For a macro call to have no arguments, the parentheses @emph{must} be
1393 left out. The macro call
1401 is a macro call with one argument, which is the empty string, not a call
1404 @node Inhibiting Invocation
1405 @section Preventing macro invocation
1407 An innovation of the @code{m4} language, compared to some of its
1408 predecessors (like Strachey's @code{GPM}, for example), is the ability
1409 to recognize macro calls without resorting to any special, prefixed
1410 invocation character. While generally useful, this feature might
1411 sometimes be the source of spurious, unwanted macro calls. So, GNU
1412 @code{m4} offers several mechanisms or techniques for inhibiting the
1413 recognition of names as macro calls.
1415 @cindex GNU extensions
1417 @cindex macro, blind
1418 First of all, many builtin macros cannot meaningfully be called without
1419 arguments. As a GNU extension, for any of these macros,
1420 whenever an opening parenthesis does not immediately follow their name,
1421 the builtin macro call is not triggered. This handles the most common
1422 cases, such as @samp{include} or @samp{eval}. Later in this document,
1423 the sentence ``This macro is recognized only with parameters'' refers to
1424 this specific provision of GNU M4, also known as a blind
1425 builtin macro. For the builtins defined by POSIX that bear
1426 this disclaimer, POSIX specifically states that invoking those
1427 builtins without arguments is unspecified, because many other
1428 implementations simply invoke the builtin as though it were given one
1429 empty argument instead.
1439 There is also a command line option (@option{--prefix-builtins}, or
1440 @option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
1441 builtin macros with a prefix of @samp{m4_} at startup. The option has
1442 no effect whatsoever on user defined macros. For example, with this option,
1443 one has to write @code{m4_dnl} and even @code{m4_m4exit}. It also has
1444 no effect on whether a macro requires parameters.
1446 @comment options: -P
1459 Another alternative is to redefine problematic macros to a name less
1460 likely to cause conflicts, using @ref{Definitions}.
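
A minimal sketch of this approach, assuming the text being processed
uses the word @samp{divnum} for its own purposes, copies the builtin to
a new name with @code{defn} (@pxref{Defn}) and then removes the
original. The replacement name @samp{my_divnum} is an arbitrary
choice for this illustration.

@example
define(`my_divnum', defn(`divnum'))
@result{}
undefine(`divnum')
@result{}
The divnum is my_divnum
@result{}The divnum is 0
@end example
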
1462 If your version of GNU @code{m4} has the @code{changeword} feature
1463 compiled in, it offers far more flexibility in specifying the
1464 syntax of macro names, whether builtin or user-defined. @xref{Changeword},
1465 for more information on this experimental feature.
1467 Of course, the simplest way to prevent a name from being interpreted
1468 as a call to an existing macro is to quote it. The remainder of
1469 this section studies a little more deeply how quoting affects macro
1470 invocation, and how quoting can be used to inhibit macro invocation.
1472 Even if quoting is usually done over the whole macro name, it can also
1473 be done over only a few characters of this name (provided, of course,
1474 that the unquoted portions are not also a macro). It is also possible
1475 to quote the empty string, but this works only @emph{inside} the name.
1490 all yield the string @samp{divert}. While in both:
1500 the @code{divert} builtin macro will be called, which expands to the
1504 The output of macro evaluations is always rescanned. In the following
1505 example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as
1507 has been given @w{@samp{substr(ab`'cde, `1', `3')}} as input:
1510 define(`cde', `CDE')
1512 define(`x', `substr(ab')
1514 define(`y', `cde, `1', `3')')
1521 @comment Similar, but with argument references, to ensure good test
1524 define(`x1', `len(`$1'')
1526 define(`y1', ``$1')')
1528 x1(`01234567890123456789')y1(`98765432109876543210')
1533 Unquoted strings on either side of a quoted string are subject to
1534 being recognized as macro names. In the following example, quoting the
1535 empty string allows for the second @code{macro} to be recognized as such:
1538 define(`macro', `m')
1546 Quoting may prevent recognizing as a macro name the concatenation of a
1547 macro expansion with the surrounding characters. In this example:
1550 define(`macro', `di$1')
1559 the input will produce the string @samp{divert}. If the quotes are
1560 removed, the @code{divert} builtin is called instead.
1562 @node Macro Arguments
1563 @section Macro arguments
1565 @cindex macros, arguments to
1566 @cindex arguments to macros
1567 When a name is seen, and it has a macro definition, it will be expanded
1570 If the name is followed by an opening parenthesis, the arguments will be
1571 collected before the macro is called. If too few arguments are
1572 supplied, the missing arguments are taken to be the empty string.
1573 However, some builtins are documented to behave differently for a
1574 missing optional argument than for an explicit empty string. If there
1575 are too many arguments, the excess arguments are ignored. Unquoted
1576 leading whitespace is stripped off all arguments, but whitespace
1577 generated by a macro expansion or occurring after a macro that expanded
1578 to an empty string remains intact. Whitespace includes space, tab,
1579 newline, carriage return, vertical tab, and formfeed.
1582 define(`macro', `$1')
1584 macro( unquoted leading space lost)
1585 @result{}unquoted leading space lost
1586 macro(` quoted leading space kept')
1587 @result{} quoted leading space kept
1589 divert `unquoted space kept after expansion')
1590 @result{} unquoted space kept after expansion
1592 ')`whitespace from expansion kept')
1594 @result{}whitespace from expansion kept
1595 macro(`unquoted trailing whitespace kept'
1597 @result{}unquoted trailing whitespace kept
1601 @cindex warnings, suppressing
1602 @cindex suppressing warnings
1603 Normally @code{m4} will issue warnings if a builtin macro is called
1604 with an inappropriate number of arguments, but these can be suppressed with
1605 the @option{--quiet} command line option (or @option{--silent}, or
1606 @option{-Q}, @pxref{Operation modes, , Invoking m4}). For user
1607 defined macros, there is no check of the number of arguments given.
1612 @error{}m4:stdin:1: Warning: too few arguments to builtin `index'
1616 index(`abc', `b', `ignored')
1617 @error{}m4:stdin:3: Warning: excess arguments to builtin `index' ignored
1621 @comment options: -Q
1628 index(`abc', `b', `ignored')
1632 Macros are expanded normally during argument collection, and whatever
1633 commas, quotes and parentheses that might show up in the resulting
1634 expanded text will serve to define the arguments as well. Thus, if
1635 @var{foo} expands to @samp{, b, c}, the macro call
1643 is a macro call with four arguments, which are @samp{a }, @samp{b},
1644 @samp{c} and @samp{d}. To understand why the first argument contains
1645 whitespace, remember that unquoted leading whitespace is never part
1646 of an argument, but trailing whitespace always is.
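
The argument boundaries can be made visible with a helper macro. In
this sketch, @samp{dump} is a hypothetical macro that prints the
argument count followed by its first four arguments in brackets, and
@samp{foo} is defined as described above.

@example
define(`foo', `, b, c')
@result{}
define(`dump', `$# arguments: [$1] [$2] [$3] [$4]')
@result{}
dump(a foo, d)
@result{}4 arguments: [a ] [b] [c] [d]
@end example

The trailing space in the first argument is preserved, while the
leading space before @samp{b}, @samp{c}, and @samp{d} is stripped.
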
1648 It is possible for a macro's definition to change during argument
1649 collection, in which case the expansion uses the definition that was in
1650 effect at the time the opening @samp{(} was seen.
1661 It is an error if the end of file occurs while collecting arguments.
1666 @result{}hello world
1669 @error{}m4:stdin:2: ERROR: end of file in argument list
1672 @node Quoting Arguments
1673 @section On Quoting Arguments to macros
1675 @cindex quoted macro arguments
1676 @cindex macros, quoted arguments to
1677 @cindex arguments, quoted macro
1678 Each argument has unquoted leading whitespace removed. Within each
1679 argument, all unquoted parentheses must match. For example, if
1680 @var{foo} is a macro,
1688 is a macro call, with one argument, whose value is @samp{() (() (}.
1689 Commas separate arguments, except when they occur inside quotes,
1690 comments, or unquoted parentheses. @xref{Pseudo Arguments}, for
1693 It is common practice to quote all arguments to macros, unless you are
1694 sure you want the arguments expanded. Thus, in the above
1695 example with the parentheses, the `right' way to do it is like this:
1702 @cindex quoting rule of thumb
1703 @cindex rule of thumb, quoting
1704 It is, however, in certain cases necessary (because nested expansion
1705 must occur to create the arguments for the outer macro) or convenient
1706 (because it uses fewer characters) to leave out quotes for some
1707 arguments, and there is nothing wrong in doing it. It just makes life a
1708 bit harder, if you are not careful to follow a consistent quoting style.
1709 For consistency, this manual follows the rule of thumb that each layer
1710 of parentheses introduces another layer of single quoting, except when
1711 showing the consequences of quoting rules. This is done even when the
1712 quoted string cannot be a macro, such as with integers when you have not
1713 changed the syntax via @code{changeword} (@pxref{Changeword}).
1715 The quoting rule of thumb of one level of quoting per level of parentheses has a
1716 nice property: when a macro name appears inside parentheses, you can
1717 determine when it will be expanded. If it is not quoted, it will be
1718 expanded prior to the outer macro, so that its expansion becomes the
1719 argument. If it is single-quoted, it will be expanded after the outer
1720 macro. And if it is double-quoted, it will be used as literal text
1721 instead of a macro name.
1724 define(`active', `ACT, IVE')
1726 define(`show', `$1 $1')
1731 @result{}ACT, IVE ACT, IVE
1733 @result{}active active
1736 @node Macro expansion
1737 @section Macro expansion
1739 @cindex macros, expansion of
1740 @cindex expansion of macros
1741 When the arguments, if any, to a macro call have been collected, the
1742 macro is expanded, and the expansion text is pushed back onto the input
1743 (unquoted), and reread. The expansion text from one macro call might
1744 therefore result in more macros being called, if the calls are included,
1745 completely or partially, in the first macro call's expansion.
1747 Taking a very simple example, if @var{foo} expands to @samp{bar}, and
1748 @var{bar} expands to @samp{Hello}, the input
1750 @comment options: -Dbar=Hello -Dfoo=bar
1752 $ @kbd{m4 -Dbar=Hello -Dfoo=bar}
1758 will expand first to @samp{bar}, and when this is reread and
1759 expanded, into @samp{Hello}.
1762 @comment not worth documenting, but test that the command line can
1763 @comment define macros that take parameters
1765 @comment options: -Dfoo -Decho=$@
1767 $ @kbd{m4 -Dfoo -Decho='$@'}
1770 foo(`silently ignored')
1778 @chapter How to define new macros
1780 @cindex macros, how to define new
1781 @cindex defining new macros
1782 Macros can be defined, redefined and deleted in several different ways.
1783 Also, it is possible to redefine a macro without losing a previous
1784 value, and bring back the original value at a later time.
1787 * Define:: Defining a new macro
1788 * Arguments:: Arguments to macros
1789 * Pseudo Arguments:: Special arguments to macros
1790 * Undefine:: Deleting a macro
1791 * Defn:: Renaming macros
1792 * Pushdef:: Temporarily redefining macros
1794 * Indir:: Indirect call of macros
1795 * Builtin:: Indirect call of builtins
1799 @section Defining a macro
1801 The normal way to define or redefine macros is to use the builtin
1804 @deffn Builtin define (@var{name}, @ovar{expansion})
1805 Defines @var{name} to expand to @var{expansion}. If
1806 @var{expansion} is not given, it is taken to be empty.
1808 The expansion of @code{define} is void.
1809 The macro @code{define} is recognized only with parameters.
1812 The following example defines the macro @var{foo} to expand to the text
1813 @samp{Hello world.}.
1816 define(`foo', `Hello world.')
1819 @result{}Hello world.
1822 The empty line in the output is there because the newline is not
1823 a part of the macro definition, and it is consequently copied to
1824 the output. This can be avoided by use of the macro @code{dnl}.
1825 @xref{Dnl}, for details.
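
As a minimal sketch, appending @code{dnl} to the definition line
discards the trailing newline, so only the expansion of @code{foo}
reaches the output:

@example
define(`foo', `Hello world.')dnl
foo
@result{}Hello world.
@end example
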
1827 The first argument to @code{define} should be quoted; otherwise, if the
1828 macro is already defined, you will be defining a different macro. This
1829 example shows the problems with underquoting, since we did not want to
1830 redefine @code{one}:
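
A small sketch of this pitfall follows. It assumes that neither
@code{foo} nor @code{one} is defined beforehand. The second, unquoted
@code{foo} is expanded to @samp{one} before @code{define} sees it, so it
is @code{one} that ends up being defined:

@example
define(foo, one)
@result{}
define(foo, two)
@result{}
one
@result{}two
@end example
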
1841 @cindex GNU extensions
1842 GNU @code{m4} normally replaces only the @emph{topmost}
1843 definition of a macro if it has several definitions from @code{pushdef}
1844 (@pxref{Pushdef}). Some other implementations of @code{m4} replace all
1845 definitions of a macro with @code{define}. @xref{Incompatibilities}, for more details.
1848 As a GNU extension, the first argument to @code{define} does
1849 not have to be a simple word.
1850 It can be any text string, even the empty string. A macro with a
1851 non-standard name cannot be invoked in the normal way, as the name is
1852 not recognized. It can only be referenced by the builtins @code{indir}
1853 (@pxref{Indir}) and @code{defn} (@pxref{Defn}).
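
For illustration, the sketch below defines a macro whose name contains a
space. The name @samp{my macro} is purely hypothetical; since it is not
a valid word, the macro is reachable only through @code{indir} and
@code{defn}:

@example
define(`my macro', `called with $1')
@result{}
indir(`my macro', `an argument')
@result{}called with an argument
defn(`my macro')
@result{}called with $1
@end example
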
1856 Arrays and associative arrays can be simulated by using non-standard macro names.
1859 @deffn Composite array (@var{index})
1860 @deffnx Composite array_set (@var{index}, @ovar{value})
1861 Provide access to entries within an array. @code{array} reads the entry
1862 at location @var{index}, and @code{array_set} assigns @var{value} to
1863 location @var{index}.
1867 define(`array', `defn(format(``array[%d]'', `$1'))')
1869 define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
1871 array_set(`4', `array element no. 4')
1873 array_set(`17', `array element no. 17')
1876 @result{}array element no. 4
1877 array(eval(`10 + 7'))
1878 @result{}array element no. 17
1881 Change the @samp{%d} to @samp{%s} and it is an associative array.
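
As a sketch of that change, the hypothetical macros @code{dict} and
@code{dict_set} below are the same definitions with @samp{%s}
substituted, so that entries are keyed by strings rather than numbers:

@example
define(`dict', `defn(format(``dict[%s]'', `$1'))')
@result{}
define(`dict_set', `define(format(``dict[%s]'', `$1'), `$2')')
@result{}
dict_set(`color', `blue')
@result{}
dict(`color')
@result{}blue
@end example
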
1884 @section Arguments to macros
1886 @cindex macros, arguments to
1887 @cindex arguments to macros
1888 Macros can have arguments. The @var{n}th argument is denoted by
1889 @code{$n} in the expansion text, and is replaced by the @var{n}th actual
1890 argument, when the macro is expanded. Replacement of arguments happens
1891 before rescanning, regardless of how many nesting levels of quoting
1892 appear in the expansion. Here is an example of a macro with two arguments.
1895 @deffn Composite exch (@var{arg1}, @var{arg2})
1896 Expands to @var{arg2} followed by @var{arg1}, effectively exchanging their order.
1901 define(`exch', `$2, $1')
1903 exch(`arg1', `arg2')
1907 This can be used, for example, if you like the arguments to
1908 @code{define} to be reversed.
1911 define(`exch', `$2, $1')
1913 define(exch(``expansion text'', ``macro''))
1916 @result{}expansion text
1919 @xref{Quoting Arguments}, for an explanation of the double quotes.
1920 (You should try to improve this example so that clients of @code{exch}
1921 do not have to double quote; or @pxref{Improved exch, , Answers}).
1923 As a special case, the zeroth argument, @code{$0}, is always the name
1924 of the macro being expanded.
1927 define(`test', ``Macro name: $0'')
1930 @result{}Macro name: test
1933 If you want quoted text to appear as part of the expansion text,
1934 remember that quotes can be nested in quoted strings. Thus, in
1937 define(`foo', `This is macro `foo'.')
1940 @result{}This is macro foo.
1944 The @samp{foo} in the expansion text is @emph{not} expanded, since it is
1945 a quoted string, and not a name.
1947 @cindex GNU extensions
1948 @cindex nine arguments, more than
1949 @cindex more than nine arguments
1950 @cindex arguments, more than nine
1951 @cindex positional parameters, more than nine
1952 GNU @code{m4} allows the number following the @samp{$} to
1953 consist of one or more digits, allowing macros to have any number of
1954 arguments. The extension of accepting multiple digits is incompatible
1955 with POSIX, and differs from traditional implementations
1956 of @code{m4}, which only recognize one digit. Therefore, future
1957 versions of GNU M4 will phase out this feature. To portably
1958 access beyond the ninth argument, you can use the @code{argn} macro
1959 documented later (@pxref{Shift}).
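
To illustrate the portable approach, arguments past the ninth can be
reached by shifting. The following is only a rough sketch of the idea,
not the @code{argn} macro itself; the helper names @code{first} and
@code{tenth} are made up for this illustration:

@example
define(`first', `$1')
@result{}
define(`tenth', `first(shift(shift(shift(shift(shift(shift(shift(shift(shift($@@))))))))))')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
@result{}j
@end example

Each call to @code{shift} drops one leading argument, so nine nested
calls leave the tenth argument in first position without ever writing
@samp{$10}.
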
1961 POSIX also states that @samp{$} followed immediately by
1962 @samp{@{} in a macro definition is implementation-defined. This version
1963 of M4 passes the literal characters @samp{$@{} through unchanged, but M4
1964 2.0 will implement an optional feature similar to @command{sh}, where
1965 @samp{$@{11@}} expands to the eleventh argument, to replace the current
1966 recognition of @samp{$11}. Meanwhile, if you want to guarantee that you
1967 will get a literal @samp{$@{} in output when expanding a macro, even
1968 when you upgrade to M4 2.0, you can use nested quoting to your advantage:
1972 define(`foo', `single quoted $`'@{1@} output')
1974 define(`bar', ``double quoted $'`@{2@} output'')
1977 @result{}single quoted $@{1@} output
1979 @result{}double quoted $@{2@} output
1982 To help you detect places in your M4 input files that might change in
1983 behavior due to the changed behavior of M4 2.0, you can use the
1984 @option{--warn-macro-sequence} command-line option (@pxref{Operation
1985 modes, , Invoking m4}) with the default regular expression. This will
1986 add a warning any time a macro definition includes @samp{$} followed by
1987 multiple digits, or by @samp{@{}. The warning is not enabled by
1988 default, because it triggers a number of warnings in Autoconf 2.61 (and
1989 Autoconf uses @option{-E} to treat warnings as errors), and because it
1990 will still be possible to restore older behavior in M4 2.0.
1992 @comment options: --warn-macro-sequence
1994 $ @kbd{m4 --warn-macro-sequence}
1995 define(`foo', `$001 $@{1@} $1')
1996 @error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$001'
1997 @error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$@{1@}'
2000 @result{}bar $@{1@} bar
2003 @node Pseudo Arguments
2004 @section Special arguments to macros
2006 @cindex special arguments to macros
2007 @cindex macros, special arguments to
2008 @cindex arguments to macros, special
2009 There is a special notation for the number of actual arguments supplied,
2010 and for all the actual arguments.
2012 The number of actual arguments in a macro call is denoted by @code{$#}
2013 in the expansion text.
2015 @deffn Composite nargs (@dots{})
2016 Expands to a count of the number of arguments supplied.
2020 define(`nargs', `$#')
2026 nargs(`arg1', `arg2', `arg3')
2028 nargs(`commas can be quoted, like this')
2030 nargs(arg1#inside comments, commas do not separate arguments
2033 nargs((unquoted parentheses, like this, group arguments))
2037 Remember that @samp{#} defaults to the comment character; if you forget
2038 quotes to inhibit the comment behavior, your macro definition may not
2039 end where you expected.
2042 dnl Attempt to define a macro to just `$#'
2043 define(underquoted, $#)
2051 The notation @code{$*} can be used in the expansion text to denote all
2052 the actual arguments, unquoted, with commas in between. For example
2055 define(`echo', `$*')
2057 echo(arg1, arg2, arg3 , arg4)
2058 @result{}arg1,arg2,arg3 ,arg4
2061 Often each argument should be quoted, and the notation @code{$@@} handles
2062 that. It is just like @code{$*}, except that it quotes each argument.
2063 A simple example of that is:
2066 define(`echo', `$@@')
2068 echo(arg1, arg2, arg3 , arg4)
2069 @result{}arg1,arg2,arg3 ,arg4
2072 Where did the quotes go? Of course, they were eaten when the expanded
2073 text was reread by @code{m4}. To show the difference, try
2076 define(`echo1', `$*')
2078 define(`echo2', `$@@')
2080 define(`foo', `This is macro `foo'.')
2083 @result{}This is macro This is macro foo..
2085 @result{}This is macro foo.
2087 @result{}This is macro foo.
2093 @xref{Trace}, if you do not understand this. As another example of the
2094 difference, remember that comments encountered in arguments are passed
2095 untouched to the macro, and that quoting disables comments.
2098 define(`echo1', `$*')
2100 define(`echo2', `$@@')
2102 define(`foo', `bar')
2115 @comment Not worth putting in the manual, but this example is needed for
2116 @comment good test coverage of copying large strings across recursion
2120 define(`echo', `$@@')dnl
2121 echo(echo(`01234567890123456789', `01234567890123456789')
2122 echo(`98765432109876543210', `98765432109876543210'))
2123 @result{}01234567890123456789,01234567890123456789
2124 @result{}98765432109876543210,98765432109876543210
2125 len((echo(`01234567890123456789',
2126 `01234567890123456789')echo(`98765432109876543210',
2127 `98765432109876543210')))
2129 indir(`echo', indir(`echo', `01234567890123456789',
2130 `01234567890123456789')
2131 indir(`echo', `98765432109876543210', `98765432109876543210'))
2132 @result{}01234567890123456789,01234567890123456789
2133 @result{}98765432109876543210,98765432109876543210
2134 define(`argn', `$#')dnl
2135 define(`echo1', `-$@@-')define(`echo2', `,$@@,')dnl
2136 echo1(`1', `2', `3') argn(echo1(`1', `2', `3'))
2138 echo2(`1', `2', `3') argn(echo2(`1', `2', `3'))
2143 A @samp{$} sign in the expansion text that is not followed by anything
2144 @code{m4} understands is simply copied to the macro expansion, like any other text.
2148 define(`foo', `$$$ hello $$$')
2151 @result{}$$$ hello $$$
2155 @cindex literal output
2156 @cindex output, literal
2157 If you want a macro to expand to something like @samp{$12}, the
2158 judicious use of nested quoting can put a safe character between the
2159 @code{$} and the next character, relying on the rescanning to remove the
2160 nested quote. This will prevent @code{m4} from interpreting the
2161 @code{$} sign as a reference to an argument.
2164 define(`foo', `no nested quote: $1')
2167 @result{}no nested quote: arg
2168 define(`foo', `nested quote around $: `$'1')
2171 @result{}nested quote around $: $1
2172 define(`foo', `nested empty quote after $: $`'1')
2175 @result{}nested empty quote after $: $1
2176 define(`foo', `nested quote around next character: $`1'')
2179 @result{}nested quote around next character: $1
2180 define(`foo', `nested quote around both: `$1'')
2183 @result{}nested quote around both: arg
2187 @section Deleting a macro
2189 @cindex macros, how to delete
2190 @cindex deleting macros
2191 @cindex undefining macros
2192 A macro definition can be removed with @code{undefine}:
2194 @deffn Builtin undefine (@var{name}@dots{})
2195 For each argument, remove the macro @var{name}. The macro names must
2196 necessarily be quoted, since they will be expanded otherwise.
2198 The expansion of @code{undefine} is void.
2199 The macro @code{undefine} is recognized only with parameters.
2204 @result{}foo bar blah
2205 define(`foo', `some')define(`bar', `other')define(`blah', `text')
2208 @result{}some other text
2212 @result{}foo other text
2213 undefine(`bar', `blah')
2216 @result{}foo bar blah
2219 Undefining a macro inside that macro's expansion is safe; the macro
2220 still expands to the definition that was in effect at the @samp{(}.
2223 define(`f', ``$0':$1')
2225 f(f(f(undefine(`f')`hello world')))
2226 @result{}f:f:f:hello world
2231 It is not an error for @var{name} to have no macro definition. In that
2232 case, @code{undefine} does nothing.
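
A brief sketch, assuming that @code{no_such_macro} really is undefined:
the call produces no diagnostic, and the name remains undefined
afterwards.

@example
undefine(`no_such_macro')
@result{}
ifdef(`no_such_macro', `defined', `not defined')
@result{}not defined
@end example
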
2235 @section Renaming macros
2237 @cindex macros, how to rename
2238 @cindex renaming macros
2239 @cindex macros, displaying definitions
2240 @cindex definitions, displaying macro
2241 It is possible to rename an already defined macro. To do this, you need
2242 the builtin @code{defn}:
2244 @deffn Builtin defn (@var{name}@dots{})
2245 Expands to the @emph{quoted definition} of each @var{name}. If an
2246 argument is not a defined macro, the expansion for that argument is empty.
2249 If @var{name} is a user-defined macro, the quoted definition is simply
2250 the quoted expansion text. If, instead, there is only one @var{name}
2251 and it is a builtin, the
2252 expansion is a special token, which points to the builtin's internal
2253 definition. This token is only meaningful as the second argument to
2254 @code{define} (and @code{pushdef}), and is silently converted to an
2255 empty string in most other contexts. Combining a builtin with anything
2256 else is not supported; a warning is issued and the builtin is omitted
2257 from the final expansion.
2259 The macro @code{defn} is recognized only with parameters.
2262 Its normal use is best understood through an example, which shows how to
2263 rename @code{undefine} to @code{zap}:
2266 define(`zap', defn(`undefine'))
2271 @result{}undefine(zap)
2274 In this way, @code{defn} can be used to copy macro definitions, and also
2275 definitions of builtin macros. Even if the original macro is removed,
2276 the other name can still be used to access the definition.
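
The same technique applies to user-defined macros. In this sketch, the
names @code{greet} and @code{salute} are hypothetical; after
@code{greet} is undefined, only the copy remains usable:

@example
define(`greet', `Hello, $1!')
@result{}
define(`salute', defn(`greet'))
@result{}
undefine(`greet')
@result{}
salute(`world')
@result{}Hello, world!
greet(`world')
@result{}greet(world)
@end example
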
2278 The fact that macro definitions can be transferred also explains why you
2279 should use @code{$0}, rather than retyping a macro's name in its definition:
2283 define(`foo', `This is `$0'')
2285 define(`bar', defn(`foo'))
2288 @result{}This is bar
2291 Macros used as string variables should be referred through @code{defn},
2292 to avoid unwanted expansion of the text:
2295 define(`string', `The macro dnl is very useful
2299 @result{}The macro@w{ }
2301 @result{}The macro dnl is very useful
2306 However, it is important to remember that @code{m4} rescanning is purely
2307 textual. If an unbalanced end-quote string occurs in a macro
2308 definition, the rescan will see that embedded quote as the termination
2309 of the quoted string, and the remainder of the macro's definition will
2310 be rescanned unquoted. Thus it is a good idea to avoid unbalanced
2311 end-quotes in macro definitions or arguments to macros.
2318 define(`echo', `$@@')
2328 On the other hand, it is possible to exploit the fact that @code{defn}
2329 can concatenate multiple macros prior to the rescanning phase, in order
2330 to join the definitions of macros that, in isolation, have unbalanced
2331 quotes. This is particularly useful when one has used several macros to
2332 accumulate text that M4 should rescan as a whole. In the example below,
2333 note how the use of @code{defn} on @code{l} in isolation opens a string,
2334 which is not closed until the next line; but used on @code{l} and
2335 @code{r} together results in nested quoting.
2338 define(`l', `<[>')define(`r', `<]>')
2340 changequote(`[', `]')
2344 @result{}<[>]defn([r])
2350 @cindex builtins, special tokens
2351 @cindex tokens, builtin macro
2352 Using @code{defn} to generate special tokens for builtin macros outside
2353 of expected contexts can sometimes trigger warnings. But most of the
2354 time, such tokens are silently converted to the empty string.
2360 define(defn(`divnum'), `cannot redefine a builtin token')
2361 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2369 Also note that @code{defn} with multiple arguments can only join text
2370 macros, not builtins, although a future version of GNU M4 may
2371 lift this restriction.
2375 define(`a', `A')define(`AA', `b')
2377 traceon(`defn', `define')
2379 defn(`a', `divnum', `a')
2380 @error{}m4:stdin:3: Warning: cannot concatenate builtin `divnum'
2381 @error{}m4trace: -1- defn(`a', `divnum', `a') -> ``A'`A''
2383 define(`mydivnum', defn(`divnum', `divnum'))mydivnum
2384 @error{}m4:stdin:4: Warning: cannot concatenate builtin `divnum'
2385 @error{}m4:stdin:4: Warning: cannot concatenate builtin `divnum'
2386 @error{}m4trace: -2- defn(`divnum', `divnum')
2387 @error{}m4trace: -1- define(`mydivnum', `')
2389 traceoff(`defn', `define')
2394 @section Temporarily redefining macros
2396 @cindex macros, temporary redefinition of
2397 @cindex temporary redefinition of macros
2398 @cindex redefinition of macros, temporary
2399 @cindex definition stack
2400 @cindex pushdef stack
2401 @cindex stack, macro definition
2402 It is possible to redefine a macro temporarily, reverting to the
2403 previous definition at a later time. This is done with the builtins
2404 @code{pushdef} and @code{popdef}:
2406 @deffn Builtin pushdef (@var{name}, @ovar{expansion})
2407 @deffnx Builtin popdef (@var{name}@dots{})
2408 Analogous to @code{define} and @code{undefine}.
2410 These macros work in a stack-like fashion. A macro is temporarily
2411 redefined with @code{pushdef}, which replaces an existing definition of
2412 @var{name}, while saving the previous definition, before the new one is
2413 installed. If there is no previous definition, @code{pushdef} behaves
2414 exactly like @code{define}.
2416 If a macro has several definitions (of which only one is accessible),
2417 the topmost definition can be removed with @code{popdef}. If there is
2418 no previous definition, @code{popdef} behaves like @code{undefine}.
2420 The expansion of both @code{pushdef} and @code{popdef} is void.
2421 The macros @code{pushdef} and @code{popdef} are recognized only with parameters.
2426 define(`foo', `Expansion one.')
2429 @result{}Expansion one.
2430 pushdef(`foo', `Expansion two.')
2433 @result{}Expansion two.
2434 pushdef(`foo', `Expansion three.')
2436 pushdef(`foo', `Expansion four.')
2441 @result{}Expansion three.
2442 popdef(`foo', `foo')
2445 @result{}Expansion one.
2452 If a macro with several definitions is redefined with @code{define}, the
2453 topmost definition is @emph{replaced} with the new definition. If it is
2454 removed with @code{undefine}, @emph{all} the definitions are removed,
2455 and not only the topmost one. However, POSIX allows other
2456 implementations that treat @code{define} as replacing an entire stack
2457 of definitions with a single new definition, so to be portable to other
2458 implementations, it may be worth explicitly using @code{popdef} and
2459 @code{pushdef} rather than relying on the GNU behavior of @code{define}.
2463 define(`foo', `Expansion one.')
2466 @result{}Expansion one.
2467 pushdef(`foo', `Expansion two.')
2470 @result{}Expansion two.
2471 define(`foo', `Second expansion two.')
2474 @result{}Second expansion two.
2481 @cindex local variables
2482 @cindex variables, local
2483 Local variables within macros are made with @code{pushdef} and
2484 @code{popdef}. At the start of the macro a new definition is pushed,
2485 within the macro it is manipulated and at the end it is popped,
2486 revealing the former definition.
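
A minimal sketch of this pattern follows; the macro names @code{x} and
@code{show} are chosen only for illustration. Inside @code{show},
@code{x} temporarily expands to @samp{inner}, and the outer definition
is visible again once the macro finishes:

@example
define(`x', `outer')
@result{}
define(`show', `pushdef(`x', `inner')[x]popdef(`x')')
@result{}
show x
@result{}[inner] outer
@end example
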
2488 It is possible to temporarily redefine a builtin with @code{pushdef} and @code{defn}.
2492 @section Indirect call of macros
2494 @cindex indirect call of macros
2495 @cindex call of macros, indirect
2496 @cindex macros, indirect call of
2497 @cindex GNU extensions
2498 Any macro can be called indirectly with @code{indir}:
2500 @deffn Builtin indir (@var{name}, @ovar{args@dots{}})
2501 Results in a call to the macro @var{name}, which is passed the
2502 rest of the arguments @var{args}. If @var{name} is not defined, an
2503 error message is printed, and the expansion is void.
2505 The macro @code{indir} is recognized only with parameters.
2508 This can be used to call macros with computed or ``invalid''
2509 names (@code{define} allows such names to be defined):
2512 define(`$$internal$macro', `Internal macro (name `$0')')
2515 @result{}$$internal$macro
2516 indir(`$$internal$macro')
2517 @result{}Internal macro (name $$internal$macro)
2520 The point here is that larger macro packages can define private macros
2521 that will not be called by accident. They can @emph{only} be
2522 called through the builtin @code{indir}.
2524 One other point to observe is that argument collection occurs before
2525 @code{indir} invokes @var{name}, so if argument collection changes the
2526 value of @var{name}, that will be reflected in the final expansion.
2527 This is different from the behavior when invoking macros directly,
2528 where the definition that was in effect before argument collection is used.
2537 indir(`f', define(`f', `3'))
2539 indir(`f', undefine(`f'))
2540 @error{}m4:stdin:4: undefined macro `f'
2544 When handed the result of @code{defn} (@pxref{Defn}) as one of its
2545 arguments, @code{indir} defers to the invoked @var{name} for whether a
2546 token representing a builtin is recognized or flattened to the empty string.
2551 indir(defn(`defn'), `divnum')
2552 @error{}m4:stdin:1: Warning: indir: invalid macro name ignored
2554 indir(`define', defn(`defn'), `divnum')
2555 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2557 indir(`define', `foo', defn(`divnum'))
2561 indir(`divert', defn(`foo'))
2562 @error{}m4:stdin:5: empty string treated as 0 in builtin `divert'
2567 @section Indirect call of builtins
2569 @cindex indirect call of builtins
2570 @cindex call of builtins, indirect
2571 @cindex builtins, indirect call of
2572 @cindex GNU extensions
2573 Builtin macros can be called indirectly with @code{builtin}:
2575 @deffn Builtin builtin (@var{name}, @ovar{args@dots{}})
2576 Results in a call to the builtin @var{name}, which is passed the
2577 rest of the arguments @var{args}. If @var{name} does not name a
2578 builtin, an error message is printed, and the expansion is void.
2580 The macro @code{builtin} is recognized only with parameters.
2583 This can be used even if @var{name} has been given another definition
2584 that has covered the original, or been undefined so that no macro
2585 maps to the builtin.
2588 pushdef(`define', `hidden')
2590 undefine(`undefine')
2592 define(`foo', `bar')
2596 builtin(`define', `foo', defn(`divnum'))
2600 builtin(`define', `foo', `BAR')
2605 @result{}undefine(foo)
2608 builtin(`undefine', `foo')
2614 The @var{name} argument only matches the original name of the builtin,
2615 even when the @option{--prefix-builtins} option (or @option{-P},
2616 @pxref{Operation modes, , Invoking m4}) is in effect. This is different
2617 from @code{indir}, which only tracks current macro names.
2619 @comment options: -P
2622 m4_builtin(`divnum')
2624 m4_builtin(`m4_divnum')
2625 @error{}m4:stdin:2: undefined builtin `m4_divnum'
2628 @error{}m4:stdin:3: undefined macro `divnum'
2630 m4_indir(`m4_divnum')
2634 Note that @code{indir} and @code{builtin} can be used to invoke builtins
2635 without arguments, even when they normally require parameters to be
2636 recognized; but it will provoke a warning, and result in a void expansion.
2642 @error{}m4:stdin:2: undefined builtin `'
2645 @error{}m4:stdin:3: Warning: too few arguments to builtin `builtin'
2648 @error{}m4:stdin:4: undefined builtin `'
2650 builtin(`builtin', ``'
2652 @error{}m4:stdin:5: undefined builtin ``'
2656 @error{}m4:stdin:7: Warning: too few arguments to builtin `index'
2661 @comment This example is not worth putting in the manual, but it is
2662 @comment needed for full coverage. Autoconf's m4_include relies heavily
2663 @comment on this feature.
2666 builtin(`include', `foo')dnl
2670 @comment And this example triggers a regression present in 1.4.10b.
2673 define(`s', `builtin(`shift', $@@)')dnl
2674 define(`loop', `ifelse(`$2', `', `-', `$1$2: $0(`$1', s(s($@@)))')')dnl
2681 loop(`1', `2', `3', `4')
2682 @result{}12: 13: 14: -
2683 loop(`1', `2', `3', `4', `5')
2684 @result{}12: 13: 14: 15: -
2689 @chapter Conditionals, loops, and recursion
2691 Macros, expanding to plain text, perhaps with arguments, are not quite
2692 enough. We would like to have macros expand to different things, based
2693 on decisions taken at run-time. For that, we need some kind of conditionals.
2694 Also, we would like to have some kind of loop construct, so we could do
2695 something a number of times, or while some condition is true.
2698 * Ifdef:: Testing if a macro is defined
2699 * Ifelse:: If-else construct, or multibranch
2700 * Shift:: Recursion in @code{m4}
2701 * Forloop:: Iteration by counting
2702 * Foreach:: Iteration by list contents
2703 * Stacks:: Working with definition stacks
2704 * Composition:: Building macros with macros
2708 @section Testing if a macro is defined
2710 @cindex conditionals
2711 There are two different builtin conditionals in @code{m4}. The first is @code{ifdef}:
2714 @deffn Builtin ifdef (@var{name}, @var{string-1}, @ovar{string-2})
2715 If @var{name} is defined as a macro, @code{ifdef} expands to
2716 @var{string-1}, otherwise to @var{string-2}. If @var{string-2} is
2717 omitted, it is taken to be the empty string (according to the normal rules).
2720 The macro @code{ifdef} is recognized only with parameters.
2724 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
2725 @result{}foo is not defined
2728 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
2729 @result{}foo is defined
2730 ifdef(`no_such_macro', `yes', `no', `extra argument')
2731 @error{}m4:stdin:4: Warning: excess arguments to builtin `ifdef' ignored
2736 @section If-else construct, or multibranch
2738 @cindex comparing strings
2739 @cindex discarding input
2740 @cindex input, discarding
2741 The other conditional, @code{ifelse}, is much more powerful. It can be
2742 used as a way to introduce a long comment, as an if-else construct, or
2743 as a multibranch, depending on the number of arguments supplied:
2745 @deffn Builtin ifelse (@var{comment})
2746 @deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal}, @ovar{not-equal})
2748 @deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
2749 @var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal})
2750 Used with only one argument, @code{ifelse} simply discards it and produces no output.
2753 If called with three or four arguments, @code{ifelse} expands into
2754 @var{equal}, if @var{string-1} and @var{string-2} are equal (character
2755 for character), otherwise it expands to @var{not-equal}. A final fifth
2756 argument is ignored, after triggering a warning.
2758 If called with six or more arguments, and @var{string-1} and
2759 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
2760 otherwise the first three arguments are discarded and the processing starts again.
2763 The macro @code{ifelse} is recognized only with parameters.
2766 Using only one argument is a common @code{m4} idiom for introducing a
2767 block comment, as an alternative to repeatedly using @code{dnl}. This
2768 special usage is recognized by GNU @code{m4}, so that in this
2769 case, the warning about missing arguments is never triggered.
2772 ifelse(`some comments')
2774 ifelse(`foo', `bar')
2775 @error{}m4:stdin:2: Warning: too few arguments to builtin `ifelse'
2779 Using three or four arguments provides decision points.
2782 ifelse(`foo', `bar', `true')
2784 ifelse(`foo', `foo', `true')
2786 define(`foo', `bar')
2788 ifelse(foo, `bar', `true', `false')
2790 ifelse(foo, `foo', `true', `false')
2794 @cindex macro, blind
2796 Notice how the first argument was used unquoted; it is common to compare
2797 the expansion of a macro with a string. With this macro, you can now
2798 reproduce the behavior of blind builtins, where the macro is recognized
2799 only with arguments.
2802 define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
2807 @result{}arguments:1
2809 @result{}arguments:3
2812 For an example of a way to make defining blind macros easier, see @ref{Composition}.
2815 @cindex multibranches
2816 @cindex switch statement
2817 @cindex case statement
2818 The macro @code{ifelse} can take more than four arguments. If given more
2819 than four arguments, @code{ifelse} works like a @code{case} or @code{switch}
2820 statement in traditional programming languages. If @var{string-1} and
2821 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1}, otherwise
2822 the procedure is repeated with the first three arguments discarded. This
2823 calls for an example:
2826 ifelse(`foo', `bar', `third', `gnu', `gnats')
2827 @error{}m4:stdin:1: Warning: excess arguments to builtin `ifelse' ignored
2829 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
2831 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
2833 ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
2834 @error{}m4:stdin:4: Warning: excess arguments to builtin `ifelse' ignored
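
In practice, a multibranch @code{ifelse} usually lives inside a macro
and compares one argument against several candidates. The following
sketch uses a hypothetical macro @code{sound}, with the final argument
acting as the default branch:

@example
define(`sound', `ifelse(`$1', `cat', `meow', `$1', `dog', `woof', `silence')')
@result{}
sound(`cat')
@result{}meow
sound(`dog')
@result{}woof
sound(`fish')
@result{}silence
@end example
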
2839 @comment Stress tests, not worth documenting.
2841 @comment Ensure that references compared to strings work regardless of
2842 @comment similar prefixes.
2844 define(`e', `$@@')define(`long', `01234567890123456789')
2846 ifelse(long, `01234567890123456789', `yes', `no')
2848 ifelse(`01234567890123456789', long, `yes', `no')
2850 ifelse(long, `01234567890123456789-', `yes', `no')
2852 ifelse(`01234567890123456789-', long, `yes', `no')
2854 ifelse(e(long), `01234567890123456789', `yes', `no')
2856 ifelse(`01234567890123456789', e(long), `yes', `no')
2858 ifelse(e(long), `01234567890123456789-', `yes', `no')
2860 ifelse(`01234567890123456789-', e(long), `yes', `no')
2862 ifelse(-e(long), `-01234567890123456789', `yes', `no')
2864 ifelse(-`01234567890123456789', -e(long), `yes', `no')
2866 ifelse(-e(long), `-01234567890123456789-', `yes', `no')
2868 ifelse(`-01234567890123456789-', -e(long), `yes', `no')
2870 ifelse(-e(long)-, `-01234567890123456789-', `yes', `no')
2872 ifelse(-`01234567890123456789-', -e(long)-, `yes', `no')
2874 ifelse(-e(long)-, `-01234567890123456789', `yes', `no')
2876 ifelse(`-01234567890123456789', -e(long)-, `yes', `no')
2878 ifelse(`-'e(long), `-01234567890123456789', `yes', `no')
2880 ifelse(-`01234567890123456789', `-'e(long), `yes', `no')
2882 ifelse(`-'e(long), `-01234567890123456789-', `yes', `no')
2884 ifelse(`-01234567890123456789-', `-'e(long), `yes', `no')
2886 ifelse(`-'e(long)`-', `-01234567890123456789-', `yes', `no')
2888 ifelse(-`01234567890123456789-', `-'e(long)`-', `yes', `no')
2890 ifelse(`-'e(long)`-', `-01234567890123456789', `yes', `no')
2892 ifelse(`-01234567890123456789', `-'e(long)`-', `yes', `no')
2897 Naturally, the normal case will be slightly more advanced than these
2898 examples. A common use of @code{ifelse} is in macros implementing loops of various kinds.
2902 @section Recursion in @code{m4}
2904 @cindex recursive macros
2905 @cindex macros, recursive
2906 There is no direct support for loops in @code{m4}, but macros can be
2907 recursive. There is no limit on the number of recursion levels, other
2908 than those enforced by your hardware and operating system.
2911 Loops can be programmed using recursion and the conditionals described in previous sections.
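
As a minimal sketch of this technique, the hypothetical macro
@code{countdown} below calls itself through @code{$0}, using
@code{ifelse} to stop at zero and the builtin @code{decr} to decrement
its argument:

@example
define(`countdown', `ifelse(`$1', `0', `liftoff', `$1 $0(decr(`$1'))')')
@result{}
countdown(`3')
@result{}3 2 1 liftoff
@end example

The recursive call sits inside the quoted fourth argument of
@code{ifelse}, so it is expanded only when the comparison fails; this is
what terminates the loop.
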
2914 There is a builtin macro, @code{shift}, which can, among other things,
2915 be used for iterating through the actual arguments to a macro:
2917 @deffn Builtin shift (@var{arg1}, @dots{})
2918 Takes any number of arguments, and expands to all its arguments except
2919 @var{arg1}, separated by commas, with each argument quoted.
2921 The macro @code{shift} is recognized only with parameters.
2929 shift(`foo', `bar', `baz')
2933 An example of the use of @code{shift} is this macro:
2935 @cindex reversing arguments
2936 @cindex arguments, reversing
2937 @deffn Composite reverse (@dots{})
2938 Takes any number of arguments, and reverses their order.
2941 It is implemented as:
2944 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
2945 `reverse(shift($@@)), `$1'')')
2951 reverse(`foo', `bar', `gnats', `and gnus')
2952 @result{}and gnus, gnats, bar, foo
2955 While not a very interesting macro, it does show how simple loops can be
2956 made with @code{shift}, @code{ifelse} and recursion. It also shows
2957 that @code{shift} is usually used with @samp{$@@}. Another example of
2958 this is an implementation of a short-circuiting conditional operator.
2960 @cindex short-circuiting conditional
2961 @cindex conditional, short-circuiting
2962 @deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @
2963 @ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal})
2964 Similar to @code{ifelse}, where an equal comparison between the first
2965 two strings results in the third, otherwise the first three arguments
2966 are discarded and the process repeats. The difference is that each
2967 @var{test-<n>} is expanded only when it is encountered. This means that
2968 every third argument to @code{cond} is normally given one more level of
2969 quoting than the corresponding argument to @code{ifelse}.
2972 Here is the implementation of @code{cond}, along with a demonstration of
2973 how it can short-circuit the side effects in @code{side}. Notice how
2974 all the unquoted side effects happen regardless of how many comparisons
2975 are made with @code{ifelse}, compared with only the relevant effects
2980 `ifelse(`$#', `1', `$1',
2981 `ifelse($1, `$2', `$3',
2982 `$0(shift(shift(shift($@@))))')')')dnl
2983 define(`side', `define(`counter', incr(counter))$1')dnl
2985 `define(`counter', `0')dnl
2986 ifelse(side(`$1'), `yes', `one comparison: ',
2987 side(`$1'), `no', `two comparisons: ',
2988 side(`$1'), `maybe', `three comparisons: ',
2989 `side(`default answer: ')')counter')dnl
2991 `define(`counter', `0')dnl
2992 cond(`side(`$1')', `yes', `one comparison: ',
2993 `side(`$1')', `no', `two comparisons: ',
2994 `side(`$1')', `maybe', `three comparisons: ',
2995 `side(`default answer: ')')counter')dnl
2997 @result{}one comparison: 3
2999 @result{}two comparisons: 3
3001 @result{}three comparisons: 3
3002 example1(`feeling rather indecisive today')
3003 @result{}default answer: 4
3005 @result{}one comparison: 1
3007 @result{}two comparisons: 2
3009 @result{}three comparisons: 3
3010 example2(`feeling rather indecisive today')
3011 @result{}default answer: 4
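
As a further sketch of the short-circuit behavior, consider a
hypothetical macro @code{classify} (the name and its tests are
assumptions made for illustration, not part of the distributed
examples). Because each test is quoted, @code{eval} runs only for the
comparisons that are actually reached:

@example
define(`cond',
`ifelse(`$#', `1', `$1',
        `ifelse($1, `$2', `$3',
                `$0(shift(shift(shift($@@))))')')')dnl
define(`classify',
`cond(`eval(`$1 < 0')', `1', `negative',
      `eval(`$1 == 0')', `1', `zero',
      `positive')')dnl
classify(`-5') classify(`0') classify(`7')
@result{}negative zero positive
@end example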
3014 @cindex joining arguments
3015 @cindex arguments, joining
3016 @cindex concatenating arguments
3017 Another common task that requires iteration is joining a list of
3018 arguments into a single string.
3020 @deffn Composite join (@ovar{separator}, @ovar{args@dots{}})
3021 @deffnx Composite joinall (@ovar{separator}, @ovar{args@dots{}})
3022 Generate a single-quoted string, consisting of each @var{arg} separated
3023 by @var{separator}. While @code{joinall} always outputs a
3024 @var{separator} between arguments, @code{join} avoids the
3025 @var{separator} for an empty @var{arg}.
3028 Here are some examples of their usage, based on the implementation
3029 @file{m4-@value{VERSION}/@/examples/@/join.m4} distributed in this
3034 $ @kbd{m4 -I examples}
3037 join,join(`-'),join(`-', `'),join(`-', `', `')
3039 joinall,joinall(`-'),joinall(`-', `'),joinall(`-', `', `')
3043 join(`-', `1', `2', `3')
3045 join(`', `1', `2', `3')
3047 join(`-', `', `1', `', `', `2', `')
3049 joinall(`-', `', `1', `', `', `2', `')
3051 join(`,', `1', `2', `3')
3053 define(`nargs', `$#')dnl
3054 nargs(join(`,', `1', `2', `3'))
3058 Examining the implementation shows some interesting points about several
3059 m4 programming idioms.
3063 $ @kbd{m4 -I examples}
3064 undivert(`join.m4')dnl
3065 @result{}divert(`-1')
3066 @result{}# join(sep, args) - join each non-empty ARG into a single
3067 @result{}# string, with each element separated by SEP
3068 @result{}define(`join',
3069 @result{}`ifelse(`$#', `2', ``$2'',
3070 @result{} `ifelse(`$2', `', `', ``$2'_')$0(`$1', shift(shift($@@)))')')
3071 @result{}define(`_join',
3072 @result{}`ifelse(`$#$2', `2', `',
3073 @result{} `ifelse(`$2', `', `', ``$1$2'')$0(`$1', shift(shift($@@)))')')
3074 @result{}# joinall(sep, args) - join each ARG, including empty ones,
3075 @result{}# into a single string, with each element separated by SEP
3076 @result{}define(`joinall', ``$2'_$0(`$1', shift($@@))')
3077 @result{}define(`_joinall',
3078 @result{}`ifelse(`$#', `2', `', ``$1$3'$0(`$1', shift(shift($@@)))')')
3079 @result{}divert`'dnl
3082 First, notice that this implementation creates helper macros
3083 @code{_join} and @code{_joinall}. This division of labor makes it
3084 easier to output the correct number of @var{separator} instances:
3085 @code{join} and @code{joinall} are responsible for the first argument,
3086 without a separator, while @code{_join} and @code{_joinall} are
3087 responsible for all remaining arguments, always outputting a separator
3088 when outputting an argument.
3090 Next, observe how @code{join} decides whether to iterate to itself, because the
3091 first @var{arg} was empty, or to output the argument and swap over to
3092 @code{_join}. If the argument is non-empty, then the nested
3093 @code{ifelse} results in an unquoted @samp{_}, which is concatenated
3094 with the @samp{$0} to form the next macro name to invoke. The
3095 @code{joinall} implementation is simpler since it does not have to
3096 suppress empty @var{arg}; it always executes once then defers to
3099 Another important idiom is the idea that @var{separator} is reused for
3100 each iteration. Each iteration has one less argument, but rather than
3101 discarding @samp{$1} by iterating with @code{$0(shift($@@))}, the macro
3102 discards @samp{$2} by using @code{$0(`$1', shift(shift($@@)))}.
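
A minimal sketch of this idiom, using a hypothetical macro @code{tag}
(an illustration only, not one of the distributed examples), keeps
@samp{$1} fixed as a prefix while consuming one remaining argument per
iteration:

@example
define(`tag', `ifelse(`$#', `1', `', `$#', `2', ``$1$2'',
  ``$1$2' $0(`$1', shift(shift($@@)))')')dnl
tag(`x=', `1', `2', `3')
@result{}x=1 x=2 x=3
@end example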
3104 Next, notice that it is possible to compare more than one condition in a
3105 single @code{ifelse} test. The test of @samp{$#$2} against @samp{2}
3106 allows @code{_join} to iterate for two separate reasons---either there
3107 are still more than two arguments, or there are exactly two arguments
3108 but the last argument is not empty.
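
The same technique applies whenever two values must both match. In
this sketch, the macro name @code{both_yes} and the @samp{:} delimiter
are assumptions chosen for illustration; the delimiter keeps the pair
@samp{ye} and @samp{syes} from being mistaken for a match:

@example
define(`both_yes', `ifelse(`$1:$2', `yes:yes', `both', `not both')')dnl
both_yes(`yes', `yes')
@result{}both
both_yes(`yes', `no')
@result{}not both
@end example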
3110 Finally, notice that these macros require exactly two arguments to
3111 terminate recursion, but that they still correctly result in empty
3112 output when given no @var{args} (i.e., zero or one macro argument). On
3113 the first pass when there are too few arguments, the @code{shift}
3114 results in no output, but leaves an empty string to serve as the
3115 required second argument for the second pass. Put another way,
3116 @samp{`$1', shift($@@)} is not the same as @samp{$@@}, since only the
3117 former guarantees at least two arguments.
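
The difference can be checked by counting arguments. The helper names
@code{nargs}, @code{direct} and @code{padded} below are introduced only
for this sketch:

@example
define(`nargs', `$#')dnl
define(`direct', `nargs($@@)')dnl
define(`padded', `nargs(`$1', shift($@@))')dnl
direct(`a') padded(`a')
@result{}1 2
direct(`a', `b') padded(`a', `b')
@result{}2 2
@end example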
3119 @cindex quote manipulation
3120 @cindex manipulating quotes
3121 Sometimes, a recursive algorithm requires adding quotes to each element,
3122 or treating multiple arguments as a single element:
3124 @deffn Composite quote (@dots{})
3125 @deffnx Composite dquote (@dots{})
3126 @deffnx Composite dquote_elt (@dots{})
3127 Takes any number of arguments, and adds quoting. With @code{quote},
3128 only one level of quoting is added, effectively removing whitespace
3129 after commas and turning multiple arguments into a single string. With
3130 @code{dquote}, two levels of quoting are added, one around each element,
3131 and one around the list. And with @code{dquote_elt}, two levels of
3132 quoting are added around each element.
3135 An actual implementation of these three macros is distributed as
3136 @file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package. First,
3137 let's examine their usage:
3141 $ @kbd{m4 -I examples}
3144 -quote-dquote-dquote_elt-
3146 -quote()-dquote()-dquote_elt()-
3148 -quote(`1')-dquote(`1')-dquote_elt(`1')-
3149 @result{}-1-`1'-`1'-
3150 -quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
3151 @result{}-1,2-`1',`2'-`1',`2'-
3152 define(`n', `$#')dnl
3153 -n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
3155 dquote(dquote_elt(`1', `2'))
3156 @result{}``1'',``2''
3157 dquote_elt(dquote(`1', `2'))
3161 The last two lines show that when given two arguments, @code{dquote}
3162 results in one string, while @code{dquote_elt} results in two. Now,
3163 examine the implementation. Note that @code{quote} and
3164 @code{dquote_elt} make decisions based on their number of arguments, so
3165 that when called without arguments, they result in nothing instead of a
3166 quoted empty string; this is so that it is possible to distinguish
3167 between no arguments and an empty first argument. @code{dquote}, on the
3168 other hand, results in a string no matter what, since it is still
3169 possible to tell whether it was invoked without arguments based on the
3174 $ @kbd{m4 -I examples}
3175 undivert(`quote.m4')dnl
3176 @result{}divert(`-1')
3177 @result{}# quote(args) - convert args to single-quoted string
3178 @result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
3179 @result{}# dquote(args) - convert args to quoted list of quoted strings
3180 @result{}define(`dquote', ``$@@'')
3181 @result{}# dquote_elt(args) - convert args to list of double-quoted strings
3182 @result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
3183 @result{} ```$1'',$0(shift($@@))')')
3184 @result{}divert`'dnl
3187 It is worth pointing out that @samp{quote(@var{args})} is more efficient
3188 than @samp{joinall(`,', @var{args})} for producing the same output.
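
Assuming both example files are on the include path, the matching
output can be checked as sketched below. The sketch compares results
only; it does not measure efficiency. The difference in cost follows
from the implementations shown earlier: @code{quote} expands once,
while @code{joinall} recurses once per argument.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`join.m4')include(`quote.m4')
@result{}
quote(`a', `b', `c')
@result{}a,b,c
joinall(`,', `a', `b', `c')
@result{}a,b,c
@end example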
3190 @cindex nine arguments, more than
3191 @cindex more than nine arguments
3192 @cindex arguments, more than nine
3193 One more useful macro based on @code{shift} allows portably selecting
3194 an arbitrary argument (usually greater than the ninth argument), without
3195 relying on the GNU extension of multi-digit arguments
3196 (@pxref{Arguments}).
3198 @deffn Composite argn (@var{n}, @dots{})
3199 Expands to argument @var{n} out of the remaining arguments. @var{n}
3200 must be a positive number. Usually invoked as
3201 @samp{argn(`@var{n}',$@@)}.
3204 It is implemented as:
3207 define(`argn', `ifelse(`$1', 1, ``$2'',
3208 `argn(decr(`$1'), shift(shift($@@)))')')
3212 define(`foo', `argn(`11', $@@)')
3214 foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
3219 @section Iteration by counting
3222 @cindex loops, counting
3223 @cindex counting loops
3224 Here is an example of a loop macro that implements a simple for loop.
3226 @deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
3227 Takes the name in @var{iterator}, which must be a valid macro name, and
3228 successively assigns it each integer value from @var{start} to @var{end},
3229 inclusive. For each assignment to @var{iterator}, appends @var{text} to
3230 the expansion of the @code{forloop}. @var{text} may refer to
3231 @var{iterator}. Any definition of @var{iterator} prior to this
3232 invocation is restored.
3235 It can, for example, be used for simple counting:
3239 $ @kbd{m4 -I examples}
3240 include(`forloop.m4')
3242 forloop(`i', `1', `8', `i ')
3243 @result{}1 2 3 4 5 6 7 8@w{ }
3246 For-loops can be nested, like:
3250 $ @kbd{m4 -I examples}
3251 include(`forloop.m4')
3253 forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
3255 @result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
3256 @result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
3257 @result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
3258 @result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
3262 The implementation of the @code{forloop} macro is fairly
3263 straightforward. The @code{forloop} macro itself is simply a wrapper,
3264 which saves the previous definition of the first argument, calls the
3265 internal macro @code{@w{_forloop}}, and re-establishes the saved
3266 definition of the first argument.
3268 The macro @code{@w{_forloop}} expands the fourth argument once, and
3269 tests to see if the iterator has reached the final value. If it has
3270 not finished, it increments the iterator (using the predefined macro
3271 @code{incr}, @pxref{Incr}), and recurses.
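
The save-and-restore step performed by the wrapper can be observed
directly. In this sketch, the prior definition of @samp{i} as
@samp{outer} is an assumption made for illustration; it is hidden while
the loop runs and visible again afterwards:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
define(`i', `outer')
@result{}
forloop(`i', `1', `3', `<i>')i
@result{}<1><2><3>outer
@end example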
3273 Here is an actual implementation of @code{forloop}, distributed as
3274 @file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
3278 $ @kbd{m4 -I examples}
3279 undivert(`forloop.m4')dnl
3280 @result{}divert(`-1')
3281 @result{}# forloop(var, from, to, stmt) - simple version
3282 @result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
3283 @result{}define(`_forloop',
3284 @result{} `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
3285 @result{}divert`'dnl
3288 Notice the careful use of quotes. Certain macro arguments are left
3289 unquoted, each for its own reason. Try to find out @emph{why} these
3290 arguments are left unquoted, and see what happens if they are quoted.
3291 (As presented, these two macros are useful but not very robust for
3292 general use. They lack even basic error handling for cases like
3293 @var{start} greater than @var{end}, @var{end} not numeric, or
3294 @var{iterator} not being a macro name. See if you can improve these
3295 macros; or @pxref{Improved forloop, , Answers}).
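
One possible guard is sketched here, under the assumption that
@var{start} and @var{end} are both valid @code{eval} expressions. The
wrapper name @code{guarded_forloop} is hypothetical. It addresses only
the case of @var{start} exceeding @var{end}, where the simple
@code{forloop} would keep incrementing past @var{end}, and it is not a
substitute for the more thorough treatment in the answers.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
define(`guarded_forloop',
  `ifelse(eval(`($2) <= ($3)'), `1', `forloop($@@)')')
@result{}
guarded_forloop(`i', `1', `3', `<i>')
@result{}<1><2><3>
guarded_forloop(`i', `3', `1', `<i>')
@result{}
@end example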
3298 @section Iteration by list contents
3300 @cindex for each loops
3301 @cindex loops, list iteration
3302 @cindex iterating over lists
3303 Here is an example of a loop macro that implements list iteration.
3305 @deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
3306 @deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
3307 Takes the name in @var{iterator}, which must be a valid macro name, and
3308 successively assigns it each value from @var{paren-list} or
3309 @var{quote-list}. In @code{foreach}, @var{paren-list} is a
3310 comma-separated list of elements contained in parentheses. In
3311 @code{foreachq}, @var{quote-list} is a comma-separated list of elements
3312 contained in a quoted string. For each assignment to @var{iterator},
3313 appends @var{text} to the overall expansion. @var{text} may refer to
3314 @var{iterator}. Any definition of @var{iterator} prior to this
3315 invocation is restored.
3318 As an example, this displays each word in a list inside of a sentence,
3319 using an implementation of @code{foreach} distributed as
3320 @file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
3321 in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
3325 $ @kbd{m4 -I examples}
3326 include(`foreach.m4')
3328 foreach(`x', (foo, bar, foobar), `Word was: x
3330 @result{}Word was: foo
3331 @result{}Word was: bar
3332 @result{}Word was: foobar
3333 include(`foreachq.m4')
3335 foreachq(`x', `foo, bar, foobar', `Word was: x
3337 @result{}Word was: foo
3338 @result{}Word was: bar
3339 @result{}Word was: foobar
3342 It is possible to be more complex; each element of the @var{paren-list}
3343 or @var{quote-list} can itself be a list, to pass as further arguments
3344 to a helper macro. This example generates a shell case statement:
3348 $ @kbd{m4 -I examples}
3349 include(`foreach.m4')
3351 define(`_case', ` $1)
3354 define(`_cat', `$1$2')dnl
3357 foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
3358 `_cat(`_case', x)')dnl
3360 @result{} vara=" a";;
3362 @result{} varb=" b";;
3364 @result{} varc=" c";;
3369 The implementation of the @code{foreach} macro is a bit more involved;
3370 it is a wrapper around two helper macros. First, @code{@w{_arg1}} is
3371 needed to grab the first element of a list. Second,
3372 @code{@w{_foreach}} implements the recursion, successively walking
3373 through the original list. Here is a simple implementation of
3378 $ @kbd{m4 -I examples}
3379 undivert(`foreach.m4')dnl
3380 @result{}divert(`-1')
3381 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
3382 @result{}# parenthesized list, simple version
3383 @result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
3384 @result{}define(`_arg1', `$1')
3385 @result{}define(`_foreach', `ifelse(`$2', `()', `',
3386 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
3387 @result{}divert`'dnl
3390 Unfortunately, that implementation is not robust to macro names as list
3391 elements. Each iteration of @code{@w{_foreach}} strips another
3392 layer of quotes, leading to erratic results if list elements are not
3393 already fully expanded. The first cut at implementing @code{foreachq}
3394 takes this into account. Also, when using quoted elements in a
3395 @var{paren-list}, the overall list must be quoted. A @var{quote-list}
3396 has the nice property of requiring fewer characters to create a list
3397 containing the same quoted elements. To see the difference between the
3398 two macros, we attempt to pass double-quoted macro names in a list,
3399 expecting the macro name on output after one layer of quotes is removed
3400 during list iteration and the final layer removed during the final
3405 $ @kbd{m4 -I examples}
3406 define(`a', `1')define(`b', `2')define(`c', `3')
3408 include(`foreach.m4')
3410 include(`foreachq.m4')
3412 foreach(`x', `(``a'', ``(b'', ``c)'')', `x
3419 foreachq(`x', ```a'', ``(b'', ``c)''', `x
3426 Obviously, @code{foreachq} did a better job; here is its implementation:
3430 $ @kbd{m4 -I examples}
3431 undivert(`foreachq.m4')dnl
3432 @result{}include(`quote.m4')dnl
3433 @result{}divert(`-1')
3434 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
3435 @result{}# quoted list, simple version
3436 @result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
3437 @result{}define(`_arg1', `$1')
3438 @result{}define(`_foreachq', `ifelse(quote($2), `', `',
3439 @result{} `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
3440 @result{}divert`'dnl
3443 Notice that @code{@w{_foreachq}} had to use the helper macro
3444 @code{quote} defined earlier (@pxref{Shift}), to ensure that the
3445 embedded @code{ifelse} call does not go haywire if a list element
3446 contains a comma. Unfortunately, this implementation of @code{foreachq}
3447 has its own severe flaw. Whereas the @code{foreach} implementation was
3448 linear, this macro is quadratic in the number of list elements, and is
3449 much more likely to trip up the limit set by the command line option
3450 @option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
3451 Invoking m4}). Additionally, this implementation does not expand
3452 @samp{defn(`@var{iterator}')} very well, when compared with
3457 $ @kbd{m4 -I examples}
3458 include(`foreach.m4')include(`foreachq.m4')
3460 foreach(`name', `(`a', `b')', ` defn(`name')')
3462 foreachq(`name', ``a', `b'', ` defn(`name')')
3463 @result{} _arg1(`a', `b') _arg1(shift(`a', `b'))
3466 It is possible to have robust iteration with linear behavior and sane
3467 @var{iterator} contents for either list style. See if you can learn
3468 from the best elements of both of these implementations to create robust
3469 macros (or @pxref{Improved foreach, , Answers}).
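
With either list style, @var{text} can also carry side effects rather
than produce output. As a small sketch using the simple
@code{foreachq} shown above, the hypothetical macro @code{total} (a
name assumed for this illustration) accumulates the sum of a list of
numbers:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`foreachq.m4')
@result{}
define(`total', `0')
@result{}
foreachq(`n', `1, 2, 3', `define(`total', eval(total + n))')total
@result{}6
@end example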
3472 @section Working with definition stacks
3474 @cindex definition stack
3475 @cindex pushdef stack
3476 @cindex stack, macro definition
3477 Thanks to @code{pushdef}, manipulation of a stack is an intrinsic
3478 operation in @code{m4}. Normally, only the topmost definition in a
3479 stack is important, but sometimes, it is desirable to manipulate the
3480 entire definition stack.
3482 @deffn Composite stack_foreach (@var{macro}, @var{action})
3483 @deffnx Composite stack_foreach_lifo (@var{macro}, @var{action})
3484 For each of the @code{pushdef} definitions associated with @var{macro},
3485 invoke the macro @var{action} with a single argument of that definition.
3486 @code{stack_foreach} visits the oldest definition first, while
3487 @code{stack_foreach_lifo} visits the current definition first.
3488 @var{action} should not modify or dereference @var{macro}. There are a
3489 few special macros, such as @code{defn}, which cannot be used as the
3490 @var{macro} parameter.
3493 A sample implementation of these macros is distributed in the file
3494 @file{m4-@value{VERSION}/@/examples/@/stack.m4}.
3498 $ @kbd{m4 -I examples}
3501 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3503 define(`show', ``$1'
3506 stack_foreach(`a', `show')dnl
3510 stack_foreach_lifo(`a', `show')dnl
3516 Now for the implementation. Note the definition of a helper macro,
3517 @code{_stack_reverse}, which destructively swaps the contents of one
3518 stack of definitions into the reverse order in the temporary macro
3519 @samp{tmp-$1}. By calling the helper twice, the original order is
3520 restored back into the macro @samp{$1}; since the operation is
3521 destructive, this explains why @samp{$1} must not be modified or
3522 dereferenced during the traversal. The caller can then inject
3523 additional code to pass the definition currently being visited to
3524 @samp{$2}. The choice of helper names is intentional; since @samp{-} is
3525 not valid as part of a macro name, there is no risk of conflict with a
3526 valid macro name, and the code is guaranteed to use @code{defn} where
3527 necessary. Finally, note that any macro used in the traversal of a
3528 @code{pushdef} stack, such as @code{pushdef} or @code{defn}, cannot be
3529 handled by @code{stack_foreach}, since the macro would temporarily be
3530 undefined during the algorithm.
3534 $ @kbd{m4 -I examples}
3535 undivert(`stack.m4')dnl
3536 @result{}divert(`-1')
3537 @result{}# stack_foreach(macro, action)
3538 @result{}# Invoke ACTION with a single argument of each definition
3539 @result{}# from the definition stack of MACRO, starting with the oldest.
3540 @result{}define(`stack_foreach',
3541 @result{}`_stack_reverse(`$1', `tmp-$1')'dnl
3542 @result{}`_stack_reverse(`tmp-$1', `$1', `$2(defn(`$1'))')')
3543 @result{}# stack_foreach_lifo(macro, action)
3544 @result{}# Invoke ACTION with a single argument of each definition
3545 @result{}# from the definition stack of MACRO, starting with the newest.
3546 @result{}define(`stack_foreach_lifo',
3547 @result{}`_stack_reverse(`$1', `tmp-$1', `$2(defn(`$1'))')'dnl
3548 @result{}`_stack_reverse(`tmp-$1', `$1')')
3549 @result{}define(`_stack_reverse',
3550 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0($@@)')')
3551 @result{}divert`'dnl
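
Because the helper is called twice, the stack ends up in its original
order once the traversal is complete. The sketch below checks this with
two hypothetical macros, @code{depth} and @code{bump} (names assumed for
illustration), which count the definitions visited. The action ignores
its argument and never touches @samp{a}, as required:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`stack.m4')
@result{}
pushdef(`a', `x')pushdef(`a', `y')pushdef(`a', `z')
@result{}
define(`depth', `0')define(`bump', `define(`depth', incr(depth))')
@result{}
stack_foreach(`a', `bump')depth a
@result{}3 z
@end example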
3555 @section Building macros with macros
3557 @cindex macro composition
3558 @cindex composing macros
3559 Since m4 is a macro language, it is possible to write macros that
3560 can build other macros. First on the list is a way to automate the
3561 creation of blind macros.
3563 @cindex macro, blind
3565 @deffn Composite define_blind (@var{name}, @ovar{value})
3566 Defines @var{name} as a blind macro, such that @var{name} will expand to
3567 @var{value} only when given explicit arguments. @var{value} should not
3568 be the result of @code{defn} (@pxref{Defn}). This macro is only
3569 recognized with parameters, and results in an empty string.
3572 Defining a macro to define another macro can be a bit tricky. We want
3573 to use a literal @samp{$#} in the argument to the nested @code{define}.
3574 However, if @samp{$} and @samp{#} are adjacent in the definition of
3575 @code{define_blind}, then it would be expanded as the number of
3576 arguments to @code{define_blind} rather than the intended number of
3577 arguments to @var{name}. The solution is to pass the difficult
3578 characters through extra arguments to a helper macro
3579 @code{_define_blind}. When composing macros, it is a common idiom to
3580 need a helper macro to concatenate text that forms parameters in the
3581 composed macro, rather than interpreting the text as a parameter of the
3584 As for the limitation against using @code{defn}, there are two reasons.
3585 If a macro was previously defined with @code{define_blind}, then it can
3586 safely be renamed to a new blind macro using plain @code{define}; using
3587 @code{define_blind} to rename it just adds another layer of
3588 @code{ifelse}, occupying memory and slowing down execution. And if a
3589 macro is a builtin, then it would result in an attempt to define a macro
3590 consisting of both text and a builtin token; this is not supported, and
3591 the builtin token is flattened to an empty string.
3593 With that explanation, here's the definition, and some sample usage.
3594 Notice that @code{define_blind} is itself a blind macro.
3598 define(`define_blind', `ifelse(`$#', `0', ``$0'',
3599 `_$0(`$1', `$2', `$'`#', `$'`0')')')
3601 define(`_define_blind', `define(`$1',
3602 `ifelse(`$3', `0', ``$4'', `$2')')')
3605 @result{}define_blind
3606 define_blind(`foo', `arguments were $*')
3611 @result{}arguments were bar
3612 define(`blah', defn(`foo'))
3617 @result{}arguments were a,b
3619 @result{}ifelse(`$#', `0', ``$0'', `arguments were $*')
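
The splitting of @samp{$} and @samp{#} can also be seen in isolation.
In this sketch, the hypothetical macro @code{make_nargs} (a name assumed
for illustration) defines its argument as a macro that expands to its
own argument count. Since no nested @code{ifelse} is involved, the two
characters can be quoted separately right in the body, without a helper
macro:

@example
define(`make_nargs', `define(`$1', `$'`#')')dnl
make_nargs(`cnt')dnl
cnt(`a', `b', `c')
@result{}3
cnt
@result{}0
@end example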
3622 @cindex currying arguments
3623 @cindex argument currying
3624 Another interesting composition tactic is argument @dfn{currying}, or
3625 factoring a macro that takes multiple arguments for use in a context
3626 that provides exactly one argument.
3628 @deffn Composite curry (@var{macro}, @dots{})
3629 Expand to a macro call that takes exactly one argument, then appends
3630 that argument to the original arguments and invokes @var{macro} with the
3631 resulting list of arguments.
3634 A demonstration of currying makes the intent of this macro a little more
3635 obvious. The macro @code{stack_foreach} mentioned earlier is an example
3636 of a context that provides exactly one argument to a macro name. But
3637 coupled with currying, we can invoke @code{reverse} with two arguments
3638 for each definition of a macro stack. This example uses the file
3639 @file{m4-@value{VERSION}/@/examples/@/curry.m4} included in the
3644 $ @kbd{m4 -I examples}
3645 include(`curry.m4')include(`stack.m4')
3647 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
3648 `reverse(shift($@@)), `$1'')')
3650 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3652 stack_foreach(`a', `:curry(`reverse', `4')')
3653 @result{}:1, 4:2, 4:3, 4
3654 curry(`curry', `reverse', `1')(`2')(`3')
3658 Now for the implementation. Notice how @code{curry} leaves off with a
3659 macro name but no open parenthesis, while still in the middle of
3660 collecting arguments for @samp{$1}. The macro @code{_curry} is the
3661 helper macro that takes one argument, then adds it to the list and
3662 finally supplies the closing parenthesis. The use of a comma inside the
3663 @code{shift} call allows currying to also work for a macro that takes
3664 one argument, although it often makes more sense to invoke that macro
3665 directly rather than going through @code{curry}.
3669 $ @kbd{m4 -I examples}
3670 undivert(`curry.m4')dnl
3671 @result{}divert(`-1')
3672 @result{}# curry(macro, args)
3673 @result{}# Expand to a macro call that takes one argument, then invoke
3674 @result{}# macro(args, extra).
3675 @result{}define(`curry', `$1(shift($@@,)_$0')
3676 @result{}define(`_curry', ``$1')')
3677 @result{}divert`'dnl
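
A smaller sketch may make the hand-off between @code{curry} and
@code{_curry} easier to follow. The two-argument macro @code{pair} is
hypothetical, defined only for this illustration; @code{curry} supplies
its first argument, and the parenthesized text that follows supplies
the second:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`curry.m4')
@result{}
define(`pair', `<$1, $2>')
@result{}
curry(`pair', `left')(`right')
@result{}<left, right>
@end example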
3680 Unfortunately, with M4 1.4.x, @code{curry} is unable to handle builtin
3681 tokens, which are silently flattened to the empty string when passed
3682 through another text macro. This limitation will be lifted in a future
3685 @cindex renaming macros
3686 @cindex copying macros
3687 @cindex macros, copying
3688 Putting the last few concepts together, it is possible to copy or rename
3689 an entire stack of macro definitions.
3691 @deffn Composite copy (@var{source}, @var{dest})
3692 @deffnx Composite rename (@var{source}, @var{dest})
3693 Ensure that @var{dest} is undefined, then define it to the same stack of
3694 definitions currently in @var{source}. @code{copy} leaves @var{source}
3695 unchanged, while @code{rename} undefines @var{source}. There are only a
3696 few macros, such as @code{copy} or @code{defn}, which cannot be copied
3700 The implementation is relatively straightforward (although since it uses
3701 @code{curry}, it is unable to copy builtin macros, such as the second
3702 definition of @code{a} as a synonym for @code{divnum}. See if you can
3703 design a version that works around this limitation, or @pxref{Improved
3708 $ @kbd{m4 -I examples}
3709 include(`curry.m4')include(`stack.m4')
3711 define(`rename', `copy($@@)undefine(`$1')')dnl
3712 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
3714 `stack_foreach(`$1', `curry(`pushdef', `$2')')')')dnl
3715 pushdef(`a', `1')pushdef(`a', defn(`divnum'))pushdef(`a', `2')
3730 @chapter How to debug macros and input
3732 @cindex debugging macros
3733 @cindex macros, debugging
3734 Macros written for @code{m4} often do not work as intended on
3735 the first try (as is the case with most programming languages).
3736 Fortunately, there is support for macro debugging in @code{m4}.
3739 * Dumpdef:: Displaying macro definitions
3740 * Trace:: Tracing macro calls
3741 * Debug Levels:: Controlling debugging output
3742 * Debug Output:: Saving debugging output
3746 @section Displaying macro definitions
3748 @cindex displaying macro definitions
3749 @cindex macros, displaying definitions
3750 @cindex definitions, displaying macro
3751 @cindex standard error, output to
3752 If you want to see what a name expands into, you can use the builtin
3755 @deffn Builtin dumpdef (@ovar{names@dots{}})
3756 Accepts any number of arguments. If called without any arguments,
3757 it displays the definitions of all known names, otherwise it displays
3758 the definitions of the @var{names} given. The output is printed to the
3759 current debug file (usually standard error), and is sorted by name. If
3760 an unknown name is encountered, a warning is printed.
3762 The expansion of @code{dumpdef} is void.
3767 define(`foo', `Hello world.')
3770 @error{}foo:@tabchar{}`Hello world.'
3773 @error{}define:@tabchar{}<define>
3777 The last example shows how builtin macro definitions are displayed.
3778 The definition that is dumped corresponds to what would occur if the
3779 macro were to be called at that point, even if other definitions are
3780 still live due to redefining a macro during argument collection.
3784 pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
3786 f(popdef(`f')dumpdef(`f'))
3787 @error{}f:@tabchar{}``$0'1'
3789 f(popdef(`f')dumpdef(`f'))
3790 @error{}m4:stdin:3: undefined macro `f'
3794 @xref{Debug Levels}, for information on controlling the details of the
3798 @section Tracing macro calls
3800 @cindex tracing macro expansion
3801 @cindex macro expansion, tracing
3802 @cindex expansion, tracing macro
3803 @cindex standard error, output to
3804 It is possible to trace macro calls and expansions through the builtins
3805 @code{traceon} and @code{traceoff}:
3807 @deffn Builtin traceon (@ovar{names@dots{}})
3808 @deffnx Builtin traceoff (@ovar{names@dots{}})
3809 When called without any arguments, @code{traceon} and @code{traceoff}
3810 will turn tracing on and off, respectively, for all currently defined
3813 When called with arguments, only the macros listed in @var{names} are
3814 affected, whether or not they are currently defined.
3816 The expansion of @code{traceon} and @code{traceoff} is void.
3819 Whenever a traced macro is called and the arguments have been collected,
3820 the call is displayed. If the expansion of the macro call is not void,
3821 the expansion can be displayed after the call. The output is printed
3822 to the current debug file (defaulting to standard error, @pxref{Debug
3827 define(`foo', `Hello World.')
3829 define(`echo', `$@@')
3831 traceon(`foo', `echo')
3834 @error{}m4trace: -1- foo -> `Hello World.'
3835 @result{}Hello World.
3836 echo(`gnus', `and gnats')
3837 @error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
3838 @result{}gnus,and gnats
3841 The number between dashes is the depth of the expansion. It is one most
3842 of the time, signifying an expansion at the outermost level, but it
3843 increases when macro arguments contain unquoted macro calls. The
3844 maximum number that will appear between dashes is controlled by the
3845 option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
3846 , Invoking m4}). Additionally, the option @option{--trace} (or
3847 @option{-t}) can be used to invoke @code{traceon(@var{name})} before
3850 @comment The explicit -dp neutralizes the testsuite default of -d.
3851 @comment options: -dp -L3 -tifelse
3854 $ @kbd{m4 -L 3 -t ifelse}
3856 @error{}m4trace: -1- ifelse
3858 ifelse(ifelse(ifelse(`three levels')))
3859 @error{}m4trace: -3- ifelse
3860 @error{}m4trace: -2- ifelse
3861 @error{}m4trace: -1- ifelse
3863 ifelse(ifelse(ifelse(ifelse(`four levels'))))
3864 @error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
3867 Tracing by name is an attribute that is preserved whether the macro is
3868 defined or not. This allows the selection of macros to trace before
3869 those macros are defined.
3881 define(`foo', `bar')
3884 @error{}m4trace: -1- foo -> `bar'
3888 ifdef(`foo', `yes', `no')
3891 @error{}m4:stdin:9: undefined macro `foo'
3893 define(`foo', `blah')
3896 @error{}m4trace: -1- foo -> `blah'
3904 Tracing even works on builtins. However, @code{defn} (@pxref{Defn})
3905 does not transfer tracing status.
3912 @error{}m4trace: -1- traceon(`traceoff')
3914 traceoff(`traceoff')
3915 @error{}m4trace: -1- traceoff(`traceoff')
3919 traceon(`eval', `m4_divnum')
3921 define(`m4_eval', defn(`eval'))
3923 define(`m4_divnum', defn(`divnum'))
3926 @error{}m4trace: -1- eval(`0') -> `0'
3929 @error{}m4trace: -2- m4_divnum -> `0'
3933 @xref{Debug Levels}, for information on controlling the details of the
3934 display. The format of the trace output is not specified by
3935 POSIX, and varies between implementations of @code{m4}.
3938 @comment not worth including in the manual, but this tests a trace code
3939 @comment path that was temporarily broken
3940 @comment options: -de --trace ifelse
3942 $ @kbd{m4 -de --trace ifelse}
3943 define(`e', `ifelse(`$1', `$2', `ifelse(`$1', `$2', `e(shift($@@))')')')
3946 @error{}m4trace: -1- ifelse -> ifelse(`1', `1', `e(shift(`1',`1'))')
3947 @error{}m4trace: -1- ifelse -> e(shift(`1',`1'))
3948 @error{}m4trace: -1- ifelse
3954 @section Controlling debugging output
3956 @cindex controlling debugging output
3957 @cindex debugging output, controlling
3958 The @option{-d} option to @code{m4} (or @option{--debug},
3959 @pxref{Debugging options, , Invoking m4}) controls the amount of detail
3961 categories of output. Trace output is requested by @code{traceon}
3962 (@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in
3963 relation to a macro invocation. Debug output tracks useful events not
3964 associated with a macro invocation, and each line is prefixed by
3965 @samp{m4debug:}. Finally, @code{dumpdef} (@pxref{Dumpdef}) output is
3966 affected, with no prefix added to the output lines.
3968 The @var{flags} following the option can be one or more of the
3973 In trace output, show the actual arguments that were collected before
3974 invoking the macro. This applies to all macro calls if the @samp{t}
3975 flag is used, otherwise only the macros covered by calls of
3976 @code{traceon}. Arguments are subject to length truncation specified by
3977 the command line option @option{--arglength} (or @option{-l}).
3980 In trace output, show several trace lines for each macro call. A line
3981 is shown when the macro is seen, but before the arguments are collected;
3982 a second line when the arguments have been collected and a third line
3983 after the call has completed.
3986 In trace output, show the expansion of each macro call, if it is not
3987 void. This applies to all macro calls if the @samp{t} flag is used,
3988 otherwise only the macros covered by calls of @code{traceon}. The
3989 expansion is subject to length truncation specified by the command line
3990 option @option{--arglength} (or @option{-l}).
3993 In debug and trace output, include the name of the current input file in
3997 In debug output, print a message each time the current input file is
4001 In debug and trace output, include the current input line number in the
4005 In debug output, print a message when a named file is found through the
4006 path search mechanism (@pxref{Search Path}), giving the actual file name
4010 In trace and dumpdef output, quote actual arguments and macro expansions
4011 in the display with the current quotes. This is useful in connection
4012 with the @samp{a} and @samp{e} flags above.
4015 In trace output, trace all macro calls made in this invocation of
4016 @code{m4}, regardless of the settings of @code{traceon}.
4019 In trace output, add a unique `macro call id' to each line of the trace
4020 output. This is useful in connection with the @samp{c} flag above.
4023 A shorthand for all of the above flags.
4026 If no flags are specified with the @option{-d} option, the default is
4027 @samp{aeq}. The examples throughout this manual assume the default
4030 @cindex GNU extensions
4031 There is a builtin macro @code{debugmode}, which allows on-the-fly control of
4032 the debugging output format:
4034 @deffn Builtin debugmode (@ovar{flags})
4035 The argument @var{flags} should be a subset of the letters listed above.
4036 As special cases, if the argument starts with a @samp{+}, the flags are
4037 added to the current debug flags, and if it starts with a @samp{-}, they
4038 are removed. If no argument is present, all debugging flags are cleared
4039 (as if no @option{-d} was given), and with an empty argument the flags
4040 are reset to the default of @samp{aeq}.
4042 The expansion of @code{debugmode} is void.
4045 @comment The explicit -dp neutralizes the testsuite default of -d.
4046 @comment options: -dp
4049 define(`foo', `FOO')
4056 @error{}m4trace: -1- foo -> `FOO'
4061 @error{}m4trace: -1- foo
4066 @error{}m4trace:8: -1- foo
4070 The following example demonstrates the behavior of length truncation,
4071 when specified on the command line. Note that each argument and the
4072 final result are individually truncated. Also, the special tokens for
4073 builtin functions are not truncated.
4075 @comment options: -l6
4078 define(`echo', `$@@')debugmode(`+t')
4080 echo(`1', `long string')
4081 @error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
4082 @result{}1,long string
4083 indir(`echo', defn(`changequote'))
4084 @error{}m4trace: -2- defn(`change...')
4085 @error{}m4trace: -1- indir(`echo', <changequote>) -> ``''
4089 This example shows the effects of the debug flags that are not related
4093 @comment options: -dip
4095 $ @kbd{m4 -dip -I examples}
4096 @error{}m4debug: input read from stdin
4098 @error{}m4debug: path search for `foo' found `examples/foo'
4099 @error{}m4debug: input read from examples/foo
4101 @error{}m4debug: input reverted to stdin, line 1
4103 @error{}m4debug: input exhausted
4107 @section Saving debugging output
4109 @cindex saving debugging output
4110 @cindex debugging output, saving
4111 @cindex output, saving debugging
4112 @cindex GNU extensions
4113 Debug and tracing output can be redirected to files using either the
4114 @option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
4115 Invoking m4}), or with the builtin macro @code{debugfile}:
4117 @deffn Builtin debugfile (@ovar{file})
4118 Sends all further debug and trace output to @var{file}, opened in append
4119 mode. If @var{file} is the empty string, debug and trace output are
4120 discarded. If @code{debugfile} is called without any arguments, debug
4121 and trace output are sent to standard error. This does not affect
4122 warnings, error messages, or @code{errprint} output, which are
4123 always sent to standard error. If @var{file} cannot be opened, the
4124 current debug file is unchanged, and an error is issued.
4126 The expansion of @code{debugfile} is void.
4134 @error{}m4:stdin:2: Warning: excess arguments to builtin `divnum' ignored
4135 @error{}m4trace: -1- divnum(`extra') -> `0'
4140 @error{}m4:stdin:4: Warning: excess arguments to builtin `divnum' ignored
4145 @error{}m4trace: -1- divnum -> `0'
4150 @chapter Input control
4152 This chapter describes various builtin macros for controlling the input
4156 * Dnl:: Deleting whitespace in input
4157 * Changequote:: Changing the quote characters
4158 * Changecom:: Changing the comment delimiters
4159 * Changeword:: Changing the lexical structure of words
4160 * M4wrap:: Saving text until end of input
4164 @section Deleting whitespace in input
4166 @cindex deleting whitespace in input
4167 @cindex discarding input
4168 @cindex input, discarding
4169 The builtin @code{dnl} stands for ``Discard to Next Line'':
4172 All characters, up to and including the next newline, are discarded
4173 without performing any macro expansion. A warning is issued if the end
4174 of the file is encountered without a newline.
4176 The expansion of @code{dnl} is void.
4179 It is often used in connection with @code{define}, to remove the
4180 newline that follows the call to @code{define}. Thus
4183 define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
4188 The input up to and including the next newline is discarded, as opposed
4189 to the way comments are treated (@pxref{Comments}).
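The contrast with comments can be seen in a short sketch such as the
following, which assumes the default comment delimiters. The comment
and its newline are copied to the output, while the text after
@code{dnl} and the newline that ends it are dropped, so the next input
line is joined to the current output line.

@example
one # comment kept, newline kept
@result{}one # comment kept, newline kept
two dnl this text and the newline are discarded
three
@result{}two three
@end example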
4191 Usually, @code{dnl} is immediately followed by an end of line or some
4192 other whitespace. GNU @code{m4} will produce a warning diagnostic if
4193 @code{dnl} is followed by an open parenthesis. In this case, @code{dnl}
4194 will collect and process all arguments, looking for a matching close
4195 parenthesis. All predictable side effects resulting from this
4196 collection will take place. @code{dnl} will return no output. The
4197 input following the matching close parenthesis up to and including the
4198 next newline, on whatever line contains it, will still be discarded.
4201 dnl(`args are ignored, but side effects occur',
4202 define(`foo', `like this')) while this text is ignored: undefine(`foo')
4203 @error{}m4:stdin:1: Warning: excess arguments to builtin `dnl' ignored
4204 See how `foo' was defined, foo?
4205 @result{}See how foo was defined, like this?
4208 If the end of file is encountered without a newline character, a
4209 warning is issued and @code{dnl} stops consuming input.
4212 m4wrap(`m4wrap(`2 hi
4218 @error{}m4:stdin:1: Warning: end of file treated as newline
4223 @section Changing the quote characters
4225 @cindex changing quote delimiters
4226 @cindex quote delimiters, changing
4227 @cindex delimiters, changing
4228 The default quote delimiters can be changed with the builtin
4231 @deffn Builtin changequote (@dvar{start, `}, @dvar{end, '})
4232 This sets @var{start} as the new begin-quote delimiter and @var{end} as
4233 the new end-quote delimiter. If both arguments are missing, the default
4234 quotes (@code{`} and @code{'}) are used. If @var{start} is void, then
4235 quoting is disabled. Otherwise, if @var{end} is missing or void, the
4236 default end-quote delimiter (@code{'}) is used. The quote delimiters
4237 can be of any length.
4239 The expansion of @code{changequote} is void.
4243 changequote(`[', `]')
4245 define([foo], [Macro [foo].])
4251 The quotation strings can safely contain eight-bit characters.
4253 @comment Yuck. I know of no clean way to render an 8-bit character in
4254 @comment both info and dvi. This example uses the `open-guillemot' and
4255 @comment `close-guillemot' characters of the Latin-1 character set.
4262 changequote(`«', `»')
4268 If no single character is appropriate, @var{start} and @var{end} can be
4269 of any length. Other implementations cap the delimiter length to five
4270 characters, but GNU has no inherent limit.
4273 changequote(`[[[', `]]]')
4275 define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
4278 @result{}Macro [[foo]].
4281 Calling @code{changequote} with @var{start} as the empty string will
4282 effectively disable the quoting mechanism, leaving no way to quote text.
4283 However, using an empty string is not portable, as some other
4284 implementations of @code{m4} revert to the default quoting, while others
4285 preserve the prior non-empty delimiter. If @var{start} is not empty,
4286 then an empty @var{end} will use the default end-quote delimiter of
4287 @samp{'}, as otherwise, it would be impossible to end a quoted string.
4288 Again, this is not portable, as some other @code{m4} implementations
4289 reuse @var{start} as the end-quote delimiter, while others preserve the
4290 previous non-empty value. Omitting both arguments restores the default
4291 begin-quote and end-quote delimiters; fortunately this behavior is
4292 portable to all implementations of @code{m4}.
4295 define(`foo', `Macro `FOO'.')
4300 @result{}Macro `FOO'.
4302 @result{}`Macro `FOO'.'
4309 There is no way in @code{m4} to quote a string containing an unmatched
4310 begin-quote, except using @code{changequote} to change the current
4313 If the quotes should be changed from, say, @samp{[} to @samp{[[},
4314 temporary quote characters have to be defined. To achieve this, two
4315 calls of @code{changequote} must be made, one for the temporary quotes
4316 and one for the new quotes.
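The two-step change might look like the following sketch. The temporary
delimiters @samp{<} and @samp{>} are an arbitrary choice for this
illustration; any pair that does not clash with the old or new quotes
would do. Once the new quotes are in effect, a single bracket is
ordinary text.

@example
changequote(`[', `]')
@result{}
changequote([<], [>])
@result{}
changequote(<[[>, <]]>)
@result{}
define([[foo]], [[single [brackets] are now plain text]])
@result{}
foo
@result{}single [brackets] are now plain text
@end example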
4318 Macros are recognized in preference to the begin-quote string, so if a
4319 prefix of @var{start} can be recognized as part of a potential macro
4320 name, the quoting mechanism is effectively disabled. Unless you use
4321 @code{changeword} (@pxref{Changeword}), this means that @var{start}
4322 should not begin with a letter, digit, or @samp{_} (underscore).
4323 However, even though quoted strings are not recognized, the quote
4324 characters can still be discerned in macro expansion and in trace
4328 define(`echo', `$@@')
4332 changequote(`q', `Q')
4340 changequote(`-', `EOF')
4346 changequote(`1', `2')
4354 Quotes are recognized in preference to argument collection. In
4355 particular, if @var{start} is a single @samp{(}, then argument
4356 collection is effectively disabled. For portability with other
4357 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
4358 @samp{)} as the first character in @var{start}.
4361 define(`echo', `$#:$@@:')
4365 changequote(`(',`)')
4371 changequote(`((', `))')
4379 changequote(`,', `)')
4385 However, if you are not worried about portability, using @samp{(} and
4386 @samp{)} as quoting characters has an interesting property---you can use
4387 it to compute a quoted string containing the expansion of any quoted
4388 text, as long as the expansion results in both balanced quotes and
4389 balanced parentheses. The trick is realizing @code{expand} uses
4390 @samp{$1} unquoted, to trigger its expansion using the normal quoting
4391 characters, but uses extra parentheses to group unquoted commas that
4392 occur in the expansion without consuming whitespace following those
4393 commas. Then @code{_expand} uses @code{changequote} to convert the
4394 extra parentheses back into quoting characters. Note that it takes two
4395 more @code{changequote} invocations to restore the original quotes.
4396 Contrast the behavior on whitespace when using @samp{$*}, via
4397 @code{quote}, to attempt the same task.
4400 changequote(`[', `]')dnl
4401 define([a], [1, (b)])dnl
4403 define([quote], [[$*]])dnl
4404 define([expand], [_$0(($1))])dnl
4406 [changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl
4407 expand([a, a, [a, a], [[a, a]]])
4408 @result{}1, (2), 1, (2), a, a, [a, a]
4409 quote(a, a, [a, a], [[a, a]])
4410 @result{}1,(2),1,(2),a, a,[a, a]
4413 If @var{end} is a prefix of @var{start}, the end-quote will be
4414 recognized in preference to a nested begin-quote. In particular,
4415 changing the quotes to have the same string for @var{start} and
4416 @var{end} disables nesting of quotes. When quote nesting is disabled,
4417 it is impossible to double-quote strings across macro expansions, so
4418 using the same string is not done very often.
4423 changequote(`""', `"')
4435 changequote(`"', `"')
4442 @comment And another stress test, not worth documenting in the manual.
4444 define(`aaaaaaaaaaaaaaaaaaaa', `A')define(`q', `"$@@"')
4446 changequote(`"', `"')
4448 q(q("aaaaaaaaaaaaaaaaaaaa", "a"))
4453 It is an error if the end of file occurs within a quoted string.
4458 @result{}hello world
4461 @error{}m4:stdin:2: ERROR: end of file in string
4466 ifelse(`dangling quote
4468 @error{}m4:stdin:1: ERROR: end of file in string
4472 @section Changing the comment delimiters
4474 @cindex changing comment delimiters
4475 @cindex comment delimiters, changing
4476 @cindex delimiters, changing
4477 The default comment delimiters can be changed with the builtin
4478 macro @code{changecom}:
4480 @deffn Builtin changecom (@ovar{start}, @dvar{end, @key{NL}})
4481 This sets @var{start} as the new begin-comment delimiter and @var{end}
4482 as the new end-comment delimiter. If both arguments are missing, or
4483 @var{start} is void, then comments are disabled. Otherwise, if
4484 @var{end} is missing or void, the default end-comment delimiter of
4485 newline is used. The comment delimiters can be of any length.
4487 The expansion of @code{changecom} is void.
4491 define(`comment', `COMMENT')
4494 @result{}# A normal comment
4495 changecom(`/*', `*/')
4497 # Not a comment anymore
4498 @result{}# Not a COMMENT anymore
4499 But: /* this is a comment now */ while this is not a comment
4500 @result{}But: /* this is a comment now */ while this is not a COMMENT
4503 @cindex comments, copied to output
4504 Note how comments are copied to the output, much as if they were quoted
4505 strings. If you want the text inside a comment expanded, quote the
4506 begin-comment delimiter.
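A minimal sketch of this technique, assuming the default quote and
comment delimiters, follows. Quoting the @samp{#} prevents it from
starting a comment, so the rest of the line is scanned for macros as
usual.

@example
define(`comment', `COMMENT')
@result{}
# this comment is copied verbatim
@result{}# this comment is copied verbatim
`#' this comment text is expanded
@result{}# this COMMENT text is expanded
@end example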
4508 Calling @code{changecom} without any arguments, or with @var{start} as
4509 the empty string, will effectively disable the commenting mechanism. To
4510 restore the original comment start of @samp{#}, you must explicitly ask
4511 for it. If @var{start} is not empty, then an empty @var{end} will use
4512 the default end-comment delimiter of newline, as otherwise, it would be
4513 impossible to end a comment. However, this is not portable, as some
4514 other @code{m4} implementations preserve the previous non-empty
4518 define(`comment', `COMMENT')
4522 # Not a comment anymore
4523 @result{}# Not a COMMENT anymore
4527 @result{}# comment again
4530 The comment strings can safely contain eight-bit characters.
4532 @comment Yuck. I know of no clean way to render an 8-bit character in
4533 @comment both info and dvi. This example uses the `open-guillemot' and
4534 @comment `close-guillemot' characters of the Latin-1 character set.
4541 changecom(`«', `»')
4547 If no single character is appropriate, @var{start} and @var{end} can be
4548 of any length. Other implementations cap the delimiter length to five
4549 characters, but GNU has no inherent limit.
4551 Comments are recognized in preference to macros. However, this is not
4552 compatible with other implementations, where macros and even quoting
4553 take precedence over comments, so it may change in a future release.
4554 For portability, this means that @var{start} should not begin with a
4555 letter, digit, or @samp{_} (underscore), and that neither the
4556 start-quote nor the start-comment string should be a prefix of the
4562 define(`hi1hi2', `hello')
4576 Comments are recognized in preference to argument collection. In
4577 particular, if @var{start} is a single @samp{(}, then argument
4578 collection is effectively disabled. For portability with other
4579 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
4580 @samp{)} as the first character in @var{start}.
4583 define(`echo', `$#:$*:$@@:')
4593 changecom(`((', `))')
4602 @result{}1:HI,hi)bye:HI,hi)bye:
4606 @result{}3:HI,,HI,HI:HI,,`'hi,HI:
4607 echo(hi,`,`'hi',hi`'changecom(`,,', `hi'))
4608 @result{}3:HI,,`'hi,HI:HI,,`'hi,HI:
4611 It is an error if the end of file occurs within a comment.
4615 changecom(`/*', `*/')
4619 @error{}m4:stdin:2: ERROR: end of file in comment
4623 @section Changing the lexical structure of words
4625 @cindex lexical structure of words
4626 @cindex words, lexical structure of
4627 @cindex syntax, changing
4628 @cindex changing syntax
4629 @cindex regular expressions
4631 The macro @code{changeword} and all associated functionality is
4632 experimental. It is only available if the @option{--enable-changeword}
4633 option was given to @command{configure}, at GNU @code{m4}
4635 time. The functionality will go away in the future, to be replaced by
4636 other new features that are more efficient at providing the same
4637 capabilities. @emph{Do not rely on it}. Please send your comments
4638 about it the same way you would report bugs.
4641 A file being processed by @code{m4} is split into quoted strings, words
4642 (potential macro names) and simple tokens (any other single character).
4643 Initially a word is defined by the following regular expression:
4647 [_a-zA-Z][_a-zA-Z0-9]*
4650 Using @code{changeword}, you can change this regular expression:
4652 @deffn {Optional builtin} changeword (@var{regex})
4653 Changes the regular expression for recognizing macro names to be
4654 @var{regex}. If @var{regex} is empty, use
4655 @samp{[_a-zA-Z][_a-zA-Z0-9]*}. @var{regex} must obey the constraint
4656 that every prefix of the desired final pattern is also accepted by the
4657 regular expression. If @var{regex} contains grouping parentheses, the
4658 macro invoked is the portion that matched the first group, rather than
4659 the entire matching string.
4661 The expansion of @code{changeword} is void.
4662 The macro @code{changeword} is recognized only with parameters.
4665 Relaxing the lexical rules of @code{m4} might be useful (for example) if
4666 you wanted to apply translations to a file of numbers:
4669 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4671 changeword(`[_a-zA-Z0-9]+')
4677 Tightening the lexical rules is less useful, because it will generally
4678 make some of the builtins unavailable. You could use it to prevent
4679 accidental call of builtins, for example:
4682 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4684 define(`_indir', defn(`indir'))
4686 changeword(`_[_a-zA-Z0-9]*')
4689 @result{}esyscmd(foo)
4690 _indir(`esyscmd', `echo hi')
4695 Because @code{m4} constructs its words a character at a time, there
4696 is a restriction on the regular expressions that may be passed to
4697 @code{changeword}. Namely, if your regular expression accepts
4698 @samp{foo}, it must also accept @samp{f} and @samp{fo}.
4701 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4707 dnl This example wants to recognize changeword, dnl, and `foo\n'.
4708 dnl First, we check that our regexp will match.
4709 regexp(`changeword', `[cd][a-z]*\|foo[
4713 ', `[cd][a-z]*\|foo[
4716 regexp(`f', `[cd][a-z]*\|foo[
4721 changeword(`[cd][a-z]*\|foo[
4724 dnl Even though `foo\n' matches, we forgot to allow `f'.
4727 changeword(`[cd][a-z]*\|fo*[
4730 dnl Now we can call `foo\n'.
4736 @comment One more test of including newline in a macro name; but this
4737 @comment does not need to be displayed in the manual. This ensures
4738 @comment that line numbering is correct when dnl cuts across include
4739 @comment file boundaries, and when __file__ or __line__ is the last
4740 @comment token in an include file.
4743 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4748 include(`foo') ignored
4750 changeword(`\([_a-zA-Z][_a-zA-Z0-9]*\|bar
4755 include(`foo') ignored
4762 ', defn(`__file__'))
4765 @result{}examples/foo
4767 ', defn(`__line__'))
4776 @code{changeword} has another function. If the regular expression
4777 supplied contains any grouped subexpressions, then text outside
4778 the first of these is discarded before symbol lookup. So:
4781 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4784 `errprint(` skipping: syscmd does not have unix semantics
4786 changecom(`/*', `*/')dnl
4787 define(`foo', `bar')dnl
4788 changeword(`#\([_a-zA-Z0-9]*\)')
4790 #esyscmd(`echo foo \#foo')
4795 @code{m4} now requires a @samp{#} mark at the beginning of every
4796 macro invocation, so one can use @code{m4} to preprocess plain
4797 text without losing various words like @samp{divert}.
4799 In @code{m4}, macro substitution is based on text, while in @TeX{}, it
4800 is based on tokens. @code{changeword} can throw this difference into
4801 relief. For example, here is the same idea represented in @TeX{} and
4802 @code{m4}. First, the @TeX{} version:
4806 \def\a@{\message@{Hello@}@}
4815 Then, the @code{m4} version:
4818 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4820 define(`a', `errprint(`Hello')')dnl
4821 changeword(`@@\([_a-zA-Z0-9]*\)')
4824 @result{}errprint(Hello)
4827 In the @TeX{} example, the first line defines a macro @code{a} to
4828 print the message @samp{Hello}. The second line defines @key{@@} to
4829 be usable instead of @key{\} as an escape character. The third line
4830 defines @key{\} to be a normal printing character, not an escape.
4831 The fourth line invokes the macro @code{a}. So, when @TeX{} is run
4832 on this file, it displays the message @samp{Hello}.
4834 When the @code{m4} example is passed through @code{m4}, it outputs
4835 @samp{errprint(Hello)}. The reason for this is that @TeX{} does
4836 lexical analysis of a macro definition when the macro is @emph{defined}.
4837 @code{m4} just stores the text, postponing the lexical analysis until
4838 the macro is @emph{used}.
4840 You should note that using @code{changeword} will slow @code{m4} down
4841 by a factor of about seven, once it is changed to something other
4842 than the default regular expression. You can invoke @code{changeword}
4843 with the empty string to restore the default word definition, and regain
4847 @section Saving text until end of input
4849 @cindex saving input
4850 @cindex input, saving
4851 @cindex deferring expansion
4852 @cindex expansion, deferring
4853 It is possible to `save' some text until the end of the normal input has
4854 been seen; the saved text is then read by @code{m4} as additional
4855 input. This feature is normally used to
4856 initiate cleanup actions before normal exit, e.g., deleting temporary
4859 To save input text, use the builtin @code{m4wrap}:
4861 @deffn Builtin m4wrap (@var{string}, @dots{})
4862 Stores @var{string} in a safe place, to be reread when end of input is
4863 reached. As a GNU extension, additional arguments are
4864 concatenated with a space to the @var{string}.
4866 The expansion of @code{m4wrap} is void.
4867 The macro @code{m4wrap} is recognized only with parameters.
4871 define(`cleanup', `This is the `cleanup' action.
4876 This is the first and last normal input line.
4877 @result{}This is the first and last normal input line.
4879 @result{}This is the cleanup action.
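The GNU extension for additional arguments can be sketched as follows;
the two arguments here are arbitrary sample text. They are joined with
a single space before being saved, and the result is reread at end of
input.

@example
m4wrap(`saved', `text
')
@result{}
^D
@result{}saved text
@end example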
4882 The saved input is only reread when the end of normal input is seen, and
4883 not if @code{m4exit} is used to exit @code{m4}.
4885 @comment FIXME: this contradicts POSIX, which requires that "If the
4886 @comment m4wrap macro is used multiple times, the arguments specified
4887 @comment shall be processed in the order in which the m4wrap macros were
4888 @comment processed."
4889 It is safe to call @code{m4wrap} from saved text, but then the order in
4890 which the saved text is reread is undefined. If @code{m4wrap} is not used
4891 recursively, the saved pieces of text are reread in the reverse of the
4892 order in which they were saved (LIFO---last in, first out). However, this
4893 behavior is likely to change in a future release, to match
4894 POSIX, so you should not depend on this order.
4896 It is possible to emulate POSIX behavior even
4897 with older versions of GNU M4 by including the file
4898 @file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4} from the
4903 $ @kbd{m4 -I examples}
4904 undivert(`wrapfifo.m4')dnl
4905 @result{}dnl Redefine m4wrap to have FIFO semantics.
4906 @result{}define(`_m4wrap_level', `0')dnl
4907 @result{}define(`m4wrap',
4908 @result{}`ifdef(`m4wrap'_m4wrap_level,
4909 @result{} `define(`m4wrap'_m4wrap_level,
4910 @result{} defn(`m4wrap'_m4wrap_level)`$1')',
4911 @result{} `builtin(`m4wrap', `define(`_m4wrap_level',
4912 @result{} incr(_m4wrap_level))dnl
4913 @result{}m4wrap'_m4wrap_level)dnl
4914 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
4915 include(`wrapfifo.m4')
4917 m4wrap(`a`'m4wrap(`c
4918 ', `d')')m4wrap(`b')
4924 It is likewise possible to emulate LIFO behavior without resorting to
4925 the GNU M4 extension of @code{builtin}, by including the file
4926 @file{m4-@value{VERSION}/@/examples/@/wraplifo.m4} from the
4927 distribution. (Unfortunately, both examples shown here share some
4928 subtle bugs. See if you can find and correct them; or @pxref{Improved
4929 m4wrap, , Answers}).
4933 $ @kbd{m4 -I examples}
4934 undivert(`wraplifo.m4')dnl
4935 @result{}dnl Redefine m4wrap to have LIFO semantics.
4936 @result{}define(`_m4wrap_level', `0')dnl
4937 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
4938 @result{}define(`m4wrap',
4939 @result{}`ifdef(`m4wrap'_m4wrap_level,
4940 @result{} `define(`m4wrap'_m4wrap_level,
4941 @result{} `$1'defn(`m4wrap'_m4wrap_level))',
4942 @result{} `_m4wrap(`define(`_m4wrap_level', incr(_m4wrap_level))dnl
4943 @result{}m4wrap'_m4wrap_level)dnl
4944 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
4945 include(`wraplifo.m4')
4947 m4wrap(`a`'m4wrap(`c
4948 ', `d')')m4wrap(`b')
4954 Here is an example of implementing a factorial function using
4958 define(`f', `ifelse(`$1', `0', `Answer: 0!=1
4959 ', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
4960 ', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
4965 @result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
4968 Invocations of @code{m4wrap} at the same recursion level are
4969 concatenated and rescanned as usual:
4975 m4wrap(`a')m4wrap(`a')
4982 however, the transition between recursion levels behaves like an end of
4983 file condition between two input files.
4987 m4wrap(`m4wrap(`)')len(abc')
4990 @error{}m4:stdin:1: ERROR: end of file in argument list
4993 @node File Inclusion
4994 @chapter File inclusion
4996 @cindex file inclusion
4997 @cindex inclusion, of files
4998 @code{m4} allows you to include named files at any point in the input.
5001 * Include:: Including named files
5002 * Search Path:: Searching for include files
5006 @section Including named files
5008 There are two builtin macros in @code{m4} for including files:
5010 @deffn Builtin include (@var{file})
5011 @deffnx Builtin sinclude (@var{file})
5012 Both macros cause the file named @var{file} to be read by
5013 @code{m4}. When the end of the file is reached, input is resumed from
5014 the previous input file.
5016 The expansion of @code{include} and @code{sinclude} is therefore the
5017 contents of @var{file}.
5019 If @var{file} does not exist, is a directory, or cannot otherwise be
5020 read, the expansion is void,
5021 and @code{include} will fail with an error while @code{sinclude} is
5022 silent. The empty string counts as a file that does not exist.
5024 The macros @code{include} and @code{sinclude} are recognized only with
5031 @error{}m4:stdin:1: cannot open `none': No such file or directory
5034 @error{}m4:stdin:2: cannot open `': No such file or directory
5042 The rest of this section assumes that @code{m4} is invoked with the
5043 @option{-I} option (@pxref{Preprocessor features, , Invoking m4})
5044 pointing to the @file{m4-@value{VERSION}/@/examples}
5045 directory shipped as part of the GNU @code{m4} package. The
5046 file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution
5051 $ @kbd{cat examples/incl.m4}
5052 @result{}Include file start
5054 @result{}Include file end
5057 Normally file inclusion is used to insert the contents of a file
5058 into the input stream. The contents of the file will be read by
5059 @code{m4} and macro calls in the file will be expanded:
5063 $ @kbd{m4 -I examples}
5064 define(`foo', `FOO')
5067 @result{}Include file start
5069 @result{}Include file end
5073 The fact that @code{include} and @code{sinclude} expand to the contents
5074 of the file can be used to define macros that operate on entire files.
5075 Here is an example, which defines @samp{bar} to expand to the contents
5080 $ @kbd{m4 -I examples}
5081 define(`bar', include(`incl.m4'))
5083 This is `bar': >>bar<<
5084 @result{}This is bar: >>Include file start
5086 @result{}Include file end
5090 This use of @code{include} is not trivial, though, as files can contain
5091 quotes, commas, and parentheses, which can interfere with the way the
5092 @code{m4} parser works. GNU @code{m4} seamlessly concatenates
5093 the file contents with the next character, even if the included file
5094 ended in the middle of a comment, string, or macro call. These
5095 conditions are only treated as end of file errors if specified as input
5096 files on the command line.
5098 In GNU @code{m4}, an alternative method of reading files is
5099 using @code{undivert} (@pxref{Undivert}) on a named file.
5102 @comment Test that include(`file/') detects that file is not a
5103 @comment directory; we can assume that the current directory contains a
5104 @comment Makefile. mingw fails with EINVAL rather than ENOTDIR.
5107 @comment xerr: ignore
5109 include(`Makefile/')
5110 @error{}m4:stdin:1: cannot open `Makefile/': Not a directory
5114 @comment POSIX allows, but doesn't require, failure on reading
5115 @comment directories. But since they aren't text files, it never makes
5116 @comment sense, so we globally forbid it even if fopen doesn't. mingw
5117 @comment fails with EACCES rather than EISDIR.
5120 @comment xerr: ignore
5123 @error{}m4:stdin:1: cannot open `.': Is a directory
5127 @comment Meanwhile, ignore errors with sinclude.
5130 sinclude(`Makefile/')
5138 @section Searching for include files
5140 @cindex search path for included files
5141 @cindex included files, search path for
5142 @cindex GNU extensions
5143 GNU @code{m4} allows included files to be found in other directories
5144 than the current working directory.
5146 @cindex @env{M4PATH}
If the @option{--prepend-include} or @option{-B} command-line option was
provided (@pxref{Preprocessor features, , Invoking m4}), those
directories are searched first, in the reverse of the order in which
those options were listed on the command line. Then @code{m4} looks in
the current working directory. Next come the directories specified with
the @option{--include} or @option{-I} option, in the order found on the
command line. Finally, if the @env{M4PATH} environment variable is set,
it is expected to contain a colon-separated list of directories, which
will be searched in order.
5157 If the automatic search for include-files causes trouble, the @samp{p}
5158 debug flag (@pxref{Debug Levels}) can help isolate the problem.
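
As a minimal sketch of the last step in the search order, assume that
@code{m4} is run from the top of the distribution (so that
@file{examples/incl.m4} exists but no @file{incl.m4} is present in the
current directory) and that a POSIX shell is used to set the variable
for a single command. Under those assumptions, @env{M4PATH} supplies the
search directory in place of @option{-I}:

@example
$ @kbd{M4PATH=examples m4}
include(`incl.m4')
@result{}Include file start
@result{}foo
@result{}Include file end
@result{}
@end example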
5161 @chapter Diverting and undiverting output
5163 @cindex deferring output
Diversions are a way of temporarily saving output. The output of
@code{m4} can at any time be diverted to a temporary file, and be
reinserted into the output stream, @dfn{undiverted}, again at a later
time.
5169 @cindex @env{TMPDIR}
5170 Numbered diversions are counted from 0 upwards, diversion number 0
5171 being the normal output stream. GNU
5172 @code{m4} tries to keep diversions in memory. However, there is a
5173 limit to the overall memory usable by all diversions taken together
5174 (512K, currently). When this maximum is about to be exceeded,
5175 a temporary file is opened to receive the contents of the biggest
5176 diversion still in memory, freeing this memory for other diversions.
5177 When creating the temporary file, @code{m4} honors the value of the
5178 environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
5179 Thus, the amount of available disk space provides the only real limit on
5180 the number and aggregate size of diversions.
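
The basic cycle of saving output and reinserting it later can be
sketched as follows. This is only an illustration: the diversion numbers
and the two lines of text are arbitrary, and the builtins @code{divert}
and @code{undivert} are described in the sections below. Text that is
read first is emitted second.

@example
divert(`2')dnl
body, emitted second
divert(`1')dnl
prologue, emitted first
divert`'dnl
undivert(`1', `2')dnl
@result{}prologue, emitted first
@result{}body, emitted second
@end example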
5183 @comment We need to test spilled diversions, but don't need to expose
5184 @comment this highly repetitive test in the manual.
5187 divert(`-1')define(`f', `.')
5188 define(`f', defn(`f')defn(`f'))
5189 define(`f', defn(`f')defn(`f'))
5190 define(`f', defn(`f')defn(`f'))
5191 define(`f', defn(`f')defn(`f'))
5192 define(`f', defn(`f')defn(`f'))
5193 define(`f', defn(`f')defn(`f'))
5194 define(`f', defn(`f')defn(`f'))
5195 define(`f', defn(`f')defn(`f'))
5196 define(`f', defn(`f')defn(`f'))
5197 define(`f', defn(`f')defn(`f'))
5198 define(`f', defn(`f')defn(`f'))
5199 define(`f', defn(`f')defn(`f'))
5200 define(`f', defn(`f')defn(`f'))
5201 define(`f', defn(`f')defn(`f'))
5202 define(`f', defn(`f')defn(`f'))
5203 define(`f', defn(`f')defn(`f'))
5204 define(`f', defn(`f')defn(`f'))
5205 define(`f', defn(`f')defn(`f'))
5206 define(`f', defn(`f')defn(`f'))
5207 define(`f', defn(`f')defn(`f'))
5215 divert(`-1')undivert
5221 @comment Another test of spilled diversions.
5224 divert(`-1')define(`f', `.')
5225 define(`f', defn(`f')defn(`f'))
5226 define(`f', defn(`f')defn(`f'))
5227 define(`f', defn(`f')defn(`f'))
5228 define(`f', defn(`f')defn(`f'))
5229 define(`f', defn(`f')defn(`f'))
5230 define(`f', defn(`f')defn(`f'))
5231 define(`f', defn(`f')defn(`f'))
5232 define(`f', defn(`f')defn(`f'))
5233 define(`f', defn(`f')defn(`f'))
5234 define(`f', defn(`f')defn(`f'))
5235 define(`f', defn(`f')defn(`f'))
5236 define(`f', defn(`f')defn(`f'))
5237 define(`f', defn(`f')defn(`f'))
5238 define(`f', defn(`f')defn(`f'))
5239 define(`f', defn(`f')defn(`f'))
5240 define(`f', defn(`f')defn(`f'))
5241 define(`f', defn(`f')defn(`f'))
5242 define(`f', defn(`f')defn(`f'))
5243 define(`f', defn(`f')defn(`f'))
5244 define(`f', defn(`f')defn(`f'))
5253 @comment Catch regression in 1.4.10 with spilled diversions.
5257 `errprint(` skipping: syscmd does not have unix semantics
5259 changequote(`[', `]')dnl
5260 syscmd([echo 'divert(1)hi
5261 format(%1000000d, 1)' | ']__program__[' | sed -n 1p])dnl
5267 @comment Avoid quadratic copying time when transferring diversions;
5268 @comment test both in-memory and spilled to file.
5272 $ @kbd{m4 -I examples}
5273 include(`forloop2.m4')dnl
5274 divert(`1')format(`%10000s', `')dnl
5275 forloop(`i', `1', `10000',
5276 `divert(incr(i))undivert(i)')dnl
5277 divert(`9001')format(`%1000000s', `')dnl
5278 forloop(`i', `9001', `10000',
5279 `divert(incr(i))undivert(i)')dnl
5280 divert(`-1')undivert
Diversions make it possible to generate output in a different order than
the input was read. It is possible to implement topological sorting of
dependencies. For example, GNU Autoconf makes use of
diversions under the hood to ensure that the expansion of a prerequisite
macro appears in the output prior to the expansion of a dependent macro,
regardless of which order the two macros were invoked in the user's
input file.
5293 * Divert:: Diverting output
5294 * Undivert:: Undiverting output
5295 * Divnum:: Diversion numbers
5296 * Cleardivert:: Discarding diverted text
5300 @section Diverting output
5302 @cindex diverting output to files
5303 @cindex output, diverting to files
5304 @cindex files, diverting output to
5305 Output is diverted using @code{divert}:
5307 @deffn Builtin divert (@dvar{number, 0})
5308 The current diversion is changed to @var{number}. If @var{number} is left
5309 out or empty, it is assumed to be zero. If @var{number} cannot be
5310 parsed, the diversion is unchanged.
5312 The expansion of @code{divert} is void.
When all the @code{m4} input has been processed, all existing
5316 diversions are automatically undiverted, in numerical order.
5320 This text is diverted.
5323 This text is not diverted.
5324 @result{}This text is not diverted.
5327 @result{}This text is diverted.
5330 Several calls of @code{divert} with the same argument do not overwrite
5331 the previous diverted text, but append to it. Diversions are printed
5332 after any wrapped text is expanded.
5335 define(`text', `TEXT')
5337 divert(`1')`diverted text.'
5340 m4wrap(`Wrapped text precedes ')
5343 @result{}Wrapped TEXT precedes diverted text.
5346 @cindex discarding input
5347 @cindex input, discarding
5348 If output is diverted to a negative diversion, it is simply discarded.
5349 This can be used to suppress unwanted output. A common example of
5350 unwanted output is the trailing newlines after macro definitions. Here
5351 is a common programming idiom in @code{m4} for avoiding them.
5355 define(`foo', `Macro `foo'.')
5356 define(`bar', `Macro `bar'.')
5361 @cindex GNU extensions
5362 Traditional implementations only supported ten diversions. But as a
5363 GNU extension, diversion numbers can be as large as positive
5364 integers will allow, rather than treating a multi-digit diversion number
5365 as a request to discard text.
5368 divert(eval(`1<<28'))world
5375 Note that @code{divert} is an English word, but also an active macro
5376 without arguments. When processing plain text, the word might appear in
5377 normal text and be unintentionally swallowed as a macro invocation. One
5378 way to avoid this is to use the @option{-P} option to rename all
5379 builtins (@pxref{Operation modes, , Invoking m4}). Another is to write
5380 a wrapper that requires a parameter to be recognized.
5383 We decided to divert the stream for irrigation.
5384 @result{}We decided to the stream for irrigation.
5385 define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
5391 We decided to divert the stream for irrigation.
5392 @result{}We decided to divert the stream for irrigation.
5396 @section Undiverting output
Diverted text can be undiverted explicitly using the builtin
@code{undivert}:
5401 @deffn Builtin undivert (@ovar{diversions@dots{}})
5402 Undiverts the numeric @var{diversions} given by the arguments, in the
5403 order given. If no arguments are supplied, all diversions are
5404 undiverted, in numerical order.
5406 @cindex file inclusion
5407 @cindex inclusion, of files
5408 @cindex GNU extensions
5409 As a GNU extension, @var{diversions} may contain non-numeric
5410 strings, which are treated as the names of files to copy into the output
5411 without expansion. A warning is issued if a file could not be opened.
5413 The expansion of @code{undivert} is void.
5418 This text is diverted.
5421 This text is not diverted.
5422 @result{}This text is not diverted.
5425 @result{}This text is diverted.
5429 Notice the last two blank lines. One of them comes from the newline
5430 following @code{undivert}, the other from the newline that followed the
5431 @code{divert}! A diversion often starts with a blank line like this.
5433 When diverted text is undiverted, it is @emph{not} reread by @code{m4},
5434 but rather copied directly to the current output, and it is therefore
5435 not an error to undivert into a diversion. Undiverting the empty string
5436 is the same as specifying diversion 0; in either case nothing happens
5437 since the output has already been flushed.
5440 divert(`1')diverted text
5448 @result{}diverted text
5451 divert(`2')undivert(`1')diverted text`'divert
5457 @result{}diverted text
5460 When a diversion has been undiverted, the diverted text is discarded,
5461 and it is not possible to bring back diverted text more than once.
5465 This text is diverted first.
5466 divert(`0')undivert(`1')dnl
5468 @result{}This text is diverted first.
5472 This text is also diverted but not appended.
5473 divert(`0')undivert(`1')dnl
5475 @result{}This text is also diverted but not appended.
5478 Attempts to undivert the current diversion are silently ignored. Thus,
5479 when the current diversion is not 0, the current diversion does not get
5480 rearranged among the other diversions.
5486 divert(`2')undivert`'dnl
5487 divert`'undivert`'dnl
5493 @cindex GNU extensions
5494 @cindex file inclusion
5495 @cindex inclusion, of files
5496 GNU @code{m4} allows named files to be undiverted. Given a
5497 non-numeric argument, the contents of the file named will be copied,
5498 uninterpreted, to the current output. This complements the builtin
5499 @code{include} (@pxref{Include}). To illustrate the difference, assume
5500 the file @file{foo} contains:
5512 define(`bar', `BAR')
5522 If the file is not found (or cannot be read), an error message is
5523 issued, and the expansion is void. It is possible to intermix files
5524 and diversion numbers.
5527 divert(`1')diversion one
5528 divert(`2')undivert(`foo')dnl
5529 divert(`3')diversion three
5531 undivert(`1', `2', `foo', `3')dnl
5532 @result{}diversion one
5535 @result{}diversion three
5539 @section Diversion numbers
5541 @cindex diversion numbers
5542 The current diversion is tracked by the builtin @code{divnum}:
5544 @deffn Builtin divnum
5545 Expands to the number of the current diversion.
5552 Diversion one: divnum
5554 Diversion two: divnum
5557 @result{}Diversion one: 1
5559 @result{}Diversion two: 2
5563 @section Discarding diverted text
5565 @cindex discarding diverted text
5566 @cindex diverted text, discarding
5567 Often it is not known, when output is diverted, whether the diverted
text is actually needed. Since all non-empty diversions are brought back
on the main output stream when the end of input is seen, a method of
discarding a diversion is needed. If all diversions should be
discarded, the easiest way is to end the input to @code{m4} with
@samp{divert(`-1')} followed by an explicit @samp{undivert}:
5576 Diversion one: divnum
5578 Diversion two: divnum
5585 No output is produced at all.
5587 Clearing selected diversions can be done with the following macro:
5589 @deffn Composite cleardivert (@ovar{diversions@dots{}})
5590 Discard the contents of each of the listed numeric @var{diversions}.
5594 define(`cleardivert',
5595 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
It is called just like @code{undivert}, but the effect is to clear the
diversions given by the arguments. (This macro has a nasty bug! You
5601 should try to see if you can find it and correct it; or @pxref{Improved
5602 cleardivert, , Answers}).
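
A short usage sketch follows. It repeats the definition so that it is
self-contained, and the diversion contents @samp{one} and @samp{two} are
purely illustrative. Only the case of a single numeric argument is
shown; diversion 1 is cleared, so the final @code{undivert} brings back
only diversion 2.

@example
define(`cleardivert',
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
@result{}
divert(`1')one
divert(`2')two
divert`'dnl
cleardivert(`1')dnl
undivert
@result{}two
@result{}
@end example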
5605 @chapter Macros for text handling
5607 There are a number of builtins in @code{m4} for manipulating text in
5608 various ways, extracting substrings, searching, substituting, and so on.
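
As a small taste of how these builtins combine, here is a sketch of a
macro that returns the part of a string preceding a delimiter. The name
@code{before} is hypothetical and not part of @code{m4}; it relies only
on @code{index} and @code{substr}, both described below. If the
delimiter does not occur, @code{index} yields a negative length and the
expansion is empty.

@example
define(`before', `substr(`$1', `0', index(`$1', `$2'))')
@result{}
before(`key=value', `=')
@result{}key
@end example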
5611 * Len:: Calculating length of strings
5612 * Index macro:: Searching for substrings
5613 * Regexp:: Searching for regular expressions
5614 * Substr:: Extracting substrings
5615 * Translit:: Translating characters
5616 * Patsubst:: Substituting text by regular expression
5617 * Format:: Formatting strings (printf-like)
5621 @section Calculating length of strings
5623 @cindex length of strings
5624 @cindex strings, length of
5625 The length of a string can be calculated by @code{len}:
5627 @deffn Builtin len (@var{string})
5628 Expands to the length of @var{string}, as a decimal number.
5630 The macro @code{len} is recognized only with parameters.
5641 @section Searching for substrings
5643 @cindex substrings, locating
5644 Searching for substrings is done with @code{index}:
5646 @deffn Builtin index (@var{string}, @var{substring})
5647 Expands to the index of the first occurrence of @var{substring} in
5648 @var{string}. The first character in @var{string} has index 0. If
@var{substring} does not occur in @var{string}, @code{index} expands to
@samp{-1}.
5652 The macro @code{index} is recognized only with parameters.
5656 index(`gnus, gnats, and armadillos', `nat')
5658 index(`gnus, gnats, and armadillos', `dag')
5662 Omitting @var{substring} evokes a warning, but still produces output;
5663 contrast this with an empty @var{substring}.
5667 @error{}m4:stdin:1: Warning: too few arguments to builtin `index'
5676 @comment Expose a bug in the strstr() algorithm present in glibc
5677 @comment 2.9 through 2.12 and in gnulib up to Sep 2010.
5680 index(`;:11-:12-:12-:12-:12-:12-:12-:12-:12.:12.:12.:12.:12.:12.:12.:12.:12-',
5681 `:12-:12-:12-:12-:12-:12-:12-:12-')
5685 @comment Expose a bug in the gnulib replacement strstr() algorithm
5686 @comment present from Jun 2010 to Feb 2011, including m4 1.4.15.
5689 index(`..wi.d.', `.d.')
5695 @section Searching for regular expressions
5697 @cindex basic regular expressions
5698 @cindex regular expressions
5699 @cindex expressions, regular
5700 @cindex GNU extensions
Searching for regular expressions is done with the builtin
@code{regexp}:
5704 @deffn Builtin regexp (@var{string}, @var{regexp}, @ovar{replacement})
5705 Searches for @var{regexp} in @var{string}. The syntax for regular
5706 expressions is the same as in GNU Emacs, which is similar to
5707 BRE, Basic Regular Expressions in POSIX.
@xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs
Manual}.
5714 @uref{http://www.gnu.org/@/software/@/emacs/@/manual/@/emacs.html#Regexps,
5715 Syntax of Regular Expressions} in the GNU Emacs Manual.
Support for ERE, Extended Regular Expressions, is not
5718 available, but will be added in GNU M4 2.0.
5720 If @var{replacement} is omitted, @code{regexp} expands to the index of
5721 the first match of @var{regexp} in @var{string}. If @var{regexp} does
5722 not match anywhere in @var{string}, it expands to -1.
5724 If @var{replacement} is supplied, and there was a match, @code{regexp}
5725 changes the expansion to this argument, with @samp{\@var{n}} substituted
5726 by the text matched by the @var{n}th parenthesized sub-expression of
5727 @var{regexp}, up to nine sub-expressions. The escape @samp{\&} is
5728 replaced by the text of the entire regular expression matched. For
5729 all other characters, @samp{\} treats the next character literally. A
5730 warning is issued if there were fewer sub-expressions than the
5731 @samp{\@var{n}} requested, or if there is a trailing @samp{\}. If there
5732 was no match, @code{regexp} expands to the empty string.
5734 The macro @code{regexp} is recognized only with parameters.
5738 regexp(`GNUs not Unix', `\<[a-z]\w+')
5740 regexp(`GNUs not Unix', `\<Q\w*')
5742 regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
5743 @result{}*** Unix *** nix ***
5744 regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
5748 Here are some more examples on the handling of backslash:
5751 regexp(`abc', `\(b\)', `\\\10\a')
5753 regexp(`abc', `b', `\1\')
5754 @error{}m4:stdin:2: Warning: sub-expression 1 not present
5755 @error{}m4:stdin:2: Warning: trailing \ ignored in replacement
5757 regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
5758 @error{}m4:stdin:3: Warning: sub-expression 4 not present
5759 @error{}m4:stdin:3: Warning: sub-expression 5 not present
5760 @error{}m4:stdin:3: Warning: sub-expression 6 not present
5764 Omitting @var{regexp} evokes a warning, but still produces output;
5765 contrast this with an empty @var{regexp} argument.
5769 @error{}m4:stdin:1: Warning: too few arguments to builtin `regexp'
5773 regexp(`abc', `', `\\def')
5778 @section Extracting substrings
5780 @cindex extracting substrings
5781 @cindex substrings, extracting
5782 Substrings are extracted with @code{substr}:
5784 @deffn Builtin substr (@var{string}, @var{from}, @ovar{length})
5785 Expands to the substring of @var{string}, which starts at index
5786 @var{from}, and extends for @var{length} characters, or to the end of
5787 @var{string}, if @var{length} is omitted. The starting index of a string
5788 is always 0. The expansion is empty if there is an error parsing
5789 @var{from} or @var{length}, if @var{from} is beyond the end of
5790 @var{string}, or if @var{length} is negative.
5792 The macro @code{substr} is recognized only with parameters.
5796 substr(`gnus, gnats, and armadillos', `6')
5797 @result{}gnats, and armadillos
5798 substr(`gnus, gnats, and armadillos', `6', `5')
5802 Omitting @var{from} evokes a warning, but still produces output.
5806 @error{}m4:stdin:1: Warning: too few arguments to builtin `substr'
5809 @error{}m4:stdin:2: empty string treated as 0 in builtin `substr'
5814 @section Translating characters
5816 @cindex translating characters
5817 @cindex characters, translating
5818 Character translation is done with @code{translit}:
5820 @deffn Builtin translit (@var{string}, @var{chars}, @ovar{replacement})
5821 Expands to @var{string}, with each character that occurs in
@var{chars} translated into the character from @var{replacement} with
the same index.
5825 If @var{replacement} is shorter than @var{chars}, the excess characters
5826 of @var{chars} are deleted from the expansion; if @var{chars} is
5827 shorter, the excess characters in @var{replacement} are silently
5828 ignored. If @var{replacement} is omitted, all characters in
5829 @var{string} that are present in @var{chars} are deleted from the
5830 expansion. If a character appears more than once in @var{chars}, only
5831 the first instance is used in making the translation. Only a single
5832 translation pass is made, even if characters in @var{replacement} also
5833 appear in @var{chars}.
5835 As a GNU extension, both @var{chars} and @var{replacement} can
5836 contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
5837 letters) or @samp{0-9} (meaning all digits). To include a dash @samp{-}
5838 in @var{chars} or @var{replacement}, place it first or last in the
5839 entire string, or as the last character of a range. Back-to-back ranges
5840 can share a common endpoint. It is not an error for the last character
5841 in the range to be `larger' than the first. In that case, the range
5842 runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
5843 The expansion of a range is dependent on the underlying encoding of
5844 characters, so using ranges is not always portable between machines.
5846 The macro @code{translit} is recognized only with parameters.
5850 translit(`GNUs not Unix', `A-Z')
5852 translit(`GNUs not Unix', `a-z', `A-Z')
5853 @result{}GNUS NOT UNIX
5854 translit(`GNUs not Unix', `A-Z', `z-a')
5855 @result{}tmfs not fnix
5856 translit(`+,-12345', `+--1-5', `<;>a-c-a')
5858 translit(`abcdef', `aabdef', `bcged')
5862 In the @sc{ascii} encoding, the first example deletes all uppercase
5863 letters, the second converts lowercase to uppercase, and the third
5864 `mirrors' all uppercase letters, while converting them to lowercase.
The first two cases are by far the most common, even though they are not
5866 portable to @sc{ebcdic} or other encodings. The fourth example shows a
5867 range ending in @samp{-}, as well as back-to-back ranges. The final
5868 example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
5869 resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
5870 @samp{e} are swapped, and the @samp{f} is discarded.
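
Back-to-back ranges also make it easy to express a rotation of the
alphabet. The following sketch defines a hypothetical @code{rot13} macro
(the name is an assumption, not a builtin) and, like the examples above,
assumes the @sc{ascii} encoding. Applying it twice returns the original
text.

@example
define(`rot13', `translit(`$1', `a-zA-Z', `n-za-mN-ZA-M')')
@result{}
rot13(`Hello')
@result{}Uryyb
rot13(rot13(`Hello'))
@result{}Hello
@end example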
5873 @comment No need to fight 8-bit characters, as it is difficult to get
5874 @comment rendering right in both info and dvi.
5877 translit(`«abc~', `~-»')
@comment Stress test short arguments, since they use a different code
@comment path.
5884 translit(`abcdeabcde', `a')
5886 translit(`abcdeabcde', `ab')
5888 translit(`abcdeabcde', `a', `f')
5890 translit(`abcdeabcde', `a', `f')
5892 translit(`abcdeabcde', `a', `fg')
5894 translit(`abcdeabcde', `ab', `f')
5896 translit(`abcdeabcde', `ab', `fg')
5898 translit(`abcdeabcde', `ab', `ba')
5900 translit(`abcdeabcde', `e', `f')
5902 translit(`abc', `', `cde')
5904 translit(`', `a', `bc')
5909 Omitting @var{chars} evokes a warning, but still produces output.
5913 @error{}m4:stdin:1: Warning: too few arguments to builtin `translit'
5918 @section Substituting text by regular expression
5920 @cindex basic regular expressions
5921 @cindex regular expressions
5922 @cindex expressions, regular
5923 @cindex pattern substitution
5924 @cindex substitution by regular expression
5925 @cindex GNU extensions
5926 Global substitution in a string is done by @code{patsubst}:
5928 @deffn Builtin patsubst (@var{string}, @var{regexp}, @ovar{replacement})
5929 Searches @var{string} for matches of @var{regexp}, and substitutes
5930 @var{replacement} for each match. The syntax for regular expressions
5931 is the same as in GNU Emacs (@pxref{Regexp}).
5933 The parts of @var{string} that are not covered by any match of
5934 @var{regexp} are copied to the expansion. Whenever a match is found, the
5935 search proceeds from the end of the match, so a character from
5936 @var{string} will never be substituted twice. If @var{regexp} matches a
5937 string of zero length, the start position for the search is incremented,
5938 to avoid infinite loops.
5940 When a replacement is to be made, @var{replacement} is inserted into
5941 the expansion, with @samp{\@var{n}} substituted by the text matched by
the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
5943 nine sub-expressions. The escape @samp{\&} is replaced by the text of
5944 the entire regular expression matched. For all other characters,
5945 @samp{\} treats the next character literally. A warning is issued if
5946 there were fewer sub-expressions than the @samp{\@var{n}} requested, or
5947 if there is a trailing @samp{\}.
5949 The @var{replacement} argument can be omitted, in which case the text
5950 matched by @var{regexp} is deleted.
5952 The macro @code{patsubst} is recognized only with parameters.
5956 patsubst(`GNUs not Unix', `^', `OBS: ')
5957 @result{}OBS: GNUs not Unix
5958 patsubst(`GNUs not Unix', `\<', `OBS: ')
5959 @result{}OBS: GNUs OBS: not OBS: Unix
5960 patsubst(`GNUs not Unix', `\w*', `(\&)')
5961 @result{}(GNUs)() (not)() (Unix)()
5962 patsubst(`GNUs not Unix', `\w+', `(\&)')
5963 @result{}(GNUs) (not) (Unix)
5964 patsubst(`GNUs not Unix', `[A-Z][a-z]+')
5965 @result{}GN not@w{ }
5966 patsubst(`GNUs not Unix', `not', `NOT\')
5967 @error{}m4:stdin:6: Warning: trailing \ ignored in replacement
5968 @result{}GNUs NOT Unix
5971 Here is a slightly more realistic example, which capitalizes individual
5972 words or whole sentences, by substituting calls of the macros
5973 @code{upcase} and @code{downcase} into the strings.
5975 @deffn Composite upcase (@var{text})
5976 @deffnx Composite downcase (@var{text})
5977 @deffnx Composite capitalize (@var{text})
5978 Expand to @var{text}, but with capitalization changed: @code{upcase}
5979 changes all letters to upper case, @code{downcase} changes all letters
5980 to lower case, and @code{capitalize} changes the first character of each
5981 word to upper case and the remaining characters to lower case.
5984 First, an example of their usage, using implementations distributed in
5985 @file{m4-@value{VERSION}/@/examples/@/capitalize.m4}.
5989 $ @kbd{m4 -I examples}
5990 include(`capitalize.m4')
5992 upcase(`GNUs not Unix')
5993 @result{}GNUS NOT UNIX
5994 downcase(`GNUs not Unix')
5995 @result{}gnus not unix
5996 capitalize(`GNUs not Unix')
5997 @result{}Gnus Not Unix
6000 Now for the implementation. There is a helper macro @code{_capitalize}
6001 which puts only its first word in mixed case. Then @code{capitalize}
6002 merely parses out the words, and replaces them with an invocation of
6003 @code{_capitalize}. (As presented here, the @code{capitalize} macro has
6004 some subtle flaws. You should try to see if you can find and correct
6005 them; or @pxref{Improved capitalize, , Answers}).
6009 $ @kbd{m4 -I examples}
6010 undivert(`capitalize.m4')dnl
6011 @result{}divert(`-1')
6012 @result{}# upcase(text)
6013 @result{}# downcase(text)
6014 @result{}# capitalize(text)
6015 @result{}# change case of text, simple version
6016 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
6017 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
6018 @result{}define(`_capitalize',
6019 @result{} `regexp(`$1', `^\(\w\)\(\w*\)',
6020 @result{} `upcase(`\1')`'downcase(`\2')')')
6021 @result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
6022 @result{}divert`'dnl
6025 While @code{regexp} replaces the whole input with the replacement as
6026 soon as there is a match, @code{patsubst} replaces each
6027 @emph{occurrence} of a match and preserves non-matching pieces:
6033 patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
6034 @result{}bar FOO baz FOO
6036 patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
6037 @result{}bab abb 212
6041 Omitting @var{regexp} evokes a warning, but still produces output;
6042 contrast this with an empty @var{regexp} argument.
6046 @error{}m4:stdin:1: Warning: too few arguments to builtin `patsubst'
6050 patsubst(`abc', `', `\\-')
6051 @result{}\-a\-b\-c\-
6055 @section Formatting strings (printf-like)
6057 @cindex formatted output
6058 @cindex output, formatted
6059 @cindex GNU extensions
6060 Formatted output can be made with @code{format}:
6062 @deffn Builtin format (@var{format-string}, @dots{})
6063 Works much like the C function @code{printf}. The first argument
6064 @var{format-string} can contain @samp{%} specifications which are
6065 satisfied by additional arguments, and the expansion of @code{format} is
6066 the formatted string.
6068 The macro @code{format} is recognized only with parameters.
6071 Its use is best described by a few examples:
6073 @comment This test is a bit fragile, if someone tries to port to a
6074 @comment platform without infinity.
6076 define(`foo', `The brown fox jumped over the lazy dog')
6078 format(`The string "%s" uses %d characters', foo, len(foo))
6079 @result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
6080 format(`%*.*d', `-1', `-1', `1')
6082 format(`%.0f', `56789.9876')
6084 len(format(`%-*X', `5000', `1'))
6086 ifelse(format(`%010F', `infinity'), ` INF', `success',
6087 format(`%010F', `infinity'), ` INFINITY', `success',
6088 format(`%010F', `infinity'))
6090 ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success',
6091 format(`%.1A', `1.999'), `0X2.0P+0', `success',
6092 format(`%.1A', `1.999'))
6094 format(`%g', `0xa.P+1')
6098 Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
6099 example shows how @code{format} can be used to produce tabular output.
6103 $ @kbd{m4 -I examples}
6104 include(`forloop.m4')
6106 forloop(`i', `1', `10', `format(`%6d squared is %10d
6108 @result{} 1 squared is 1
6109 @result{} 2 squared is 4
6110 @result{} 3 squared is 9
6111 @result{} 4 squared is 16
6112 @result{} 5 squared is 25
6113 @result{} 6 squared is 36
6114 @result{} 7 squared is 49
6115 @result{} 8 squared is 64
6116 @result{} 9 squared is 81
6117 @result{} 10 squared is 100
6121 The builtin @code{format} is modeled after the ANSI C @samp{printf}
6122 function, and supports these @samp{%} specifiers: @samp{c}, @samp{s},
6123 @samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A},
6124 @samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and
6125 @samp{%}; it supports field widths and precisions, and the flags
6126 @samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}. For
6127 integer specifiers, the width modifiers @samp{hh}, @samp{h}, and
6128 @samp{l} are recognized, and for floating point specifiers, the width
6129 modifier @samp{l} is recognized. Items not yet supported include
6130 positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C}
6131 specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L} and @samp{ll}
6132 modifiers, and any platform extensions available in the native
6133 @code{printf}. For more details on the functioning of @code{printf},
6134 see the C Library Manual, or the POSIX specification (for
6135 example, @samp{%a} is supported even on platforms that haven't yet
6136 implemented C99 hexadecimal floating point output natively).
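
A few of the supported flags, widths, and precisions can be sketched as
follows. The values are arbitrary, and the trailing @samp{|} in the
second call is there only to make the padding visible.

@example
format(`%05d', `42')
@result{}00042
format(`%-6s|', `ab')
@result{}ab    |
format(`%x', `255')
@result{}ff
format(`%+d', `7')
@result{}+7
format(`%.2f', `3.14159')
@result{}3.14
@end example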
6138 Unrecognized specifiers result in a warning. It is anticipated that a
6139 future release of GNU @code{m4} will support more specifiers,
6140 and give better warnings when various problems such as overflow are
6141 encountered. Likewise, escape sequences are not yet recognized.
6145 @error{}m4:stdin:1: Warning: unrecognized specifier in `%p'
6150 @comment Expose a crash with a bad format string fixed in 1.4.15.
@comment Unfortunately, 8-bit bytes are hard to check for; but the
6152 @comment exit status is enough to sniff the crash in broken versions.
6154 @comment xerr: ignore
6156 format(`%'format(`%c', `128'))
6162 @chapter Macros for doing arithmetic
6165 @cindex integer arithmetic
6166 Integer arithmetic is included in @code{m4}, with a C-like syntax. As
6167 convenient shorthands, there are builtins for simple increment and
6168 decrement operations.
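
As a brief sketch of how the shorthands relate to general expression
evaluation, the following uses a hypothetical macro @code{count} (the
name is only illustrative) to hold a value. Both @code{incr} and
@code{eval} are described in the sections below.

@example
define(`count', `41')
@result{}
incr(count)
@result{}42
eval(count` + 1')
@result{}42
decr(decr(count))
@result{}39
@end example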
6171 * Incr:: Decrement and increment operators
6172 * Eval:: Evaluating integer expressions
6176 @section Decrement and increment operators
6178 @cindex decrement operator
6179 @cindex increment operator
6180 Increment and decrement of integers are supported using the builtins
6181 @code{incr} and @code{decr}:
6183 @deffn Builtin incr (@var{number})
6184 @deffnx Builtin decr (@var{number})
6185 Expand to the numerical value of @var{number}, incremented
6186 or decremented, respectively, by one. Except for the empty string, the
6187 expansion is empty if @var{number} could not be parsed.
The macros @code{incr} and @code{decr} are recognized only with
parameters.
6199 @error{}m4:stdin:3: empty string treated as 0 in builtin `incr'
6202 @error{}m4:stdin:4: empty string treated as 0 in builtin `decr'
6207 @section Evaluating integer expressions
6209 @cindex integer expression evaluation
6210 @cindex evaluation, of integer expressions
6211 @cindex expressions, evaluation of integer
6212 Integer expressions are evaluated with @code{eval}:
6214 @deffn Builtin eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
6215 Expands to the value of @var{expression}. The expansion is empty
6216 if a problem is encountered while parsing the arguments. If specified,
6217 @var{radix} and @var{width} control the format of the output.
6219 Calculations are done with 32-bit signed numbers. Overflow silently
6220 results in wraparound. A warning is issued if division by zero is
6221 attempted, or if @var{expression} could not be parsed.
6223 Expressions can contain the following operators, listed in order of
6224 decreasing precedence.
6230 Unary plus and minus, and bitwise and logical negation
6234 Multiplication, division, and modulo
6236 Addition and subtraction
6240 Relational operators
6246 Bitwise exclusive-or
6255 The macro @code{eval} is recognized only with parameters.
6258 All binary operators, except exponentiation, are left associative. C
6259 operators that perform variable assignment, such as @samp{+=} or
6260 @samp{--}, are not implemented, since @code{eval} only operates on
6261 constants, not variables. Attempting to use them results in an error.
6262 However, since traditional implementations treated @samp{=} as an
6263 undocumented alias for @samp{==} as opposed to an assignment operator,
6264 this usage is supported as a special case. Be aware that a future
6265 version of GNU M4 may support assignment semantics as an
6266 extension when POSIX mode is not requested, and that using
6267 @samp{=} to check equality is not portable.
6272 @error{}m4:stdin:1: Warning: recommend ==, not =, for equality operator
6275 @error{}m4:stdin:2: invalid operator in eval: ++0
6278 @error{}m4:stdin:3: invalid operator in eval: 0 |= 1
6282 Note that some older @code{m4} implementations use @samp{^} as an
6283 alternate operator for exponentiation, although POSIX
6284 requires the C behavior of bitwise exclusive-or. The precedence of the
6285 negation operators, @samp{~} and @samp{!}, was traditionally lower than
6286 equality. The unary operators could not be used reliably more than once
6287 on the same term without intervening parentheses. The traditional
6288 precedence of the equality operators @samp{==} and @samp{!=} was
6289 identical instead of lower than the relational operators such as
6290 @samp{<}, even through GNU M4 1.4.8. Starting with version
6291 1.4.9, GNU M4 correctly follows POSIX precedence
6292 rules. M4 scripts designed to be portable between releases must be
6293 aware that parentheses may be required to enforce C precedence rules.
6294 Likewise, division by zero, even in the unused branch of a
6295 short-circuiting operator, is not always well-defined in other
6298 Following are some examples where the current version of M4 follows C
6299 precedence rules, but where older versions and some other
6300 implementations of @code{m4} require explicit parentheses to get the
6306 eval(`(1 == 2) > 0')
6316 eval(`+ + - ~ ! ~ 0')
6321 @error{}m4:stdin:9: divide by zero in eval: 0 || 1 / 0
6326 @error{}m4:stdin:11: modulo by zero in eval: 2 && 1 % 0
6330 @cindex GNU extensions
6331 As a GNU extension, the operator @samp{**} performs integral
6332 exponentiation. The operator is right-associative, and if evaluated,
6333 the exponent must be non-negative, and at least one of the arguments
6334 must be non-zero, or a warning is issued.
6339 eval(`(2 ** 3) ** 2')
6347 @error{}m4:stdin:5: divide by zero in eval: 0 ** 0
6349 @error{}m4:stdin:6: negative exponent in eval: 4 ** -2
6353 Within @var{expression} (but not @var{radix} or @var{width}), numbers
6354 without a special prefix are decimal. A simple @samp{0} prefix
6355 introduces an octal number. @samp{0x} introduces a hexadecimal number.
6356 As GNU extensions, @samp{0b} introduces a binary number, and
6357 @samp{0r} introduces a number expressed in any radix between 1 and 36:
6358 the prefix should be immediately followed by the decimal expression of
6359 the radix, a colon, then the digits making the number. For radix 1,
6360 leading zeros are ignored, and all remaining digits must be @samp{1};
6361 for all other radices, the digits are @samp{0}, @samp{1}, @samp{2},
6362 @dots{}. Beyond @samp{9}, the digits are @samp{a}, @samp{b} @dots{} up
6363 to @samp{z}. Lower and upper case letters can be used interchangeably
6364 in number prefixes and as number digits.
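
The prefix rules above can be illustrated with a few one-line
evaluations. This is only a sketch; it assumes GNU extensions are
enabled, since @samp{0b} and @samp{0r} are not portable.

@example
eval(`010')
@result{}8
eval(`0x1f')
@result{}31
eval(`0b101')
@result{}5
eval(`0r36:z')
@result{}35
eval(`0XfF')
@result{}255
@end example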
6366 Parentheses may be used to group subexpressions whenever needed. For the
6367 relational operators, a true relation returns @code{1}, and a false
6368 relation returns @code{0}.
6370 Here are a few examples of use of @code{eval}.
6381 eval(index(`Hello world', `llo') >= 0)
6383 eval(`0r1:0111 + 0b100 + 0r3:12')
6385 define(`square', `eval(`($1) ** 2')')
6389 square(square(`5')` + 1')
6391 define(`foo', `666')
6394 @error{}m4:stdin:11: bad expression in eval: foo / 6
6400 As the last two lines show, @code{eval} does not handle macro
6401 names, even if they expand to a valid expression (or part of a valid
6402 expression). Therefore all macros must be expanded before they are
6403 passed to @code{eval}.
6405 Some calculations are not portable to other implementations, since they
6406 have undefined semantics in C, but GNU @code{m4} has
6407 well-defined behavior on overflow. When shifting, an out-of-range shift
6408 amount is implicitly brought into the range of 32-bit signed integers
6409 (using an implicit bit-wise and with 0x1f).
6412 define(`max_int', eval(`0x7fffffff'))
6414 define(`min_int', incr(max_int))
6420 ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
6421 @result{}overflow occurred
6423 @result{}-2147483648
6424 eval(`0x80000000 % -1')
6432 If @var{radix} is specified, it specifies the radix to be used in the
6433 expansion. The default radix is 10; this is also the case if
6434 @var{radix} is the empty string. A warning results if the radix is
6435 outside the range of 1 through 36, inclusive. The result of @code{eval}
6436 is always taken to be signed. No radix prefix is output, and for
6437 radices greater than 10, the digits are lower case. The @var{width}
6438 argument specifies the minimum output width, excluding any negative
6439 sign. The result is zero-padded to extend the expansion to the
6440 requested width. A warning results if the width is negative. If
6441 @var{radix} or @var{width} is out of bounds, the expansion of
6442 @code{eval} is empty.
6451 eval(`666', `6', `10')
6453 eval(`-666', `6', `10')
6454 @result{}-0000003030
6457 `0r1:'eval(`10', `1', `11')
6458 @result{}0r1:01111111111
6462 @error{}m4:stdin:9: radix 37 in builtin `eval' out of range
6465 @error{}m4:stdin:10: negative width to builtin `eval'
6468 @error{}m4:stdin:11: empty string treated as 0 in builtin `eval'
6472 @node Shell commands
6473 @chapter Macros for running shell commands
6475 @cindex UNIX commands, running
6476 @cindex executing shell commands
6477 @cindex running shell commands
6478 @cindex shell commands, running
6479 @cindex commands, running shell
6480 There are a few builtin macros in @code{m4} that allow you to run shell
6481 commands from within @code{m4}.
6483 Note that the definition of a valid shell command is system dependent.
6484 On UNIX systems, this is the typical @command{/bin/sh}. But on other
6485 systems, such as native Windows, the shell has a different syntax of
6486 commands that it understands. Some examples in this chapter assume
6487 @command{/bin/sh}, and also demonstrate how to quit early with a known
6488 exit value if this is not the case.
6491 * Platform macros:: Determining the platform
6492 * Syscmd:: Executing simple commands
6493 * Esyscmd:: Reading the output of commands
6494 * Sysval:: Exit status
6495 * Mkstemp:: Making temporary files
6498 @node Platform macros
6499 @section Determining the platform
6501 @cindex platform macros
6502 Sometimes it is desirable for an input file to know which platform
6503 @code{m4} is running on. GNU @code{m4} provides several
6504 macros that are predefined to expand to the empty string; checking for
6505 their existence will confirm platform details.
6507 @deffn {Optional builtin} __gnu__
6508 @deffnx {Optional builtin} __os2__
6509 @deffnx {Optional builtin} os2
6510 @deffnx {Optional builtin} __unix__
6511 @deffnx {Optional builtin} unix
6512 @deffnx {Optional builtin} __windows__
6513 @deffnx {Optional builtin} windows
6514 Each of these macros is conditionally defined as needed to describe the
6515 environment of @code{m4}. If defined, each macro expands to the empty
6516 string. For now, these macros silently ignore all arguments, but in a
6517 future release of M4, they might warn if arguments are present.
6520 When GNU extensions are in effect (that is, when you did not
6521 use the @option{-G} option, @pxref{Limits control, , Invoking m4}),
6522 GNU @code{m4} will define the macro @code{@w{__gnu__}} to
6523 expand to the empty string.
6531 Extensions are ifdef(`__gnu__', `active', `inactive')
6532 @result{}Extensions are active
6535 @comment options: -G
6541 @result{}__gnu__(ignored)
6542 Extensions are ifdef(`__gnu__', `active', `inactive')
6543 @result{}Extensions are inactive
6546 On UNIX systems, GNU @code{m4} will define @code{@w{__unix__}}
6547 by default, or @code{unix} when the @option{-G} option is specified.
6549 On native Windows systems, GNU @code{m4} will define
6550 @code{@w{__windows__}} by default, or @code{windows} when the
6551 @option{-G} option is specified.
6553 On OS/2 systems, GNU @code{m4} will define @code{@w{__os2__}}
6554 by default, or @code{os2} when the @option{-G} option is specified.
6556 If GNU @code{m4} does not provide a platform macro for your system,
6557 please report that as a bug.
6560 define(`provided', `0')
6562 ifdef(`__unix__', `define(`provided', incr(provided))')
6564 ifdef(`__windows__', `define(`provided', incr(provided))')
6566 ifdef(`__os2__', `define(`provided', incr(provided))')
6573 @section Executing simple commands
6575 Any shell command can be executed, using @code{syscmd}:
6577 @deffn Builtin syscmd (@var{shell-command})
6578 Executes @var{shell-command} as a shell command.
6580 The expansion of @code{syscmd} is void, @emph{not} the output from
6581 @var{shell-command}! Output or error messages from @var{shell-command}
6582 are not read by @code{m4}. @xref{Esyscmd}, if you need to process the
6585 Prior to executing the command, @code{m4} flushes its buffers.
6586 The default standard input, output and error of @var{shell-command} are
6587 the same as those of @code{m4}.
6589 By default, the @var{shell-command} will be used as the argument to the
6590 @option{-c} option of the @command{/bin/sh} shell (or the version of
6591 @command{sh} specified by @samp{command -p getconf PATH}, if your system
6592 supports that). If you prefer a different shell, the
6593 @command{configure} script can be given the option
6594 @option{--with-syscmd-shell=@var{location}} to set the location of an
6595 alternative shell at GNU @code{m4} installation; the
6596 alternative shell must still support @option{-c}.
6598 The macro @code{syscmd} is recognized only with parameters.
6602 define(`foo', `FOO')
6609 Note how the expansion of @code{syscmd} keeps the trailing newline of
6610 the command, as well as using the newline that appeared after the macro.
6612 The following is an example of @var{shell-command} using the same
6613 standard input as @code{m4}:
6617 $ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
6622 @comment If the user types the example below with stdin being an
6623 @comment interactive terminal, then cat will hang waiting for additional
6624 @comment input after m4 has exited. But the testsuite is using a pipe
6625 @comment for stdin. Hence, we have two versions - the one we feed the
6626 @comment testsuite below, and the one we display to the user above that
6627 @comment more accurately shows what the testsuite is really doing but
6628 @comment which the testsuite cannot parse.
6631 m4wrap(`syscmd(`cat')')
6637 It tells @code{m4} to read all of its input before executing the wrapped
6638 text, then hand a valid (albeit emptied) pipe as standard input for the
6639 @code{cat} subcommand. Therefore, you should be careful when using
6640 standard input (either by specifying no files, or by passing @samp{-} as
6641 a file name on the command line, @pxref{Command line files, , Invoking
6642 m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd}
6643 that consume data from standard input. When standard input is a
6644 seekable file, the subprocess will pick up with the next character not
6645 yet processed by @code{m4}; when it is a pipe or other non-seekable
6646 file, there is no guarantee how much data will already be buffered by
6647 @code{m4} and thus unavailable to the child.
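
One possible way to keep a child process from consuming input meant
for @code{m4} is to redirect the child's standard input explicitly.
The following sketch assumes that @code{syscmd} invokes a
@command{/bin/sh}-style shell and that @file{/dev/null} exists.

@example
syscmd(`cat </dev/null')dnl
this line is still read by m4
@result{}this line is still read by m4
@end example

Because @command{cat} reads from @file{/dev/null} rather than from the
shared standard input, the remaining input is left for @code{m4}.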
6650 @section Reading the output of commands
6652 @cindex GNU extensions
6653 If you want @code{m4} to read the output of a shell command, use
6656 @deffn Builtin esyscmd (@var{shell-command})
6657 Expands to the standard output of the shell command
6658 @var{shell-command}.
6660 Prior to executing the command, @code{m4} flushes its buffers.
6661 The default standard input and standard error of @var{shell-command} are
6662 the same as those of @code{m4}. The error output of @var{shell-command}
6663 is not a part of the expansion: it will appear along with the error
6664 output of @code{m4}.
6666 By default, the @var{shell-command} will be used as the argument to the
6667 @option{-c} option of the @command{/bin/sh} shell (or the version of
6668 @command{sh} specified by @samp{command -p getconf PATH}, if your system
6669 supports that). If you prefer a different shell, the
6670 @command{configure} script can be given the option
6671 @option{--with-syscmd-shell=@var{location}} to set the location of an
6672 alternative shell at GNU @code{m4} installation; the
6673 alternative shell must still support @option{-c}.
6675 The macro @code{esyscmd} is recognized only with parameters.
6679 define(`foo', `FOO')
6686 Note how the expansion of @code{esyscmd} keeps the trailing newline of
6687 the command, as well as using the newline that appeared after the macro.
6689 Just as with @code{syscmd}, care must be exercised when sharing standard
6690 input between @code{m4} and the child process of @code{esyscmd}.
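
When the trailing newline of the command output is unwanted, one
possible approach is to delete it with @code{translit}. This sketch
assumes a @command{/bin/sh}-style shell that provides @command{echo};
the macro name @code{greeting} is hypothetical. Note that this removes
every newline in the output, not only the last one.

@example
define(`greeting', translit(esyscmd(`echo hello'), `
'))
@result{}
[greeting]
@result{}[hello]
@end example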
6693 @section Exit status
6695 @cindex UNIX commands, exit status from
6696 @cindex exit status from shell commands
6697 @cindex shell commands, exit status from
6698 @cindex commands, exit status from shell
6699 @cindex status of shell commands
6700 To see whether a shell command succeeded, use @code{sysval}:
6702 @deffn Builtin sysval
6703 Expands to the exit status of the last shell command run with
6704 @code{syscmd} or @code{esyscmd}. Expands to 0 if no command has been
6713 ifelse(sysval, `0', `zero', `non-zero')
6725 ifelse(sysval, `0', `zero', `non-zero')
6727 esyscmd(`echo dnl && exit 127')
6737 @code{sysval} results in 127 if there was a problem executing the
6738 command, for example, if the system-imposed argument length is exceeded,
6739 or if there were not enough resources to fork. It is not possible to
6740 distinguish between failed execution and successful execution that had
6741 an exit status of 127, unless there was output from the child process.
6743 On UNIX platforms, where it is possible to detect when command execution
6744 is terminated by a signal, rather than a normal exit, the result is the
6745 signal number shifted left by eight bits.
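
Since a normal exit status never exceeds 255, a @code{sysval} of 256 or
more indicates termination by a signal under this encoding, and the
signal number can be recovered with a right shift. The following is
only a sketch; the helper name @code{describe_status} is hypothetical,
and literal values are used in place of @code{sysval} so that the
example does not depend on the platform.

@example
eval(`9 << 8')
@result{}2304
define(`describe_status',
`ifelse(eval(`$1 >= 256'), `1', `signal eval(`$1 >> 8')', `exit $1')')
@result{}
describe_status(`2304')
@result{}signal 9
describe_status(`127')
@result{}exit 127
@end example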
6747 @comment This test has difficulties being portable, even on platforms
6748 @comment where syscmd invokes /bin/sh. Kill is not portable with signal
6749 @comment names. According to autoconf, the only portable signal numbers
6750 @comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM). But
6751 @comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
6752 @comment exits normally rather than letting the signal terminate it).
6753 @comment Also, TERM is flaky, as it can also kill the running m4 on
6754 @comment systems where /bin/sh does not create its own process group.
6755 @comment And PIPE is unreliable, since people tend to run with it
6756 @comment ignored, with m4 inheriting that choice. That leaves KILL as
6757 @comment the only signal we can reliably test.
6759 dnl This test assumes kill is a shell builtin, and that signals are
6762 `errprint(` skipping: syscmd does not have unix semantics
6764 syscmd(`kill -9 $$')
6772 esyscmd(`kill -9 $$')
6779 @section Making temporary files
6781 @cindex temporary file names
6782 @cindex files, names of temporary
6783 Commands specified to @code{syscmd} or @code{esyscmd} might need a
6784 temporary file, for output or for some other purpose. There is a
6785 builtin macro, @code{mkstemp}, for making a temporary file:
6787 @deffn Builtin mkstemp (@var{template})
6788 @deffnx Builtin maketemp (@var{template})
6789 Expands to the quoted name of a new, empty file, made from the string
6790 @var{template}, which should end with the string @samp{XXXXXX}. The six
6791 @samp{X} characters are then replaced with random characters matching
6792 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
6793 name unique. If fewer than six @samp{X} characters are found at the end
6794 of @var{template}, the result will be longer than the template. The
6795 created file will have access permissions as if by @kbd{chmod =rw,go=},
6796 meaning that the current umask of the @code{m4} process is taken into
6797 account, and at most only the current user can read and write the file.
6799 The traditional behavior, standardized by POSIX, is that
6800 @code{maketemp} merely replaces the trailing @samp{X} with the process
6801 id, without creating a file or quoting the expansion, and without
6802 ensuring that the resulting
6803 string is a unique file name. In part, this means that using the same
6804 @var{template} twice in the same input file will result in the same
6805 expansion. This behavior is a security hole, as it is very easy for
6806 another process to guess the name that will be generated, and thus
6807 interfere with a subsequent use of @code{syscmd} trying to manipulate
6808 that file name. Hence, POSIX has recommended that all new
6809 implementations of @code{m4} provide the secure @code{mkstemp} builtin,
6810 and that users of @code{m4} check for its existence.
6812 The expansion is void and an error issued if a temporary file could
6815 The macros @code{mkstemp} and @code{maketemp} are recognized only with
6819 If you try this next example, you will most likely get different output
6820 for the two file names, since the replacement characters are randomly
6826 define(`tmp', `oops')
6828 maketemp(`/tmp/fooXXXXXX')
6829 @result{}/tmp/fooa07346
6830 ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
6831 `define(`mkstemp', defn(`maketemp'))dnl
6832 errprint(`warning: potentially insecure maketemp implementation
6839 @cindex GNU extensions
6840 Unless you use the @option{--traditional} command line option (or
6841 @option{-G}, @pxref{Limits control, , Invoking m4}), the GNU
6842 version of @code{maketemp} is secure. This means that using the same
6843 template in multiple calls will generate multiple files. However, we
6844 recommend that you use the new @code{mkstemp} macro, introduced in
6845 GNU M4 1.4.8, which is secure even in traditional mode. Also,
6846 as of M4 1.4.11, the secure implementation quotes the resulting file
6847 name, so that you are guaranteed to know what file was created even if
6848 the random file name happens to match an existing macro. Notice that
6849 this example is careful to use @code{defn} to avoid unintended expansion
6854 define(`foo', `errprint(`oops')')
6856 syscmd(`rm -f foo-??????')sysval
6858 define(`file1', maketemp(`foo-XXXXXX'))dnl
6859 ifelse(esyscmd(`echo \` foo-?????? \''), ` foo-?????? ',
6860 `no file', `created')
6862 define(`file2', maketemp(`foo-XX'))dnl
6863 define(`file3', mkstemp(`foo-XXXXXX'))dnl
6864 ifelse(len(defn(`file1')), len(defn(`file2')),
6865 `same length', `different')
6866 @result{}same length
6867 ifelse(defn(`file1'), defn(`file2'), `same', `different file')
6868 @result{}different file
6869 ifelse(defn(`file2'), defn(`file3'), `same', `different file')
6870 @result{}different file
6871 ifelse(defn(`file1'), defn(`file3'), `same', `different file')
6872 @result{}different file
6873 syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3'))
6880 @c Not worth documenting, but make sure we don't leave trailing NUL in
6884 syscmd(`rm -rf foodir')sysval
6886 syscmd(`mkdir foodir')sysval
6888 len(mkstemp(`foodir/fooXXXXX'))
6890 syscmd(`rm -r foodir')sysval
6894 @c Likewise, and ensure that traditional mode leaves the result unquoted
6895 @c without creating a file.
6897 @comment options: -G
6899 syscmd(`rm -f foo-*')sysval
6901 len(maketemp(`foo-XXXXX'))
6902 @error{}m4:stdin:2: recommend using mkstemp instead
6904 define(`abc', `def')
6908 @error{}m4:stdin:4: recommend using mkstemp instead
6909 syscmd(`test -f foo-*')ifelse(sysval, `0', `0', `1')
6915 @chapter Miscellaneous builtin macros
6917 This chapter describes various builtins that do not really belong in
6918 any of the previous chapters.
6921 * Errprint:: Printing error messages
6922 * Location:: Printing current location
6923 * M4exit:: Exiting from @code{m4}
6927 @section Printing error messages
6929 @cindex printing error messages
6930 @cindex error messages, printing
6931 @cindex messages, printing error
6932 @cindex standard error, output to
6933 You can print error messages using @code{errprint}:
6935 @deffn Builtin errprint (@var{message}, @dots{})
6936 Prints @var{message} and the rest of the arguments to standard error,
6937 separated by spaces. Standard error is used, regardless of the
6938 @option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
6940 The expansion of @code{errprint} is void.
6941 The macro @code{errprint} is recognized only with parameters.
6945 errprint(`Invalid arguments to forloop
6947 @error{}Invalid arguments to forloop
6949 errprint(`1')errprint(`2',`3
6955 A trailing newline is @emph{not} printed automatically, so it should be
6956 supplied as part of the argument, as in the example. Unfortunately, the
6957 exact output of @code{errprint} is not very portable to other @code{m4}
6958 implementations: POSIX requires that all arguments be printed,
6959 but some implementations of @code{m4} only print the first.
6960 Furthermore, some BSD implementations always append a newline
6961 for each @code{errprint} call, regardless of whether the last argument
6962 already had one, and POSIX is silent on whether this is
6966 @section Printing current location
6968 @cindex location, input
6969 @cindex input location
6970 To make it possible to specify the location of an error, three
6971 utility builtins exist:
6973 @deffn Builtin __file__
6974 @deffnx Builtin __line__
6975 @deffnx Builtin __program__
6976 Expand to the quoted name of the current input file, the
6977 current input line number in that file, and the quoted name of the
6978 current invocation of @code{m4}.
6982 errprint(__program__:__file__:__line__: `input error
6984 @error{}m4:stdin:1: input error
6988 Line numbers start at 1 for each file. If the file was found due to the
6989 @option{-I} option or @env{M4PATH} environment variable, that is
6990 reflected in the file name. The syncline option (@option{-s},
6991 @pxref{Preprocessor features, , Invoking m4}), and the
6992 @samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debug Levels}),
6993 also use this notion of current file and line. Redefining the three
6994 location macros has no effect on syncline, debug, warning, or error
6997 This example reuses the file @file{incl.m4} mentioned earlier
7002 $ @kbd{m4 -I examples}
7003 define(`foo', ``$0' called at __file__:__line__')
7006 @result{}foo called at stdin:2
7008 @result{}Include file start
7009 @result{}foo called at examples/incl.m4:2
7010 @result{}Include file end
7014 The location of macros invoked during the rescanning of macro expansion
7015 text corresponds to the location in the file where the expansion was
7016 triggered, regardless of how many newline characters the expansion text
7017 contains. As of GNU M4 1.4.8, the location of text wrapped
7018 with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
7019 @code{m4wrap} was invoked. Previous versions, however, behaved as
7020 though wrapped text came from line 0 of the file ``''.
7023 define(`echo', `$@@')
7025 define(`foo', `echo(__line__
7035 foo(errprint(__line__
7053 The @code{@w{__program__}} macro behaves like @samp{$0} in shell
7054 terminology. If you invoke @code{m4} through an absolute path or a link
7055 with a different spelling, rather than by relying on a @env{PATH} search
7056 for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
7057 The intent is that you can use it to produce error messages with the
7058 same formatting that @code{m4} produces internally. It can also be used
7059 within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
7060 @code{m4} that is currently running, rather than whatever version of
7061 @code{m4} happens to be first in @env{PATH}. It was first introduced in
7065 @section Exiting from @code{m4}
7067 @cindex exiting from @code{m4}
7068 @cindex status, setting @code{m4} exit
7069 If you need to exit from @code{m4} before the entire input has been
7070 read, you can use @code{m4exit}:
7072 @deffn Builtin m4exit (@dvar{code, 0})
7073 Causes @code{m4} to exit, with exit status @var{code}. If @var{code} is
7074 left out, the exit status is zero. If @var{code} cannot be parsed, or
7075 is outside the range of 0 to 255, the exit status is one. No further
7076 input is read, and all wrapped and diverted text is discarded.
7080 m4wrap(`This text is lost due to `m4exit'.')
7082 divert(`1') So is this.
7085 m4exit And this is never read.
7088 A common use of this is to abort processing:
7090 @deffn Composite fatal_error (@var{message})
7091 Abort processing with an error message and non-zero status. Prefix
7092 @var{message} with details about where the error occurred, and print the
7093 resulting string to standard error.
7098 define(`fatal_error',
7099 `errprint(__program__:__file__:__line__`: fatal error: $*
7102 fatal_error(`this is a BAD one, buster')
7103 @error{}m4:stdin:4: fatal error: this is a BAD one, buster
7106 After this macro call, @code{m4} will exit with exit status 1. This macro
7107 is only intended for error exits, since the normal exit procedures are
7108 not followed, i.e., diverted text is not undiverted, and saved text
7109 (@pxref{M4wrap}) is not reread. (This macro could be made more robust
7110 to earlier versions of @code{m4}. You should try to see if you can find
7111 weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
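
In the same spirit, @code{m4exit} can guard against a missing
precondition. The following sketch aborts when GNU extensions are
disabled; the macro name @code{require_gnu} and the wording of the
message are hypothetical. The output shown assumes that @code{m4} was
started without @option{-G}, so the guard expands to nothing.

@example
define(`require_gnu',
`ifdef(`__gnu__', `', `errprint(`GNU m4 extensions are required
')m4exit(`1')')')
@result{}
require_gnu
@result{}
@end example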
7113 Note that it is still possible for the exit status to be different than
7114 what was requested by @code{m4exit}. If @code{m4} detects some other
7115 error, such as a write error on standard output, the exit status will be
7116 non-zero even if @code{m4exit} requested zero.
7118 If standard input is seekable, then the file will be positioned at the
7119 next unread character. If it is a pipe or other non-seekable file,
7120 then there are no guarantees how much data @code{m4} might have read
7121 into buffers, and thus discarded.
7124 @chapter Fast loading of frozen state
7126 Some bigger @code{m4} applications may be built over a common base
7127 containing hundreds of definitions and other costly initializations.
7128 Usually, the common base is kept in one or more declarative files,
7129 which are listed on each @code{m4} invocation prior to the
7130 user's input file, or else each input file uses @code{include}.
7132 Reading the common base of a big application, over and over again, may
7133 be time consuming. GNU @code{m4} offers some machinery to
7134 speed up the start of an application using lengthy common bases.
7137 * Using frozen files:: Using frozen files
7138 * Frozen file format:: Frozen file format
7141 @node Using frozen files
7142 @section Using frozen files
7144 @cindex fast loading of frozen files
7145 @cindex frozen files for fast loading
7146 @cindex initialization, frozen state
7147 @cindex dumping into frozen file
7148 @cindex reloading a frozen file
7149 @cindex GNU extensions
7150 Suppose a user has a library of @code{m4} initializations in
7151 @file{base.m4}, which is then used with multiple input files:
7155 $ @kbd{m4 base.m4 input1.m4}
7156 $ @kbd{m4 base.m4 input2.m4}
7157 $ @kbd{m4 base.m4 input3.m4}
7160 Rather than spending time parsing the fixed contents of @file{base.m4}
7161 every time, the user might instead execute:
7165 $ @kbd{m4 -F base.m4f base.m4}
7169 once, and further execute, as often as needed:
7173 $ @kbd{m4 -R base.m4f input1.m4}
7174 $ @kbd{m4 -R base.m4f input2.m4}
7175 $ @kbd{m4 -R base.m4f input3.m4}
7179 with the varying input. The first call, containing the @option{-F}
7180 option, only reads and executes file @file{base.m4}, defining
7181 various application macros and computing other initializations.
7182 Once the input file @file{base.m4} has been completely processed, GNU
7183 @code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
7184 file which contains a kind of snapshot of the @code{m4} internal state.
7186 Later calls, containing the @option{-R} option, are able to reload
7187 the internal state of @code{m4}, from @file{base.m4f},
7188 @emph{prior} to reading any other input files. This means that,
7189 instead of starting with a fresh copy of @code{m4}, input will be
7190 read after having effectively recovered the effect of a prior run.
7191 In our example, the effect is the same as if file @file{base.m4} had
7192 been read anew. However, this effect is achieved a lot faster.
7194 Only one frozen file may be created or read in any one @code{m4}
7195 invocation. It is not possible to recover two frozen files at once.
7196 However, frozen files may be updated incrementally, through using
7197 @option{-R} and @option{-F} options simultaneously. For example, if
7198 some care is taken, the command:
7202 $ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
7206 could be broken down in the following sequence, accumulating the same
7211 $ @kbd{m4 -F file1.m4f file1.m4}
7212 $ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
7213 $ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
7214 $ @kbd{m4 -R file3.m4f file4.m4}
7217 Some care is necessary because not every effort has been made for
7218 this to work in all cases. In particular, the trace attribute of
7219 macros is not handled, nor the current setting of @code{changeword}.
7220 Currently, @code{m4wrap} and @code{sysval} also have problems.
7221 Also, interactions for some options of @code{m4}, being used in one call
7222 and not in the next, have not been fully analyzed yet. On the other
7223 hand, you may be confident that stacks of @code{pushdef} definitions
7224 are handled correctly, as well as undefined or renamed builtins, and
7225 changed strings for quotes or comments. And future releases of
7226 GNU M4 will improve on the utility of frozen files.
7229 @c This example is not worth putting in the manual, but caused core
7230 @c dumps in all versions prior to 1.4.11.
7232 @comment options: -F /dev/null
7234 traceon(`undefined')dnl
7237 @c Make sure freezing is successful.
7241 `errprint(` skipping: syscmd does not have unix semantics
7243 changequote(`[', `]')dnl
7244 syscmd([echo 'changequote([,])pushdef([divnum],[hi])dnl' \
7245 | ']__program__[' -F in.m4f \
7246 && echo 'divnum popdef([divnum])divnum' \
7247 | ']__program__[' -R in.m4f \
7248 && rm in.m4f])status sysval
7253 @c Detect inability to freeze.
7254 @c Some systems harden /, and fail with EACCES rather than ENOENT.
7256 @comment options: -F /none/such
7257 @comment xerr: ignore
7260 $ @kbd{m4 -F /none/such}
7262 @error{}m4: cannot open `/none/such': No such file or directory
7266 When an @code{m4} run is to be frozen, the automatic undiversion
7267 which takes place at end of execution is inhibited. Instead, all
7268 positively numbered diversions are saved into the frozen file.
7269 The active diversion number is also transmitted.
7271 A frozen file to be reloaded need not reside in the current directory.
7272 It is looked up the same way as an @code{include} file (@pxref{Search
7275 If the frozen file was generated with a newer version of @code{m4}, and
7276 contains directives that an older @code{m4} cannot parse, attempting to
7277 load the frozen file with option @option{-R} will cause @code{m4} to
7278 exit with status 63 to indicate version mismatch.
7280 @node Frozen file format
7281 @section Frozen file format
7283 @cindex frozen file format
7284 @cindex file format, frozen file
7285 Frozen files are sharable across architectures. It is safe to write
7286 a frozen file on one machine and read it on another, given that the
7287 second machine uses the same or newer version of GNU @code{m4}.
7288 It is conventional, but not required, to give a frozen file the suffix
7291 These are simple (editable) text files, made up of directives,
7292 each starting with a capital letter and ending with a newline
7293 (@key{NL}). Wherever a directive is expected, the character
7294 @samp{#} introduces a comment line; empty lines are also ignored if they
7295 are not part of an embedded string.
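
As a rough illustration of this layout, the following hand-written
sketch uses four of the directives described below. It was not produced
by @code{m4}, and the macro names @samp{greet} and @samp{mylen} are
hypothetical. Each pair of lengths counts the characters of the two
strings that are run together on the next line:

@example
# Hand-written illustration, not actual m4 output.
V1
Q1,1
[]
T5,12
greetHello, world
F5,3
mylenlen
@end example

Going by the directive descriptions, reloading such a file with
@option{-R} would select @samp{[} and @samp{]} as the quote strings,
define @samp{greet} as the text @samp{Hello, world}, and make
@samp{mylen} a name for the builtin @code{len}. No other builtins would
be available, since the sketch contains no other @samp{F} directives.
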
7296 In the following descriptions, each @var{len} refers to the length of
7297 the corresponding strings @var{str} in the next line of input. Numbers
7298 are always expressed in decimal. There are no escape characters. The
7302 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
7303 Uses @var{str1} and @var{str2} as the begin-comment and
7304 end-comment strings. If omitted, then @samp{#} and @key{NL} are the
7307 @item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
Selects diversion @var{number}, making it current, then copies
@var{str} into the current diversion. @var{number} may be a negative
7310 number for a non-existing diversion. To merely specify an active
7311 selection, use this command with an empty @var{str}. With 0 as the
7312 diversion @var{number}, @var{str} will be issued on standard output
7313 at reload time. GNU @code{m4} will not produce the @samp{D}
7314 directive with non-zero length for diversion 0, but this can be done
7315 with manual edits. This directive may
7316 appear more than once for the same diversion, in which case the
7317 diversion is the concatenation of the various uses. If omitted, then
7318 diversion 0 is current.
7320 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
7321 Defines, through @code{pushdef}, a definition for @var{str1}
7322 expanding to the function whose builtin name is @var{str2}. If the
7323 builtin does not exist (for example, if the frozen file was produced by
7324 a copy of @code{m4} compiled with changeword support, but the version
7325 of @code{m4} reloading was compiled without it), the reload is silent,
7326 but any subsequent use of the definition of @var{str1} will result in
7327 a warning. This directive may appear more than once for the same name,
7328 and its order, along with @samp{T}, is important. If omitted, you will
7329 have no access to any builtins.
7331 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
7332 Uses @var{str1} and @var{str2} as the begin-quote and end-quote
7333 strings. If omitted, then @samp{`} and @samp{'} are the quote
7336 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
Defines, through @code{pushdef}, a definition for @var{str1}
7338 expanding to the text given by @var{str2}. This directive may appear
7339 more than once for the same name, and its order, along with @samp{F}, is
7342 @item V @var{number} @key{NL}
7343 Confirms the format of the file. @code{m4} @value{VERSION} only creates
7344 and understands frozen files where @var{number} is 1. This directive
7345 must be the first non-comment in the file, and may not appear more than
7350 @chapter Compatibility with other versions of @code{m4}
7352 @cindex compatibility
This chapter describes many of the differences between this
implementation of @code{m4} and other implementations found under
7355 UNIX, such as System V Release 4, Solaris, and BSD flavors.
7356 In particular, it lists the known differences and extensions to
7357 POSIX. However, the list is not necessarily comprehensive.
7359 At the time of this writing, POSIX 2001 (also known as IEEE
7360 Std 1003.1-2001) is the latest standard, although a new version of
7361 POSIX is under development and includes several proposals for
7362 modifying what @code{m4} is required to do. The requirements for
7363 @code{m4} are shared between SUSv3 and POSIX, and
7365 @uref{http://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}.
7368 * Extensions:: Extensions in GNU M4
7369 * Incompatibilities:: Facilities in System V m4 not in GNU M4
7370 * Other Incompatibilities:: Other incompatibilities
7374 @section Extensions in GNU M4
7376 @cindex GNU extensions
7378 This version of @code{m4} contains a few facilities that do not exist
7379 in System V @code{m4}. These extra facilities are all suppressed by
7380 using the @option{-G} command line option (@pxref{Limits control, ,
7381 Invoking m4}), unless overridden by other command line options.
7385 In the @code{$@var{n}} notation for macro arguments, @var{n} can contain
7386 several digits, while the System V @code{m4} only accepts one digit.
7387 This allows macros in GNU @code{m4} to take any number of
7388 arguments, and not only nine (@pxref{Arguments}).
7390 This means that @code{define(`foo', `$11')} is ambiguous between
7391 implementations. To portably choose between grabbing the first
7392 parameter and appending 1 to the expansion, or grabbing the eleventh
7393 parameter, you can do the following:
7398 dnl First argument, concatenated with 1
7399 define(`_1', `$1')define(`first1', `_1($@@)1')
7401 dnl Eleventh argument, portable
7402 define(`_9', `$9')define(`eleventh', `_9(shift(shift($@@)))')
7404 dnl Eleventh argument, GNU style
7405 define(`Eleventh', `$11')
7407 first1(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
7409 eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
7411 Eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
7416 Also see the @code{argn} macro (@pxref{Shift}).
7419 The @code{divert} (@pxref{Divert}) macro can manage more than 9
7420 diversions. GNU @code{m4} treats all positive numbers as valid
7421 diversions, rather than discarding diversions greater than 9.
7424 Files included with @code{include} and @code{sinclude} are sought in a
7425 user specified search path, if they are not found in the working
7426 directory. The search path is specified by the @option{-I} option and the
7427 @env{M4PATH} environment variable (@pxref{Search Path}).
7430 Arguments to @code{undivert} can be non-numeric, in which case the named
7431 file will be included uninterpreted in the output (@pxref{Undivert}).
7434 Formatted output is supported through the @code{format} builtin, which
7435 is modeled after the C library function @code{printf} (@pxref{Format}).
7438 Searches and text substitution through basic regular expressions are
7439 supported by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
7440 (@pxref{Patsubst}) builtins. Some BSD implementations use
7441 extended regular expressions instead.
7444 The output of shell commands can be read into @code{m4} with
7445 @code{esyscmd} (@pxref{Esyscmd}).
7448 There is indirect access to any builtin macro with @code{builtin}
7452 Macros can be called indirectly through @code{indir} (@pxref{Indir}).
7455 The name of the program, the current input file, and the current input
7456 line number are accessible through the builtins @code{@w{__program__}},
7457 @code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
7460 The format of the output from @code{dumpdef} and macro tracing can be
7461 controlled with @code{debugmode} (@pxref{Debug Levels}).
7464 The destination of trace and debug output can be controlled with
7465 @code{debugfile} (@pxref{Debug Output}).
7468 The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
7469 creating a new file with a unique name on every invocation, rather than
7470 following the insecure behavior of replacing the trailing @samp{X}
7471 characters with the @code{m4} process id.
7474 POSIX only requires support for the command line options
7475 @option{-s}, @option{-D}, and @option{-U}, so all other options accepted
7476 by GNU M4 are extensions. @xref{Invoking m4}, for a
7477 description of these options.
7479 The debugging and tracing facilities in GNU @code{m4} are much
7480 more extensive than in most other versions of @code{m4}.
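
A few of the extensions listed above can be tried directly. The
following is only a sketch with arbitrary inputs, assuming the default
GNU mode and input read from standard input:

@example
format(`%05d', `42')
@result{}00042
regexp(`GNU m4', `m[0-9]')
@result{}4
patsubst(`hello world', `o', `0')
@result{}hell0 w0rld
indir(`len', `abc')
@result{}3
builtin(`len', `abcd')
@result{}4
__line__
@result{}6
@end example

With @option{-G}, the extension builtins are not defined, so the same
kind of input is copied through with only the quotes removed:

@example
$ @kbd{m4 -G}
format(`%05d', `42')
@result{}format(%05d, 42)
@end example
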
7483 @node Incompatibilities
7484 @section Facilities in System V @code{m4} not in GNU @code{m4}
7486 The version of @code{m4} from System V contains a few facilities that
7487 have not been implemented in GNU @code{m4} yet. Additionally,
7488 POSIX requires some behaviors that GNU @code{m4} has not
7489 implemented yet. Relying on these behaviors is non-portable, as a
7490 future release of GNU @code{m4} may change.
7494 POSIX requires support for multiple arguments to @code{defn},
7495 without any clarification on how @code{defn} behaves when one of the
7496 multiple arguments names a builtin. System V @code{m4} and some other
7497 implementations allow mixing builtins and text macros into a single
7498 macro. GNU @code{m4} only supports joining multiple text
7499 arguments, although a future implementation may lift this restriction to
7500 behave more like System V@. The only portable way to join text macros
7501 with builtins is via helper macros and implicit concatenation of macro
7505 POSIX requires an application to exit with non-zero status if
7506 it wrote an error message to stderr. This has not yet been consistently
7507 implemented for the various builtins that are required to issue an error
7508 (such as @code{eval} (@pxref{Eval}) when an argument cannot be parsed).
7511 Some traditional implementations only allow reading standard input
7512 once, but GNU @code{m4} correctly handles multiple instances
7513 of @samp{-} on the command line.
7516 POSIX requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
7517 (first-in, first-out) order, but GNU @code{m4} currently uses
7518 LIFO order. Furthermore, POSIX states that only the first
7519 argument to @code{m4wrap} is saved for later evaluation, but
7520 GNU @code{m4} saves and processes all arguments, with output
7521 separated by spaces.
7524 POSIX states that builtins that require arguments, but are
7525 called without arguments, have undefined behavior. Traditional
7526 implementations simply behave as though empty strings had been passed.
7527 For example, @code{a`'define`'b} would expand to @code{ab}. But
7528 GNU @code{m4} ignores certain builtins if they have missing
7529 arguments, giving @code{adefineb} for the above example.
7532 Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
by undefining the entire stack of previous definitions, as if doing
@code{undefine(`f')} first. GNU @code{m4} replaces just the top
7535 definition on the stack, as if doing @code{popdef(`f')} followed by
7536 @code{pushdef(`f',`1')}. POSIX allows either behavior.
7539 POSIX 2001 requires @code{syscmd} (@pxref{Syscmd}) to evaluate
7540 command output for macro expansion, but this was a mistake that is
7541 anticipated to be corrected in the next version of POSIX.
7542 GNU @code{m4} follows traditional behavior in @code{syscmd}
7543 where output is not rescanned, and provides the extension @code{esyscmd}
7544 that does scan the output.
7547 At one point, POSIX required @code{changequote(@var{arg})}
7548 (@pxref{Changequote}) to use newline as the close quote, but this was a
7549 bug, and the next version of POSIX is anticipated to state
7550 that using empty strings or just one argument is unspecified.
7551 Meanwhile, the GNU @code{m4} behavior of treating an empty
7552 end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
7553 repeating the start-quote delimiter, and BSD treats it as leaving the
7554 previous end-quote delimiter unchanged. For predictable results, never
call @code{changequote} with just one argument, or with empty strings for
7559 At one point, POSIX required @code{changecom(@var{arg},)}
7560 (@pxref{Changecom}) to make it impossible to end a comment, but this is
7561 a bug, and the next version of POSIX is anticipated to state
7562 that using empty strings is unspecified. Meanwhile, the GNU
7563 @code{m4} behavior of treating an empty end-comment delimiter as newline
7564 is not portable, as BSD treats it as leaving the previous end-comment
7565 delimiter unchanged. It is also impossible in BSD implementations to
7566 disable comments, even though that is required by POSIX. For
predictable results, never call @code{changecom} with empty strings for
7571 Most implementations of @code{m4} give macros a higher precedence than
7572 comments when parsing, meaning that if the start delimiter given to
7573 @code{changecom} (@pxref{Changecom}) starts with a macro name, comments
are effectively disabled. POSIX does not specify what the
precedence is, so the parser in this version of GNU @code{m4}
recognizes comments, then macros, then quoted strings.
7579 Traditional implementations allow argument collection, but not string
7580 and comment processing, to span file boundaries. Thus, if @file{a.m4}
7581 contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
7582 @kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
7583 gives an error message that the end of file was encountered inside a
7584 macro with GNU @code{m4}. On the other hand, traditional
7585 implementations do end of file processing for files included with
7586 @code{include} or @code{sinclude} (@pxref{Include}), while GNU
7587 @code{m4} seamlessly integrates the content of those files. Thus
7588 @code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of
7592 Traditional @code{m4} treats @code{traceon} (@pxref{Trace}) without
7593 arguments as a global variable, independent of named macro tracing.
7594 Also, once a macro is undefined, named tracing of that macro is lost.
7595 On the other hand, when GNU @code{m4} encounters
7596 @code{traceon} without
7597 arguments, it turns tracing on for all existing definitions at the time,
7598 but does not trace future definitions; @code{traceoff} without arguments
7599 turns tracing off for all definitions regardless of whether they were
7600 also traced by name; and tracing by name, such as with @option{-tfoo} at
7601 the command line or @code{traceon(`foo')} in the input, is an attribute
7602 that is preserved even if the macro is currently undefined.
7604 Additionally, while POSIX requires trace output, it makes no
7605 demands on the formatting of that output. Parsing trace output is not
7606 guaranteed to be reliable, even between different releases of
7607 GNU M4; however, the intent is that any future changes in
7608 trace output will only occur under the direction of additional
7609 @code{debugmode} flags (@pxref{Debug Levels}).
7612 POSIX requires @code{eval} (@pxref{Eval}) to treat all
7613 operators with the same precedence as C@. However, earlier versions of
7614 GNU @code{m4} followed the traditional behavior of other
7615 @code{m4} implementations, where bitwise and logical negation (@samp{~}
7616 and @samp{!}) have lower precedence than equality operators; and where
equality operators (@samp{==} and @samp{!=}) have the same precedence as
7618 relational operators (such as @samp{<}). Use explicit parentheses to
7619 ensure proper precedence. As extensions to POSIX,
7620 GNU @code{m4} gives well-defined semantics to operations that
7621 C leaves undefined, such as when overflow occurs, when shifting negative
7622 numbers, or when performing division by zero. POSIX also
7623 requires @samp{=} to cause an error, but many traditional
7624 implementations allowed it as an alias for @samp{==}.
7627 POSIX 2001 requires @code{translit} (@pxref{Translit}) to
7628 treat each character of the second and third arguments literally.
7629 However, it is anticipated that the next version of POSIX will
7630 allow the GNU @code{m4} behavior of treating @samp{-} as a
7634 POSIX requires @code{m4} to honor the locale environment
7635 variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
7636 @env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
7637 implemented in GNU @code{m4}.
7640 POSIX states that only unquoted leading newlines and blanks
7641 (that is, space and tab) are ignored when collecting macro arguments.
7642 However, this appears to be a bug in POSIX, since most
7643 traditional implementations also ignore all whitespace (formfeed,
7644 carriage return, and vertical tab). GNU @code{m4} follows
7645 tradition and ignores all leading unquoted whitespace.
7648 @cindex @env{POSIXLY_CORRECT}
7649 A strictly-compliant POSIX client is not allowed to use
7650 command-line arguments not specified by POSIX. However, since
7651 this version of M4 ignores @env{POSIXLY_CORRECT} and enables the option
7652 @code{--gnu} by default (@pxref{Limits control, , Invoking m4}), a
7653 client desiring to be strictly compliant has no way to disable
7654 GNU extensions that conflict with POSIX when
7655 directly invoking the compiled @code{m4}. A future version of
GNU M4 will honor the environment variable @env{POSIXLY_CORRECT},
7657 implicitly enabling @option{--traditional} if it is set, in order to
7658 allow a strictly-compliant client. In the meantime, a client needing
7659 strict POSIX compliance can use the workaround of invoking a
7660 shell script wrapper, where the wrapper then adds @option{--traditional}
7661 to the arguments passed to the compiled @code{m4}.
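
Some of the GNU behaviors described in the list above can be observed
with short inputs. The following is a sketch of what the text above
describes for GNU @code{m4} 1.4.x. The macro name @samp{f} is only a
placeholder, and other implementations are expected to differ as noted
in each item:

@example
dnl define replaces only the top of the pushdef stack
define(`f', `base')pushdef(`f', `temp')define(`f', `top')dnl
f popdef(`f')f
@result{}top base
dnl define without arguments is not recognized as a macro
a`'define`'b
@result{}adefineb
dnl relational operators bind tighter than equality, as in C
eval(`1 == 2 > 0')
@result{}1
eval(`(1 == 2) > 0')
@result{}0
dnl a dash in translit denotes a character range
translit(`hello', `a-z', `A-Z')
@result{}HELLO
@end example

The LIFO order of @code{m4wrap} shows up only at end of input:

@example
m4wrap(`first
')m4wrap(`second
')dnl
^D
@result{}second
@result{}first
@end example
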
7664 @node Other Incompatibilities
7665 @section Other incompatibilities
7667 There are a few other incompatibilities between this implementation of
7668 @code{m4}, and the System V version.
7672 GNU @code{m4} implements sync lines differently from System V
7673 @code{m4}, when text is being diverted. GNU @code{m4} outputs
7674 the sync lines when the text is being diverted, and System V @code{m4}
7675 when the diverted text is being brought back.
7677 The problem is which lines and file names should be attached to text
7678 that is being, or has been, diverted. System V @code{m4} regards all
7679 the diverted text as being generated by the source line containing the
7680 @code{undivert} call, whereas GNU @code{m4} regards the
7681 diverted text as being generated at the time it is diverted.
7683 The sync line option is used mostly when using @code{m4} as
7684 a front end to a compiler. If a diverted line causes a compiler error,
7685 the error messages should most probably refer to the place where the
7686 diversion was made, and not where it was inserted again.
7688 @comment options: -s
7693 @result{}#line 3 "stdin"
7696 @result{}#line 2 "stdin"
7698 @result{}#line 1 "stdin"
7702 The current @code{m4} implementation has a limitation that the syncline
7703 output at the start of each diversion occurs no matter what, even if the
7704 previous diversion did not end with a newline. This goes contrary to
7705 the claim that synclines appear on a line by themselves, so this
7706 limitation may be corrected in a future version of @code{m4}. In the
7707 meantime, when using @option{-s}, it is wisest to make sure all
7708 diversions end with newline.
7711 GNU @code{m4} makes no attempt at prohibiting self-referential
7722 There is nothing inherently wrong with defining @samp{x} to
7723 return @samp{x}. The wrong thing is to expand @samp{x} unquoted,
7724 because that would cause an infinite rescan loop.
7725 In @code{m4}, one might use macros to hold strings, as we do for
7726 variables in other programming languages, further checking them with:
7730 ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
In cases like this one, prohibiting a macro from holding its own name
would be a useless limitation. Of course, this leaves more rope for the
7736 GNU @code{m4} user to hang himself! Rescanning hangs may be
7737 avoided through careful programming, a little like for endless loops in
7738 traditional programming languages.
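
As a minimal sketch of the safe idiom, the hypothetical macro
@samp{state} below holds its own name, and is only ever inspected
through @code{defn}, so it is never rescanned:

@example
define(`state', `state')dnl
ifelse(defn(`state'), `state', `unchanged', `modified')
@result{}unchanged
@end example

Writing a bare @samp{state} anywhere in the input, including in the text
of either @code{ifelse} branch, would still trigger the infinite rescan.
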
7742 @chapter Correct version of some examples
Some of the examples in this manual are buggy or not very robust, for
7745 demonstration purposes. Improved versions of these composite macros are
7749 * Improved exch:: Solution for @code{exch}
7750 * Improved forloop:: Solution for @code{forloop}
7751 * Improved foreach:: Solution for @code{foreach}
7752 * Improved copy:: Solution for @code{copy}
7753 * Improved m4wrap:: Solution for @code{m4wrap}
7754 * Improved cleardivert:: Solution for @code{cleardivert}
7755 * Improved capitalize:: Solution for @code{capitalize}
7756 * Improved fatal_error:: Solution for @code{fatal_error}
7760 @section Solution for @code{exch}
7762 The @code{exch} macro (@pxref{Arguments}) as presented requires clients
7763 to double quote their arguments. A nicer definition, which lets
7764 clients follow the rule of thumb of one level of quoting per level of
7765 parentheses, involves adding quotes in the definition of @code{exch}, as
7769 define(`exch', ``$2', `$1'')
7771 define(exch(`expansion text', `macro'))
7774 @result{}expansion text
7777 @node Improved forloop
7778 @section Solution for @code{forloop}
7780 The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
7781 into an infinite loop if given an iterator that is not parsed as a macro
7782 name. It does not do any sanity checking on its numeric bounds, and
7783 only permits decimal numbers for bounds. Here is an improved version,
7784 shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
7785 version also optimizes overhead by calling four macros instead of six
7786 per iteration (excluding those in @var{text}), by not dereferencing the
7787 @var{iterator} in the helper @code{@w{_forloop}}.
7791 $ @kbd{m4 -d -I examples}
7792 undivert(`forloop2.m4')dnl
7793 @result{}divert(`-1')
7794 @result{}# forloop(var, from, to, stmt) - improved version:
7795 @result{}# works even if VAR is not a strict macro name
7796 @result{}# performs sanity check that FROM is larger than TO
7797 @result{}# allows complex numerical expressions in TO and FROM
7798 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
7799 @result{} `pushdef(`$1')_$0(`$1', eval(`$2'),
7800 @result{} eval(`$3'), `$4')popdef(`$1')')')
7801 @result{}define(`_forloop',
7802 @result{} `define(`$1', `$2')$4`'ifelse(`$2', `$3', `',
7803 @result{} `$0(`$1', incr(`$2'), `$3', `$4')')')
7804 @result{}divert`'dnl
7805 include(`forloop2.m4')
7807 forloop(`i', `2', `1', `no iteration occurs')
7809 forloop(`', `1', `2', ` odd iterator name')
7810 @result{} odd iterator name odd iterator name
7811 forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
7812 @result{} 0xa 0xb 0xc
7813 forloop(`i', `a', `b', `non-numeric bounds')
7814 @error{}m4:stdin:6: bad expression in eval (bad input): (a) <= (b)
One other change to notice is that the improved version uses @samp{_$0}
rather than @samp{_forloop} to invoke the helper routine. In general,
this is a good practice to follow, because then the set of macros can be
uniformly transformed. The following example shows a transformation
that doubles the current quoting and appends a suffix @samp{2} to each
transformed macro. If @code{forloop} refers to the literal
@samp{_forloop}, then @code{forloop2} invokes @code{_forloop} instead of
the intended @code{_forloop2}, and the mixing of quoting paradigms leads
7826 to an infinite recursion loop in this example.
7828 @comment options: -L9
7832 $ @kbd{m4 -d -L 9 -I examples}
7833 define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
7835 define(`double', `define(`$1'`2',
7836 arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
7838 double(`forloop')double(`_forloop')defn(`forloop2')
7839 @result{}ifelse(eval(``($2) <= ($3)''), ``1'',
7840 @result{} ``pushdef(``$1'')_$0(``$1'', eval(``$2''),
7841 @result{} eval(``$3''), ``$4'')popdef(``$1'')'')
7842 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
7844 changequote(`[', `]')changequote([``], [''])
7846 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
7848 changequote`'include(`forloop.m4')
7850 double(`forloop')double(`_forloop')defn(`forloop2')
7851 @result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'')
7852 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
7854 changequote(`[', `]')changequote([``], [''])
7856 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
7857 @error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
7860 One more optimization is still possible. Instead of repeatedly
7861 assigning a variable then invoking or dereferencing it, it is possible
7862 to pass the current iterator value as a single argument. Coupled with
7863 @code{curry} if other arguments are needed (@pxref{Composition}), or
7864 with helper macros if the argument is needed in more than one place in
7865 the expansion, the output can be generated with three, rather than four,
7866 macros of overhead per iteration. Notice how the file
7867 @file{m4-@value{VERSION}/@/examples/@/forloop3.m4} rearranges the
7868 arguments of the helper @code{_forloop} to take two arguments that are
7869 placed around the current value. By splitting a balanced set of
parentheses across multiple arguments, the helper macro can now be
7871 shared by @code{forloop} and the new @code{forloop_arg}.
7875 $ @kbd{m4 -I examples}
7876 include(`forloop3.m4')
7878 undivert(`forloop3.m4')dnl
7879 @result{}divert(`-1')
7880 @result{}# forloop_arg(from, to, macro) - invoke MACRO(value) for
7881 @result{}# each value between FROM and TO, without define overhead
7882 @result{}define(`forloop_arg', `ifelse(eval(`($1) <= ($2)'), `1',
7883 @result{} `_forloop(`$1', eval(`$2'), `$3(', `)')')')
7884 @result{}# forloop(var, from, to, stmt) - refactored to share code
7885 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
7886 @result{} `pushdef(`$1')_forloop(eval(`$2'), eval(`$3'),
7887 @result{} `define(`$1',', `)$4')popdef(`$1')')')
7888 @result{}define(`_forloop',
7889 @result{} `$3`$1'$4`'ifelse(`$1', `$2', `',
7890 @result{} `$0(incr(`$1'), `$2', `$3', `$4')')')
7891 @result{}divert`'dnl
7892 forloop(`i', `1', `3', ` i')
7894 define(`echo', `$@@')
7896 forloop_arg(`1', `3', ` echo')
7900 forloop_arg(`1', `3', `curry(`pushdef', `a')')
7912 Of course, it is possible to make even more improvements, such as
7913 adding an optional step argument, or allowing iteration through
7914 descending sequences. GNU Autoconf provides some of these
7915 additional bells and whistles in its @code{m4_for} macro.
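
As one possible sketch of the step-argument idea (not the Autoconf
implementation), the hypothetical macros @code{forstep} and
@code{@w{_forstep}} below extend the structure of @file{forloop2.m4}
with a fourth argument giving the step. The sketch assumes ascending
sequences only, and produces no iterations unless the step is positive:

@example
define(`forstep', `ifelse(eval(`($4) > 0 && ($2) <= ($3)'), `1',
  `pushdef(`$1')_$0(`$1', eval(`$2'), eval(`$3'), eval(`$4'),
    `$5')popdef(`$1')')')dnl
define(`_forstep',
  `define(`$1', `$2')$5`'ifelse(eval(`$2 + $4 <= $3'), `1',
    `$0(`$1', eval(`$2 + $4'), `$3', `$4', `$5')')')dnl
forstep(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
forstep(`i', `1', `10', `0', ` i')
@result{}
@end example

The helper tests whether the next value is still within bounds, rather
than testing for equality with the upper bound, because a step larger
than one can skip over that bound.
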
7917 @node Improved foreach
7918 @section Solution for @code{foreach}
7920 The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
7921 presented earlier each have flaws. First, we will examine and fix the
7922 quadratic behavior of @code{foreachq}:
7926 $ @kbd{m4 -I examples}
7927 include(`foreachq.m4')
7929 traceon(`shift')debugmode(`aq')
7931 foreachq(`x', ``1', `2', `3', `4'', `x
7934 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7935 @error{}m4trace: -2- shift(`1', `2', `3', `4')
7937 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7938 @error{}m4trace: -3- shift(`2', `3', `4')
7939 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7940 @error{}m4trace: -2- shift(`2', `3', `4')
7942 @error{}m4trace: -5- shift(`1', `2', `3', `4')
7943 @error{}m4trace: -4- shift(`2', `3', `4')
7944 @error{}m4trace: -3- shift(`3', `4')
7945 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7946 @error{}m4trace: -3- shift(`2', `3', `4')
7947 @error{}m4trace: -2- shift(`3', `4')
7949 @error{}m4trace: -6- shift(`1', `2', `3', `4')
7950 @error{}m4trace: -5- shift(`2', `3', `4')
7951 @error{}m4trace: -4- shift(`3', `4')
7952 @error{}m4trace: -3- shift(`4')
7955 @cindex quadratic behavior, avoiding
7956 @cindex avoiding quadratic behavior
7957 Each successive iteration was adding more quoted @code{shift}
7958 invocations, and the entire list contents were passing through every
7959 iteration. In general, when recursing, it is a good idea to make the
7960 recursion use fewer arguments, rather than adding additional quoted
7961 uses of @code{shift}. By doing so, @code{m4} uses less memory, invokes
7962 fewer macros, is less likely to run into machine limits, and most
7963 importantly, performs faster. The fixed version of @code{foreachq} can
7964 be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
7968 $ @kbd{m4 -I examples}
7969 include(`foreachq2.m4')
7971 undivert(`foreachq2.m4')dnl
7972 @result{}include(`quote.m4')dnl
7973 @result{}divert(`-1')
7974 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
7975 @result{}# quoted list, improved version
7976 @result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')')
7977 @result{}define(`_arg1q', ``$1'')
7978 @result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
7979 @result{}define(`_foreachq', `ifelse(`$2', `', `',
7980 @result{} `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
7981 @result{}divert`'dnl
7982 traceon(`shift')debugmode(`aq')
7984 foreachq(`x', ``1', `2', `3', `4'', `x
7987 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7989 @error{}m4trace: -3- shift(`2', `3', `4')
7991 @error{}m4trace: -3- shift(`3', `4')
7995 Note that the fixed version calls unquoted helper macros in
7996 @code{@w{_foreachq}} to trim elements immediately; those helper macros
7997 in turn must re-supply the layer of quotes lost in the macro invocation.
7998 Contrast the use of @code{@w{_arg1q}}, which quotes the first list
7999 element, with @code{@w{_arg1}} of the earlier implementation that
8000 returned the first list element directly. Additionally, by calling the
8001 helper method immediately, the @samp{defn(`@var{iterator}')} no longer
8002 contains unexpanded macros.
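
The difference between the two helpers can be seen in isolation. In
this sketch, the definitions are repeated so that it stands alone, and
@samp{active} is a hypothetical macro used only to make rescanning
visible:

@example
define(`_arg1', `$1')define(`_arg1q', ``$1'')dnl
define(`active', `ACT')dnl
_arg1(`active', `b')
@result{}ACT
_arg1q(`active', `b')
@result{}active
define(`dquote', ``$@@'')dnl
define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')dnl
_rest(`a', `b', `c')
@result{}`b',`c'
@end example

The result of @code{@w{_rest}} keeps one layer of quotes around each
remaining element, which is what allows it to be passed on as a single
quoted list.
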
8004 The astute m4 programmer might notice that the solution above still uses
8005 more memory and macro invocations, and thus more time, than strictly
8006 necessary. Note that @samp{$2}, which contains an arbitrarily long
8007 quoted list, is expanded and rescanned three times per iteration of
8008 @code{_foreachq}. Furthermore, every iteration of the algorithm
8009 effectively unboxes then reboxes the list, which costs a couple of macro
8010 invocations. It is possible to rewrite the algorithm for a bit more
8011 speed by swapping the order of the arguments to @code{_foreachq} in
8012 order to operate on an unboxed list in the first place, and by using the
8013 fixed-length @samp{$#} instead of an arbitrary length list as the key to
8014 end recursion. The result is an overhead of six macro invocations per
8015 loop (excluding any macros in @var{text}), instead of eight. This
8016 alternative approach is available as
@file{m4-@value{VERSION}/@/examples/@/foreachq3.m4}:
8021 $ @kbd{m4 -I examples}
8022 include(`foreachq3.m4')
8024 undivert(`foreachq3.m4')dnl
8025 @result{}divert(`-1')
8026 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
8027 @result{}# quoted list, alternate improved version
8028 @result{}define(`foreachq', `ifelse(`$2', `', `',
8029 @result{} `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
8030 @result{}define(`_foreachq', `ifelse(`$#', `3', `',
8031 @result{} `define(`$1', `$4')$2`'$0(`$1', `$2',
8032 @result{} shift(shift(shift($@@))))')')
8033 @result{}divert`'dnl
8034 traceon(`shift')debugmode(`aq')
8036 foreachq(`x', ``1', `2', `3', `4'', `x
8039 @error{}m4trace: -4- shift(`x', `x
8040 @error{}', `', `1', `2', `3', `4')
8041 @error{}m4trace: -3- shift(`x
8042 @error{}', `', `1', `2', `3', `4')
8043 @error{}m4trace: -2- shift(`', `1', `2', `3', `4')
8045 @error{}m4trace: -4- shift(`x', `x
8046 @error{}', `1', `2', `3', `4')
8047 @error{}m4trace: -3- shift(`x
8048 @error{}', `1', `2', `3', `4')
8049 @error{}m4trace: -2- shift(`1', `2', `3', `4')
8051 @error{}m4trace: -4- shift(`x', `x
8052 @error{}', `2', `3', `4')
8053 @error{}m4trace: -3- shift(`x
8054 @error{}', `2', `3', `4')
8055 @error{}m4trace: -2- shift(`2', `3', `4')
8057 @error{}m4trace: -4- shift(`x', `x
8058 @error{}', `3', `4')
8059 @error{}m4trace: -3- shift(`x
8060 @error{}', `3', `4')
8061 @error{}m4trace: -2- shift(`3', `4')
8064 In the current version of M4, every instance of @samp{$@@} is rescanned
8065 as it is encountered. Thus, the @file{foreachq3.m4} alternative uses
8066 much less memory than @file{foreachq2.m4}, and executes as much as 10%
8067 faster, since each iteration encounters fewer @samp{$@@}. However, the
8068 implementation of rescanning every byte in @samp{$@@} is quadratic in
8069 the number of bytes scanned (for example, making the broken version in
8070 @file{foreachq.m4} cubic, rather than quadratic, in behavior). A future
8071 release of M4 will improve the underlying implementation by reusing
8072 results of previous scans, so that both styles of @code{foreachq} can
8073 become linear in the number of bytes scanned. Notice how the
8074 implementation injects an empty argument prior to expanding @samp{$2}
8075 within @code{foreachq}; the helper macro @code{_foreachq} then ignores
8076 the third argument altogether, and ends recursion when there are three
8077 arguments left because there was nothing left to pass through
8078 @code{shift}. Thus, each iteration only needs one @code{ifelse}, rather
8079 than the two conditionals used in the version from @file{foreachq2.m4}.
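
The idea of ending recursion on the fixed-length @samp{$#}, rather than
on the contents of the list, can be seen in isolation in the following
minimal sketch.  The macro name @code{show_all} is hypothetical and is
not shipped in the examples directory; it assumes at least one argument,
wraps each argument in angle brackets, and stops when a single argument
remains.

@example
define(`show_all', `ifelse(`$#', `1', `<$1>',
  `<$1>$0(shift($@@))')')dnl
show_all(`a', `b', `c')
@result{}<a><b><c>
@end example

Because @samp{$#} expands to a short decimal number no matter how long
the remaining list is, the comparison made by @code{ifelse} stays cheap
on every iteration.
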
8081 @cindex nine arguments, more than
8082 @cindex more than nine arguments
8083 @cindex arguments, more than nine
8084 So far, all of the implementations of @code{foreachq} presented have
8085 been quadratic with M4 1.4.x. But @code{forloop} is linear, because
8086 each iteration parses a constant number of arguments.  So, it is
8087 possible to design a variant that uses @code{forloop} to do the
8088 iteration, then uses @samp{$@@} only once at the end, giving a linear
8089 result even with older M4 implementations. This implementation relies
8090 on the GNU extension that @samp{$10} expands to the tenth
8091 argument rather than the first argument concatenated with @samp{0}. The
8092 trick is to define an intermediate macro that repeats the text
8093 @code{define(`$1', `$@var{n}')$2`'}, with @var{n} set to successive
8094 integers corresponding to each argument. The helper macro
8095 @code{_foreachq_} is needed in order to generate the literal sequences
8096 such as @samp{$1} into the intermediate macro, rather than expanding
8097 them as the arguments of @code{_foreachq}. With this approach, no
8098 @code{shift} calls are even needed! Even though there are seven macros
8099 of overhead per iteration instead of six in @file{foreachq3.m4}, the
8100 linear scaling is apparent at relatively small list sizes. However,
8101 this approach will need adjustment when a future version of M4 follows
8102 POSIX by no longer treating @samp{$10} as the tenth argument;
8103 the anticipation is that @samp{$@{10@}} can be used instead, although
8104 that alternative syntax is not yet supported.
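
As a rough illustration of what the @code{forloop} pass builds, the
sketch below first checks that @samp{$10} is treated as the tenth
argument (the GNU extension this approach relies on), then writes out by
hand the kind of intermediate macro that would be generated for a
two-element list.  The names @code{tenth} and @code{unrolled2} are
hypothetical, chosen only for this example.

@example
define(`tenth', `$10')dnl
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
define(`unrolled2',
  `define(`$1', `$3')$2`'define(`$1', `$4')$2')dnl
pushdef(`x')unrolled2(`x', `<x>', `a', `b')popdef(`x')
@result{}<a><b>
@end example

The hand-written body refers to each list element by a fixed position,
so no @code{shift} is needed to step through the list.
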
8108 $ @kbd{m4 -I examples}
8109 include(`foreachq4.m4')
8111 undivert(`foreachq4.m4')dnl
8112 @result{}include(`forloop2.m4')dnl
8113 @result{}divert(`-1')
8114 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
8115 @result{}# quoted list, version based on forloop
8116 @result{}define(`foreachq',
8117 @result{}`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
8118 @result{}define(`_foreachq',
8119 @result{}`pushdef(`$1', forloop(`$1', `3', `$#',
8120 @result{} `$0_(`1', `2', indir(`$1'))')`popdef(
8121 @result{} `$1')')indir(`$1', $@@)')
8122 @result{}define(`_foreachq_',
8123 @result{}``define(`$$1', `$$3')$$2`''')
8124 @result{}divert`'dnl
8125 traceon(`shift')debugmode(`aq')
8127 foreachq(`x', ``1', `2', `3', `4'', `x
8135 For yet another approach, the improved version of @code{foreach},
8136 available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
8137 overquotes the arguments to @code{@w{_foreach}} to begin with, using
8138 @code{dquote_elt}. Then @code{@w{_foreach}} can just use
8139 @code{@w{_arg1}} to remove the extra layer of quoting that was added up front:
8144 $ @kbd{m4 -I examples}
8145 include(`foreach2.m4')
8147 undivert(`foreach2.m4')dnl
8148 @result{}include(`quote.m4')dnl
8149 @result{}divert(`-1')
8150 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
8151 @result{}# parenthesized list, improved version
8152 @result{}define(`foreach', `pushdef(`$1')_$0(`$1',
8153 @result{} (dquote(dquote_elt$2)), `$3')popdef(`$1')')
8154 @result{}define(`_arg1', `$1')
8155 @result{}define(`_foreach', `ifelse(`$2', `(`')', `',
8156 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
8157 @result{}divert`'dnl
8158 traceon(`shift')debugmode(`aq')
8160 foreach(`x', `(`1', `2', `3', `4')', `x
8162 @error{}m4trace: -4- shift(`1', `2', `3', `4')
8163 @error{}m4trace: -4- shift(`2', `3', `4')
8164 @error{}m4trace: -4- shift(`3', `4')
8166 @error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
8168 @error{}m4trace: -3- shift(``2'', ``3'', ``4'')
8170 @error{}m4trace: -3- shift(``3'', ``4'')
8172 @error{}m4trace: -3- shift(``4'')
8175 It is likewise possible to write a variant of @code{foreach} that
8176 performs in linear time on M4 1.4.x; the easiest method is probably
8177 writing a version of @code{foreach} that unboxes its list, then invokes
8178 @code{_foreachq} as previously defined in @file{foreachq4.m4}.
8180 In summary, recursion over list elements is trickier than it appeared at
8181 first glance, but provides a powerful idiom within @code{m4} processing.
8182 As a final demonstration, both list styles are now able to handle
8183 several scenarios that would wreak havoc on one or both of the original
8184 implementations. This points out one other difference between the
8185 list styles. @code{foreach} evaluates unquoted list elements only once,
8186 in preparation for calling @code{@w{_foreach}}; similarly for
8187 @code{foreachq} as provided by @file{foreachq3.m4} or
8188 @file{foreachq4.m4}. But
8189 @code{foreachq}, as provided by @file{foreachq2.m4},
8190 evaluates unquoted list elements twice while visiting the first list
8191 element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}. When
8192 deciding which list style to use, one must take into account whether
8193 repeating the side effects of unquoted list elements will have any
8194 detrimental effects.
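
The practical consequence of evaluating an unquoted element more than
once can be sketched with a macro that has a visible side effect.  The
macros @code{count}, @code{tick}, @code{first}, @code{use_once} and
@code{use_twice} below are hypothetical stand-ins and are not part of
the shipped examples; @code{use_twice} mimics a helper that unboxes its
list argument in two places.

@example
define(`count', `0')dnl
define(`tick', `define(`count', incr(count))t')dnl
define(`first', `$1')dnl
define(`use_once', `first($1)')dnl
define(`use_twice', `first($1)first($1)')dnl
use_once(`tick') count
@result{}t 1
use_twice(`tick') count
@result{}tt 3
@end example

The second call bumps the counter twice, even though the list it was
given names @code{tick} only once.
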
8198 $ @kbd{m4 -I examples}
8199 include(`foreach2.m4')
8201 include(`foreachq2.m4')
8204 foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
8206 dnl 1-element list of empty element
8207 foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
8209 dnl 2-element list of empty elements
8210 foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
8211 @result{}<><> / <><>
8212 dnl 1-element list of a comma
8213 foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
8215 dnl 2-element list of unbalanced parentheses
8216 foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
8217 @result{}<(><)> / <(><)>
8218 define(`ab', `oops')dnl using defn(`iterator')
8219 foreach(`x', `(`a', `b')', `defn(`x')') /dnl
8220 foreachq(`x', ``a', `b'', `defn(`x')')
8222 define(`active', `ACT, IVE')
8226 dnl list of unquoted macros; expansion occurs before recursion
8227 foreach(`x', `(active, active)', `<x>
8229 @error{}m4trace: -4- active -> `ACT, IVE'
8230 @error{}m4trace: -4- active -> `ACT, IVE'
8235 foreachq(`x', `active, active', `<x>
8237 @error{}m4trace: -3- active -> `ACT, IVE'
8238 @error{}m4trace: -3- active -> `ACT, IVE'
8240 @error{}m4trace: -3- active -> `ACT, IVE'
8241 @error{}m4trace: -3- active -> `ACT, IVE'
8245 dnl list of quoted macros; expansion occurs during recursion
8246 foreach(`x', `(`active', `active')', `<x>
8248 @error{}m4trace: -1- active -> `ACT, IVE'
8250 @error{}m4trace: -1- active -> `ACT, IVE'
8252 foreachq(`x', ``active', `active'', `<x>
8254 @error{}m4trace: -1- active -> `ACT, IVE'
8256 @error{}m4trace: -1- active -> `ACT, IVE'
8258 dnl list of double-quoted macro names; no expansion
8259 foreach(`x', `(``active'', ``active'')', `<x>
8263 foreachq(`x', ```active'', ``active''', `<x>
8270 @comment Not worth putting in the manual, but make sure that foreach
8271 @comment implementations behave, and that final implementation is
8274 @comment boxed recursion
8277 @comment options: -Dlimit=10 -Dverbose
8279 $ @kbd{m4 -I examples -Dlimit=10 -Dverbose}
8280 include(`loop.m4')dnl
8281 @result{} 1 2 3 4 5 6 7 8 9 10
8284 @comment unboxed recursion
8287 @comment options: -Dlimit=10 -Dverbose -Dalt
8289 $ @kbd{m4 -I examples -Dlimit=10 -Dverbose -Dalt}
8290 include(`loop.m4')dnl
8291 @result{} 1 2 3 4 5 6 7 8 9 10
8294 @comment foreach via forloop recursion
8297 @comment options: -Dlimit=10 -Dverbose -Dalt=4
8299 $ @kbd{m4 -I examples -Dlimit=10 -Dverbose -Dalt=4}
8300 include(`loop.m4')dnl
8301 @result{} 1 2 3 4 5 6 7 8 9 10
8305 @comment options: -Dlimit=2500 -Dalt=4
8307 $ @kbd{m4 -I examples -Dlimit=2500 -Dalt=4}
8308 include(`loop.m4')dnl
8312 @comment options: -Dlimit=10000 -Dalt=4
8314 $ @kbd{m4 -I examples -Dlimit=10000 -Dalt=4}
8315 define(`foo', `divert`'len(popdef(`_foreachq')_foreachq($@@))')dnl
8316 define(`debug', `pushdef(`_foreachq', defn(`foo'))')
8318 include(`loop.m4')dnl
8325 @section Solution for @code{copy}
8327 The macro @code{copy} presented above
8328 is unable to handle builtin tokens with M4 1.4.x, because it tries to
8329 pass the builtin token through the macro @code{curry}, where it is
8330 silently flattened to an empty string (@pxref{Composition}). Rather
8331 than using the problematic @code{curry} to work around the limitation
8332 that @code{stack_foreach} expects to invoke a macro that takes exactly
8333 one argument, we can write a new macro that lets us form the exact
8334 two-argument @code{pushdef} call sequence needed, so that we are no
8335 longer passing a builtin token through a text macro.
8337 @deffn Composite stack_foreach_sep (@var{macro}, @var{pre}, @var{post}, @var{sep})
8339 @deffnx Composite stack_foreach_sep_lifo (@var{macro}, @var{pre}, @
8340 @var{post}, @var{sep})
8341 For each of the @code{pushdef} definitions associated with @var{macro},
8342 expand the sequence @samp{@var{pre}`'definition`'@var{post}}.
8343 Additionally, expand @var{sep} between definitions.
8344 @code{stack_foreach_sep} visits the oldest definition first, while
8345 @code{stack_foreach_sep_lifo} visits the current definition first. The
8346 expansion may dereference @var{macro}, but should not modify it. There
8347 are a few special macros, such as @code{defn}, which cannot be used as
8348 the @var{macro} parameter.
8351 Note that @code{stack_foreach(`@var{macro}', `@var{action}')} is
8352 equivalent to @code{stack_foreach_sep(`@var{macro}', `@var{action}(',
8353 `)')}. By supplying explicit parentheses, split among the @var{pre} and
8354 @var{post} arguments to @code{stack_foreach_sep}, it is now possible to
8355 construct macro calls with more than one argument, without passing
8356 builtin tokens through a macro call. It is likewise possible to
8357 directly reference the stack definitions without a macro call, by
8358 leaving @var{pre} and @var{post} empty. Thus, in addition to fixing
8359 @code{copy} on builtin tokens, it also executes with fewer macro invocations.
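
A short sketch of both uses follows, assuming that the
@file{stack_sep.m4} file shown later in this section is on the include
path.  The stack @code{s} and the one-argument macro @code{show} are
hypothetical names used only here.  The first call splits a macro
invocation between @var{pre} and @var{post}; the second references each
definition directly and supplies a separator.

@example
$ @kbd{m4 -I examples}
include(`stack_sep.m4')
@result{}
define(`show', `<$1>')pushdef(`s', `1')pushdef(`s', `2')pushdef(`s', `3')
@result{}
stack_foreach_sep(`s', `show(', `)')
@result{}<1><2><3>
stack_foreach_sep(`s', `[', `]', `, ')
@result{}[1], [2], [3]
@end example
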
8362 The new macro also adds a separator that is only output after the first
8363 iteration of the helper @code{_stack_reverse_sep}, implemented by
8364 prepending the original @var{sep} to @var{pre} and omitting a @var{sep}
8365 argument in subsequent iterations. Note that the empty string that
8366 separates @var{sep} from @var{pre} is provided as part of the fourth
8367 argument when originally calling @code{_stack_reverse_sep}, and not by
8368 writing @code{$4`'$3} as the third argument in the recursive call; while
8369 the other approach would give the same output, it does so at the expense
8370 of increasing the argument size on each iteration of
8371 @code{_stack_reverse_sep}, which results in quadratic instead of linear
8372 execution time. The improved stack walking macros are available in
8373 @file{m4-@value{VERSION}/@/examples/@/stack_sep.m4}:
8377 $ @kbd{m4 -I examples}
8378 include(`stack_sep.m4')
8380 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
8382 `stack_foreach_sep(`$1', `pushdef(`$2',', `)')')')dnl
8383 pushdef(`a', `1')pushdef(`a', defn(`divnum'))
8393 pushdef(`c', `1')pushdef(`c', `2')
8395 stack_foreach_sep_lifo(`c', `', `', `, ')
8397 undivert(`stack_sep.m4')dnl
8398 @result{}divert(`-1')
8399 @result{}# stack_foreach_sep(macro, pre, post, sep)
8400 @result{}# Invoke PRE`'defn`'POST with a single argument of each definition
8401 @result{}# from the definition stack of MACRO, starting with the oldest, and
8402 @result{}# separated by SEP between definitions.
8403 @result{}define(`stack_foreach_sep',
8404 @result{}`_stack_reverse_sep(`$1', `tmp-$1')'dnl
8405 @result{}`_stack_reverse_sep(`tmp-$1', `$1', `$2`'defn(`$1')$3', `$4`'')')
8406 @result{}# stack_foreach_sep_lifo(macro, pre, post, sep)
8407 @result{}# Like stack_foreach_sep, but starting with the newest definition.
8408 @result{}define(`stack_foreach_sep_lifo',
8409 @result{}`_stack_reverse_sep(`$1', `tmp-$1', `$2`'defn(`$1')$3', `$4`'')'dnl
8410 @result{}`_stack_reverse_sep(`tmp-$1', `$1')')
8411 @result{}define(`_stack_reverse_sep',
8412 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0(
8413 @result{} `$1', `$2', `$4$3')')')
8414 @result{}divert`'dnl
8418 @comment Not worth putting in the manual, but make sure that
8419 @comment stack_foreach_sep has linear performance.
8423 $ @kbd{m4 -I examples}
8424 include(`forloop3.m4')include(`stack_sep.m4')dnl
8425 forloop(`i', `1', `10000', `pushdef(`s', i)')
8427 define(`colon', `:')define(`dash', `-')
8429 len(stack_foreach_sep(`s', `dash', `', `colon'))
8434 @node Improved m4wrap
8435 @section Solution for @code{m4wrap}
8437 The replacement @code{m4wrap} versions presented above, designed to
8438 guarantee FIFO or LIFO order regardless of the underlying M4
8439 implementation, share a bug when dealing with wrapped text that looks
8440 like parameter expansion. Note how the invocation of
8441 @code{m4wrap@var{n}} interprets these parameters, while using the
8442 builtin preserves them for their intended use.
8446 $ @kbd{m4 -I examples}
8447 include(`wraplifo.m4')
8449 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
8452 builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
8456 @result{}bar:-a-a,b-2-
8457 @result{}m4wrap0:---0-
8460 Additionally, the computation of @code{_m4wrap_level} and creation of
8461 multiple @code{m4wrap@var{n}} placeholders in the original examples is
8462 more expensive in time and memory than strictly necessary. Notice how
8463 the improved version grabs the wrapped text via @code{defn} to avoid
8464 parameter expansion, then undefines @code{_m4wrap_text}, before
8465 stripping a level of quotes with @code{_arg1} to expand the text. That
8466 way, each level of wrapping reuses the single placeholder, which starts
8467 each nesting level in an undefined state.
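
The difference between invoking a placeholder macro and fetching its
text with @code{defn} can be seen in the following minimal sketch.  The
macro name @code{saved_text} is hypothetical; @code{_arg1} is defined
the same way as in the shipped example.

@example
define(`saved_text', `-$1-$#-')dnl
define(`_arg1', `$1')dnl
saved_text
@result{}--0-
defn(`saved_text')
@result{}-$1-$#-
_arg1(defn(`saved_text'))
@result{}-$1-$#-
@end example

Invoking the macro substitutes its own (empty) parameters into the saved
text, whereas @code{defn} hands the text back untouched, and
@code{_arg1} merely strips the one level of quoting that @code{defn}
added.
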
8469 Finally, it is worth emulating the GNU M4 extension of saving
8470 all arguments to @code{m4wrap}, separated by a space, rather than saving
8471 just the first argument.  This is done with the @code{joinall} macro
8472 documented previously (@pxref{Shift}). The improved LIFO example is
8473 shipped as @file{m4-@value{VERSION}/@/examples/@/wraplifo2.m4}, and can
8474 easily be converted to a FIFO solution by swapping the adjacent
8475 invocations of @code{joinall} and @code{defn}.
8479 $ @kbd{m4 -I examples}
8480 include(`wraplifo2.m4')
8482 undivert(`wraplifo2.m4')dnl
8483 @result{}dnl Redefine m4wrap to have LIFO semantics, improved example.
8484 @result{}include(`join.m4')dnl
8485 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
8486 @result{}define(`_arg1', `$1')dnl
8487 @result{}define(`m4wrap',
8488 @result{}`ifdef(`_$0_text',
8489 @result{} `define(`_$0_text', joinall(` ', $@@)defn(`_$0_text'))',
8490 @result{} `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
8491 @result{}define(`_$0_text', joinall(` ', $@@))')')dnl
8492 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
8496 m4wrap(`nested', `', `$@@
8501 @result{}foo:-a-a,b-2-
8505 @node Improved cleardivert
8506 @section Solution for @code{cleardivert}
8508 The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
8509 called without arguments to clear all pending diversions. That is
8510 because using @code{undivert} with an empty string as an argument differs
8511 from using it with no arguments at all.  Compare the earlier definition
8512 with one that takes the number of arguments into account:
8515 define(`cleardivert',
8516 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
8526 define(`cleardivert',
8527 `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
8528 `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
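
The reason @samp{$#} can tell the two cases apart is that a macro name
followed by empty parentheses still receives one (empty) argument.  A
minimal sketch, using the hypothetical macro @code{nargs}:

@example
define(`nargs', `$#')dnl
nargs
@result{}0
nargs()
@result{}1
nargs(`')
@result{}1
@end example

When a macro is called with no arguments, @samp{$@@} expands to nothing,
so text written as @samp{undivert($@@)} ends up passing a single empty
argument rather than none.
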
8539 @node Improved capitalize
8540 @section Solution for @code{capitalize}
8542 The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
8543 not allow clients to follow the quoting rule of thumb. Consider the
8544 three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
8545 difference between calling @code{capitalize} with the expansion of a
8546 macro, expanding the result of a case change, and changing the case of a
8547 double-quoted string:
8551 $ @kbd{m4 -I examples}
8552 include(`capitalize.m4')dnl
8553 define(`active', `act1, ive')dnl
8554 define(`Active', `Act2, Ive')dnl
8555 define(`ACTIVE', `ACT3, IVE')dnl
8566 downcase(``ACTIVE'')
8570 capitalize(`active')
8572 capitalize(``active'')
8573 @result{}_capitalize(`active')
8578 capitalize(`active')
8582 First, when @code{capitalize} was called with more than one argument, it
8583 threw away the later arguments, whereas @code{upcase} and
8584 @code{downcase} used @samp{$*} to collect them all. The fix is simple:
8585 use @samp{$*} consistently.
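
The effect of choosing @samp{$1} over @samp{$*} can be sketched with two
hypothetical wrappers around @code{translit}; neither name is part of
the shipped examples.

@example
define(`first_only', `translit(`$1', `a-z', `A-Z')')dnl
define(`all_args', `translit(`$*', `a-z', `A-Z')')dnl
first_only(`one', `two')
@result{}ONE
all_args(`one', `two')
@result{}ONE,TWO
@end example
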
8587 Next, with single-quoting, @code{capitalize} outputs a single character,
8588 a set of quotes, then the rest of the characters, making it impossible
8589 to invoke @code{Active} after the fact, and allowing the alternate macro
8590 @code{A} to interfere. Here, the solution is to use additional quoting
8591 in the helper macros, then pass the final over-quoted output string
8592 through @code{_arg1} to remove the extra quoting and finally invoke the
8593 concatenated portions as a single string.
8595 Finally, when passed a double-quoted string, the nested macro
8596 @code{_capitalize} is never invoked because it ended up nested inside
8597 quotes. This one is the toughest to fix. In short, we have no idea how
8598 many levels of quotes are in effect on the substring being altered by
8599 @code{patsubst}. If the replacement string cannot be expressed entirely
8600 in terms of literal text and backslash substitutions, then we need a
8601 mechanism to guarantee that the helper macros are invoked outside of
8602 quotes. In other words, this sounds like a job for @code{changequote}
8603 (@pxref{Changequote}). By changing the active quoting characters, we
8604 can guarantee that replacement text injected by @code{patsubst} always
8605 occurs in the middle of a string that has exactly one level of
8606 over-quoting using alternate quotes; so the replacement text closes the
8607 quoted string, invokes the helper macros, then reopens the quoted
8608 string. In turn, that means the replacement text has unbalanced quotes,
8609 necessitating another round of @code{changequote}.
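
As a minimal sketch of the underlying mechanism, the example below
switches to the same alternate delimiters and back again.  The macro
name @code{greet} is hypothetical.  While the alternate quotes are
active, @samp{`} and @samp{'} are ordinary characters; once the default
quotes are restored, the same stored text is rescanned with them acting
as quotes again.

@example
changequote(`<<[', `]>>')dnl
define(<<[greet]>>, <<[hello `$1']>>)dnl
greet(<<[world]>>)
@result{}hello `world'
changequote(<<[`]>>, <<[']>>)dnl
greet(`again')
@result{}hello again
@end example
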
8611 In the fixed version below (also shipped as
8612 @file{m4-@value{VERSION}/@/examples/@/capitalize2.m4}), @code{capitalize}
8613 uses the alternate quotes of @samp{<<[} and @samp{]>>} (the longer
8614 strings are chosen so as to be less likely to appear in the text being
8615 converted). The helpers @code{_to_alt} and @code{_from_alt} merely
8616 reduce the number of characters required to perform a
8617 @code{changequote}, since the definition changes twice. The outermost
8618 pair means that @code{patsubst} and @code{_capitalize_alt} are invoked
8619 with alternate quoting; the innermost pair is used so that the third
8620 argument to @code{patsubst} can contain an unbalanced
8621 @samp{]>>}/@samp{<<[} pair. Note that @code{upcase} and @code{downcase}
8622 must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since
8623 they contain nested quotes but are invoked with the alternate quoting scheme in effect.
8628 $ @kbd{m4 -I examples}
8629 include(`capitalize2.m4')dnl
8630 define(`active', `act1, ive')dnl
8631 define(`Active', `Act2, Ive')dnl
8632 define(`ACTIVE', `ACT3, IVE')dnl
8633 define(`A', `OOPS')dnl
8634 capitalize(active; `active'; ``active''; ```actIVE''')
8635 @result{}Act1,Ive; Act2, Ive; Active; `Active'
8636 undivert(`capitalize2.m4')dnl
8637 @result{}divert(`-1')
8638 @result{}# upcase(text)
8639 @result{}# downcase(text)
8640 @result{}# capitalize(text)
8641 @result{}# change case of text, improved version
8642 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
8643 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
8644 @result{}define(`_arg1', `$1')
8645 @result{}define(`_to_alt', `changequote(`<<[', `]>>')')
8646 @result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
8647 @result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
8648 @result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
8649 @result{}define(`_capitalize_alt',
8650 @result{} `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
8651 @result{} <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
8652 @result{}define(`capitalize',
8653 @result{} `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
8654 @result{} _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
8655 @result{}divert`'dnl
8658 @node Improved fatal_error
8659 @section Solution for @code{fatal_error}
8661 The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
8662 of GNU M4 earlier than 1.4.8, where invoking
8663 @code{@w{__file__}} (@pxref{Location}) inside @code{m4wrap} would result
8664 in an empty string, and @code{@w{__line__}} resulted in @samp{0} even
8665 though all files start at line 1. Furthermore, versions earlier than
8666 1.4.6 did not support the @code{@w{__program__}} macro. If you want
8667 @code{fatal_error} to work across the entire 1.4.x release series, a
8668 better implementation would be:
8672 define(`fatal_error',
8673 `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
8674 `:ifelse(__line__, `0', `',
8675 `__file__:__line__:')` fatal error: $*
8678 m4wrap(`divnum(`demo of internal message')
8679 fatal_error(`inside wrapped text')')
8682 @error{}m4:stdin:6: Warning: excess arguments to builtin `divnum' ignored
8684 @error{}m4:stdin:6: fatal error: inside wrapped text
8687 @c ========================================================== Appendices
8689 @node Copying This Package
8690 @appendix How to make copies of the overall M4 package
8691 @cindex License, code
8693 This appendix covers the license for copying the source code of the
8694 overall M4 package. This manual is under a different set of
8695 restrictions, covered later (@pxref{Copying This Manual}).
8698 * GNU General Public License:: License for copying the M4 package
8701 @node GNU General Public License
8702 @appendixsec License for copying the M4 package
8703 @cindex GPL, GNU General Public License
8704 @cindex GNU General Public License
8705 @cindex General Public License (GPL), GNU
8706 @include gpl-3.0.texi
8708 @node Copying This Manual
8709 @appendix How to make copies of this manual
8710 @cindex License, manual
8712 This appendix covers the license for copying this manual. Note that
8713 some of the longer examples in this manual are also distributed in the
8714 directory @file{m4-@value{VERSION}/@/examples/}, where a more
8715 permissive license is in effect when copying just the examples.
8718 * GNU Free Documentation License:: License for copying this manual
8721 @node GNU Free Documentation License
8722 @appendixsec License for copying this manual
8723 @cindex FDL, GNU Free Documentation License
8724 @cindex GNU Free Documentation License
8725 @cindex Free Documentation License (FDL), GNU
8726 @include fdl-1.3.texi
8729 @appendix Indices of concepts and macros
8732 * Macro index:: Index for all @code{m4} macros
8733 * Concept index:: Index for many concepts
8737 @appendixsec Index for all @code{m4} macros
8739 This index covers all @code{m4} builtins, as well as several useful
8740 composite macros.  References point only to the place where each
8741 macro is first introduced.
8746 @appendixsec Index for many concepts
8753 @c coding: iso-8859-1
8755 @c ispell-local-dictionary: "american"
8756 @c indent-tabs-mode: nil
8757 @c whitespace-check-buffer-indent: nil