1 \input texinfo @c -*- texinfo -*-
2 @comment ========================================================
3 @comment %**start of header
6 @settitle GNU M4 @value{VERSION} macro processor
9 @setcontentsaftertitlepage
15 @c The testsuite expects literal tab output in some examples, but
16 @c literal tabs in texinfo lead to formatting issues.
22 @c -------------------
23 @c The ARG is an optional argument. To be used for macro arguments in
24 @c their documentation (@defmac).
26 @r{[}@var{\varname\}@r{]}@c
29 @c @dvar{ARG, DEFAULT}
30 @c -------------------
31 @c The ARG is an optional argument, defaulting to DEFAULT. To be used
32 @c for macro arguments in their documentation (@defmac).
33 @macro dvar{varname, default}
34 @r{[}@var{\varname\} = @samp{\default\}@r{]}@c
37 @comment %**end of header
38 @comment ========================================================
42 This manual (@value{UPDATED}) is for GNU M4 (version
43 @value{VERSION}), a package containing an implementation of the m4 macro
46 Copyright @copyright{} 1989-1994, 2004-2011 Free Software Foundation,
50 Permission is granted to copy, distribute and/or modify this document
51 under the terms of the GNU Free Documentation License,
52 Version 1.3 or any later version published by the Free Software
53 Foundation; with no Invariant Sections, no Front-Cover Texts, and no
54 Back-Cover Texts. A copy of the license is included in the section
55 entitled ``GNU Free Documentation License.''
59 @dircategory Text creation and manipulation
61 * M4: (m4). A powerful macro processor.
65 @title GNU M4, version @value{VERSION}
66 @subtitle A powerful macro processor
67 @subtitle Edition @value{EDITION}, @value{UPDATED}
68 @author by Ren@'e Seindal, Fran@,{c}ois Pinard,
69 @author Gary V. Vaughan, and Eric Blake
70 @author (@email{bug-m4@@gnu.org})
73 @vskip 0pt plus 1filll
85 GNU @code{m4} is an implementation of the traditional UNIX macro
86 processor. It is mostly SVR4 compatible, although it has some
87 extensions (for example, handling more than 9 positional parameters
88 to macros). @code{m4} also has builtin functions for including
89 files, running shell commands, doing arithmetic, etc. Autoconf needs
90 GNU @code{m4} for generating @file{configure} scripts, but not for
93 GNU @code{m4} was originally written by Ren@'e Seindal, with
94 subsequent changes by Fran@,{c}ois Pinard and other volunteers
95 on the Internet. All names and email addresses can be found in the
96 files @file{m4-@value{VERSION}/@/AUTHORS} and
97 @file{m4-@value{VERSION}/@/THANKS} from the GNU M4
100 This is release @value{VERSION}. It is now considered stable: future
101 releases in the 1.4.x series are only meant to fix bugs, increase speed,
102 or improve documentation. However@dots{}
104 An experimental feature, which would improve @code{m4} usefulness,
105 allows for changing the syntax for what is a @dfn{word} in @code{m4}.
109 ./configure --enable-changeword
112 if you want this feature compiled in. The current implementation
113 slows down @code{m4} considerably and is hardly acceptable. In the
114 future, @code{m4} 2.0 will come with a different set of new features
115 that provide similar capabilities, but without the inefficiencies, so
116 changeword will go away and @emph{you should not count on it}.
119 * Preliminaries:: Introduction and preliminaries
120 * Invoking m4:: Invoking @code{m4}
121 * Syntax:: Lexical and syntactic conventions
123 * Macros:: How to invoke macros
124 * Definitions:: How to define new macros
125 * Conditionals:: Conditionals, loops, and recursion
127 * Debugging:: How to debug macros and input
129 * Input Control:: Input control
130 * File Inclusion:: File inclusion
131 * Diversions:: Diverting and undiverting output
133 * Text handling:: Macros for text handling
134 * Arithmetic:: Macros for doing arithmetic
135 * Shell commands:: Macros for running shell commands
136 * Miscellaneous:: Miscellaneous builtin macros
137 * Frozen files:: Fast loading of frozen state
139 * Compatibility:: Compatibility with other versions of @code{m4}
140 * Answers:: Correct version of some examples
142 * Copying This Package:: How to make copies of the overall M4 package
143 * Copying This Manual:: How to make copies of this manual
144 * Indices:: Indices of concepts and macros
147 --- The Detailed Node Listing ---
149 Introduction and preliminaries
151 * Intro:: Introduction to @code{m4}
152 * History:: Historical references
153 * Bugs:: Problems and bugs
154 * Manual:: Using this manual
158 * Operation modes:: Command line options for operation modes
159 * Preprocessor features:: Command line options for preprocessor features
160 * Limits control:: Command line options for limits control
161 * Frozen state:: Command line options for frozen state
162 * Debugging options:: Command line options for debugging
163 * Command line files:: Specifying input files on the command line
165 Lexical and syntactic conventions
167 * Names:: Macro names
168 * Quoted strings:: Quoting input to @code{m4}
169 * Comments:: Comments in @code{m4} input
170 * Other tokens:: Other kinds of input tokens
171 * Input processing:: How @code{m4} copies input to output
175 * Invocation:: Macro invocation
176 * Inhibiting Invocation:: Preventing macro invocation
177 * Macro Arguments:: Macro arguments
178 * Quoting Arguments:: On Quoting Arguments to macros
179 * Macro expansion:: Expanding macros
181 How to define new macros
183 * Define:: Defining a new macro
184 * Arguments:: Arguments to macros
185 * Pseudo Arguments:: Special arguments to macros
186 * Undefine:: Deleting a macro
187 * Defn:: Renaming macros
188 * Pushdef:: Temporarily redefining macros
190 * Indir:: Indirect call of macros
191 * Builtin:: Indirect call of builtins
193 Conditionals, loops, and recursion
195 * Ifdef:: Testing if a macro is defined
196 * Ifelse:: If-else construct, or multibranch
197 * Shift:: Recursion in @code{m4}
198 * Forloop:: Iteration by counting
199 * Foreach:: Iteration by list contents
200 * Stacks:: Working with definition stacks
201 * Composition:: Building macros with macros
203 How to debug macros and input
205 * Dumpdef:: Displaying macro definitions
206 * Trace:: Tracing macro calls
207 * Debug Levels:: Controlling debugging output
208 * Debug Output:: Saving debugging output
212 * Dnl:: Deleting whitespace in input
213 * Changequote:: Changing the quote characters
214 * Changecom:: Changing the comment delimiters
215 * Changeword:: Changing the lexical structure of words
216 * M4wrap:: Saving text until end of input
220 * Include:: Including named files
221 * Search Path:: Searching for include files
223 Diverting and undiverting output
225 * Divert:: Diverting output
226 * Undivert:: Undiverting output
227 * Divnum:: Diversion numbers
228 * Cleardivert:: Discarding diverted text
230 Macros for text handling
232 * Len:: Calculating length of strings
233 * Index macro:: Searching for substrings
234 * Regexp:: Searching for regular expressions
235 * Substr:: Extracting substrings
236 * Translit:: Translating characters
237 * Patsubst:: Substituting text by regular expression
238 * Format:: Formatting strings (printf-like)
240 Macros for doing arithmetic
242 * Incr:: Decrement and increment operators
243 * Eval:: Evaluating integer expressions
245 Macros for running shell commands
247 * Platform macros:: Determining the platform
248 * Syscmd:: Executing simple commands
249 * Esyscmd:: Reading the output of commands
250 * Sysval:: Exit status
251 * Mkstemp:: Making temporary files
253 Miscellaneous builtin macros
255 * Errprint:: Printing error messages
256 * Location:: Printing current location
257 * M4exit:: Exiting from @code{m4}
259 Fast loading of frozen state
261 * Using frozen files:: Using frozen files
262 * Frozen file format:: Frozen file format
264 Compatibility with other versions of @code{m4}
266 * Extensions:: Extensions in GNU M4
267 * Incompatibilities:: Facilities in System V m4 not in GNU M4
268 * Other Incompatibilities:: Other incompatibilities
270 Correct version of some examples
272 * Improved exch:: Solution for @code{exch}
273 * Improved forloop:: Solution for @code{forloop}
274 * Improved foreach:: Solution for @code{foreach}
275 * Improved copy:: Solution for @code{copy}
276 * Improved m4wrap:: Solution for @code{m4wrap}
277 * Improved cleardivert:: Solution for @code{cleardivert}
278 * Improved capitalize:: Solution for @code{capitalize}
279 * Improved fatal_error:: Solution for @code{fatal_error}
281 How to make copies of the overall M4 package
283 * GNU General Public License:: License for copying the M4 package
285 How to make copies of this manual
287 * GNU Free Documentation License:: License for copying this manual
289 Indices of concepts and macros
291 * Macro index:: Index for all @code{m4} macros
292 * Concept index:: Index for many concepts
298 @chapter Introduction and preliminaries
300 This first chapter explains what GNU @code{m4} is, where @code{m4}
301 comes from, how to read and use this documentation, how to call the
302 @code{m4} program, and how to report bugs about it. It concludes by
303 giving tips for reading the remainder of the manual.
305 The following chapters then detail all the features of the @code{m4}
309 * Intro:: Introduction to @code{m4}
310 * History:: Historical references
311 * Bugs:: Problems and bugs
312 * Manual:: Using this manual
316 @section Introduction to @code{m4}
318 @cindex overview of @code{m4}
319 @code{m4} is a macro processor, in the sense that it copies its
320 input to the output, expanding macros as it goes. Macros are either
321 builtin or user-defined, and can take any number of arguments.
322 Besides just doing macro expansion, @code{m4} has builtin functions
323 for including named files, running shell commands, doing integer
324 arithmetic, manipulating text in various ways, performing recursion,
325 etc.@dots{} @code{m4} can be used either as a front-end to a compiler,
326 or as a macro processor in its own right.
328 The @code{m4} macro processor is widely available on all UNIXes, and has
329 been standardized by POSIX.
330 Usually, only a small percentage of users are aware of its existence.
331 However, those who find it often become committed users. The
332 popularity of GNU Autoconf, which requires GNU
333 @code{m4} for @emph{generating} @file{configure} scripts, is an incentive
334 for many to install it, while these people will not themselves
335 program in @code{m4}. GNU @code{m4} is mostly compatible with the
336 System V, Release 3 version, except for some minor differences.
337 @xref{Compatibility}, for more details.
339 Some people find @code{m4} to be fairly addictive. They first use
340 @code{m4} for simple problems, then take bigger and bigger challenges,
341 learning how to write complex sets of @code{m4} macros along the way.
342 Once really addicted, users pursue writing of sophisticated @code{m4}
343 applications even to solve simple problems, devoting more time
344 debugging their @code{m4} scripts than doing real work. Beware that
345 @code{m4} may be dangerous for the health of compulsive programmers.
348 @section Historical references
350 @cindex history of @code{m4}
351 @cindex GNU M4, history of
352 @code{GPM} was an important ancestor of @code{m4}. See
353 C. Strachey: ``A General Purpose Macro generator'', Computer Journal
354 8,3 (1965), pp.@: 225 ff. @code{GPM} is also succinctly described into
355 David Gries classic ``Compiler Construction for Digital Computers''.
357 The classic B. Kernighan and P.J. Plauger: ``Software Tools'',
358 Addison-Wesley, Inc.@: (1976) describes and implements a Unix
359 macro-processor language, which inspired Dennis Ritchie to write
360 @code{m3}, a macro processor for the AP-3 minicomputer.
362 Kernighan and Ritchie then joined forces to develop the original
363 @code{m4}, as described in ``The M4 Macro Processor'', Bell
364 Laboratories (1977). It had only 21 builtin macros.
366 While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
367 the true intricacies of real life: macros can be recognized without
368 being pre-announced, skipping whitespace or end-of-lines is easier,
369 more constructs are builtin instead of derived, etc.
371 Originally, the Kernighan and Plauger macro-processor, and then
372 @code{m3}, formed the engine for the Rational FORTRAN preprocessor,
373 that is, the @code{Ratfor} equivalent of @code{cpp}. Later, @code{m4}
374 was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.
376 Ren@'e Seindal released his implementation of @code{m4}, GNU
378 in 1990, with the aim of removing the artificial limitations in many
379 of the traditional @code{m4} implementations, such as maximum line
380 length, macro size, or number of macros.
382 The late Professor A. Dain Samples described and implemented a further
383 evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
384 Language: 2nd edition'', Electronic Announcement on comp.compilers
387 Fran@,{c}ois Pinard took over maintenance of GNU @code{m4} in
388 1992, until 1994 when he released GNU @code{m4} 1.4, which was
389 the stable release for 10 years. It was at this time that GNU
390 Autoconf decided to require GNU @code{m4} as its underlying
391 engine, since all other implementations of @code{m4} had too many
394 More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2 which
395 addressed some long standing bugs in the venerable 1.4 release. Then in
396 2005, Gary V. Vaughan collected together the many patches to
397 GNU @code{m4} 1.4 that were floating around the net and
398 released 1.4.3 and 1.4.4. And in 2006, Eric Blake joined the team and
399 prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8.
400 More bug fixes were incorporated in 2007, with releases 1.4.9 and
401 1.4.10. Eric continued with some portability fixes for 1.4.11 and
402 1.4.12 in 2008, 1.4.13 in 2009, 1.4.14 and 1.4.15 in 2010, and 1.4.16 in
405 Meanwhile, development has continued on new features for @code{m4}, such
406 as dynamic module loading and additional builtins. When complete,
407 GNU @code{m4} 2.0 will start a new series of releases.
410 @section Problems and bugs
412 @cindex reporting bugs
414 @cindex suggestions, reporting
415 If you have problems with GNU M4 or think you've found a bug,
416 please report it. Before reporting a bug, make sure you've actually
417 found a real bug. Carefully reread the documentation and see if it
418 really says you can do what you're trying to do. If it's not clear
419 whether you should be able to do something or not, report that too; it's
420 a bug in the documentation!
422 Before reporting a bug or trying to fix it yourself, try to isolate it
423 to the smallest possible input file that reproduces the problem. Then
424 send us the input file and the exact results @code{m4} gave you. Also
425 say what you expected to occur; this will help us decide whether the
426 problem was really in the documentation.
428 Once you've got a precise problem, send e-mail to
429 @email{bug-m4@@gnu.org}. Please include the version number of @code{m4}
430 you are using. You can get this information with the command
431 @kbd{m4 --version}. Also provide details about the platform you are
434 Non-bug suggestions are always welcome as well. If you have questions
435 about things that are unclear in the documentation or are just obscure
436 features, please report them too.
439 @section Using this manual
441 @cindex examples, understanding
442 This manual contains a number of examples of @code{m4} input and output,
443 and a simple notation is used to distinguish input, output and error
444 messages from @code{m4}. Examples are set out from the normal text, and
445 shown in a fixed width font, like this
449 This is an example of an example!
452 To distinguish input from output, all output from @code{m4} is prefixed
453 by the string @samp{@result{}}, and all error messages by the string
454 @samp{@error{}}. When showing how command line options affect matters,
455 the command line is shown with a prompt @samp{$ @kbd{like this}},
456 otherwise, you can assume that a simple @kbd{m4} invocation will work.
461 $ @kbd{command line to invoke m4}
462 Example of input line
463 @result{}Output line from m4
464 @error{}and an error message
467 The sequence @samp{^D} in an example indicates the end of the input
468 file. The sequence @samp{@key{NL}} refers to the newline character.
469 The majority of these examples are self-contained, and you can run them
470 with similar results by invoking @kbd{m4 -d}. In fact, the testsuite
471 that is bundled in the GNU M4 package consists of the examples
472 in this document! Some of the examples assume that your current
473 directory is located where you unpacked the installation, so if you plan
474 on following along, you may find it helpful to do this now:
478 $ @kbd{cd m4-@value{VERSION}}
481 As each of the predefined macros in @code{m4} is described, a prototype
482 call of the macro will be shown, giving descriptive names to the
485 @deffn Composite example (@var{string}, @dvar{count, 1}, @
486 @ovar{argument}@dots{})
487 This is a sample prototype. There is not really a macro named
488 @code{example}, but this documents that if there were, it would be a
489 Composite macro, rather than a Builtin. It requires at least one
490 argument, @var{string}. Remember that in @code{m4}, there must not be a
491 space between the macro name and the opening parenthesis, unless it was
492 intended to call the macro without any arguments. The brackets around
493 @var{count} and @var{argument} show that these arguments are optional.
494 If @var{count} is omitted, the macro behaves as if count were @samp{1},
495 whereas if @var{argument} is omitted, the macro behaves as if it were
496 the empty string. A blank argument is not the same as an omitted
497 argument. For example, @samp{example(`a')}, @samp{example(`a',`1')},
498 and @samp{example(`a',`1',)} would behave identically with @var{count}
499 set to @samp{1}; while @samp{example(`a',)} and @samp{example(`a',`')}
500 would explicitly pass the empty string for @var{count}. The ellipses
501 (@samp{@dots{}}) show that the macro processes additional arguments
502 after @var{argument}, rather than ignoring them.
506 All macro arguments in @code{m4} are strings, but some are given
507 special interpretation, e.g., as numbers, file names, regular
508 expressions, etc. The documentation for each macro will state how the
509 parameters are interpreted, and what happens if the argument cannot be
510 parsed according to the desired interpretation. Unless specified
511 otherwise, a parameter specified to be a number is parsed as a decimal,
512 even if the argument has leading zeros; and parsing the empty string as
513 a number results in 0 rather than an error, although a warning will be
516 This document consistently writes and uses @dfn{builtin}, without a
517 hyphen, as if it were an English word. This is how the @code{builtin}
518 primitive is spelled within @code{m4}.
521 @chapter Invoking @code{m4}
524 @cindex invoking @code{m4}
525 The format of the @code{m4} command is:
529 @code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
532 @cindex command line, options
533 @cindex options, command line
534 @cindex @env{POSIXLY_CORRECT}
535 All options begin with @samp{-}, or if long option names are used, with
536 @samp{--}. A long option name need not be written completely, any
537 unambiguous prefix is sufficient. POSIX requires @code{m4} to
538 recognize arguments intermixed with files, even when
539 @env{POSIXLY_CORRECT} is set in the environment. Most options take
540 effect at startup regardless of their position, but some are documented
541 below as taking effect after any files that occurred earlier in the
542 command line. The argument @option{--} is a marker to denote the end of
545 With short options, options that do not take arguments may be combined
546 into a single command line argument with subsequent options, options
547 with mandatory arguments may be provided either as a single command line
548 argument or as two arguments, and options with optional arguments must
549 be provided as a single argument. In other words,
550 @kbd{m4 -QPDfoo -d a -df} is equivalent to
551 @kbd{m4 -Q -P -D foo -d -df -- ./a}, although the latter form is
552 considered canonical.
554 With long options, options with mandatory arguments may be provided with
555 an equal sign (@samp{=}) in a single argument, or as two arguments, and
556 options with optional arguments must be provided as a single argument.
557 In other words, @kbd{m4 --def foo --debug a} is equivalent to
558 @kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
559 considered canonical (not to mention more robust, in case a future
560 version of @code{m4} introduces an option named @option{--default}).
562 @code{m4} understands the following options, grouped by functionality.
565 * Operation modes:: Command line options for operation modes
566 * Preprocessor features:: Command line options for preprocessor features
567 * Limits control:: Command line options for limits control
568 * Frozen state:: Command line options for frozen state
569 * Debugging options:: Command line options for debugging
570 * Command line files:: Specifying input files on the command line
573 @node Operation modes
574 @section Command line options for operation modes
576 Several options control the overall operation of @code{m4}:
580 Print a help summary on standard output, then immediately exit
581 @code{m4} without reading any input files or performing any other
585 Print the version number of the program on standard output, then
586 immediately exit @code{m4} without reading any input files or
587 performing any other actions.
590 @itemx --fatal-warnings
591 @cindex errors, fatal
593 Controls the effect of warnings. If unspecified, then execution
594 continues and exit status is unaffected when a warning is printed. If
595 specified exactly once, warnings become fatal; when one is issued,
596 execution continues, but the exit status will be non-zero. If specified
597 multiple times, then execution halts with non-zero status the first time
598 a warning is issued. The introduction of behavior levels is new to M4
599 1.4.9; for behavior consistent with earlier versions, you should specify
605 Makes this invocation of @code{m4} interactive. This means that all
606 output will be unbuffered, and interrupts will be ignored. The
607 spelling @option{-e} exists for compatibility with other @code{m4}
608 implementations, and issues a warning because it may be withdrawn in a
609 future version of GNU M4.
612 @itemx --prefix-builtins
613 Internally modify @emph{all} builtin macro names so they all start with
614 the prefix @samp{m4_}. For example, using this option, one should write
615 @samp{m4_define} instead of @samp{define}, and @samp{m4___file__}
616 instead of @samp{__file__}. This option has no effect if @option{-R}
622 Suppress warnings, such as missing or superfluous arguments in macro
623 calls, or treating the empty string as zero.
625 @item --warn-macro-sequence@r{[}=@var{regexp}@r{]}
626 Issue a warning if the regular expression @var{regexp} has a non-empty
627 match in any macro definition (either by @code{define} or
628 @code{pushdef}). Empty matches are ignored; therefore, supplying the
629 empty string as @var{regexp} disables any warning. If the optional
630 @var{regexp} is not supplied, then the default regular expression is
631 @samp{\$\(@{[^@}]*@}\|[0-9][0-9]+\)} (a literal @samp{$} followed by
632 multiple digits or by an open brace), since these sequences will
633 change semantics in the default operation of GNU M4 2.0 (due
634 to a change in how more than 9 arguments in a macro definition will be
635 handled, @pxref{Arguments}). Providing an alternate regular
636 expression can provide a useful reverse lookup feature of finding
637 where a macro is defined to have a given definition.
639 @item -W @var{regexp}
640 @itemx --word-regexp=@var{regexp}
641 Use @var{regexp} as an alternative syntax for macro names. This
642 experimental option will not be present in all GNU @code{m4}
643 implementations (@pxref{Changeword}).
646 @node Preprocessor features
647 @section Command line options for preprocessor features
649 @cindex macro definitions, on the command line
650 @cindex command line, macro definitions on the
651 @cindex preprocessor features
652 Several options allow @code{m4} to behave more like a preprocessor.
653 Macro definitions and deletions can be made on the command line, the
654 search path can be altered, and the output file can track where the
655 input came from. These features occur with the following options:
658 @item -D @var{name}@r{[}=@var{value}@r{]}
659 @itemx --define=@var{name}@r{[}=@var{value}@r{]}
660 This enters @var{name} into the symbol table. If @samp{=@var{value}} is
661 missing, the value is taken to be the empty string. The @var{value} can
662 be any string, and the macro can be defined to take arguments, just as
663 if it was defined from within the input. This option may be given more
664 than once; order with respect to file names is significant, and
665 redefining the same @var{name} loses the previous value.
667 @item -I @var{directory}
668 @itemx --include=@var{directory}
669 Make @code{m4} search @var{directory} for included files that are not
670 found in the current working directory. @xref{Search Path}, for more
671 details. This option may be given more than once.
675 @cindex synchronization lines
676 @cindex location, input
677 @cindex input location
678 Generate synchronization lines, for use by the C preprocessor or other
679 similar tools. Order is significant with respect to file names. This
680 option is useful, for example, when @code{m4} is used as a
681 front end to a compiler. Source file name and line number information
682 is conveyed by directives of the form @samp{#line @var{linenum}
683 "@var{file}"}, which are inserted as needed into the middle of the
684 output. Such directives mean that the following line originated or was
685 expanded from the contents of input file @var{file} at line
686 @var{linenum}. The @samp{"@var{file}"} part is often omitted when
687 the file name did not change from the previous directive.
689 Synchronization directives are always given on complete lines by
690 themselves. When a synchronization discrepancy occurs in the middle of
691 an output line, the associated synchronization directive is delayed
692 until the next newline that does not occur in the middle of a quoted
699 @result{}#line 2 "stdin"
701 changecom(`/*', `*/')
703 define(`comment', `/*1
730 @itemx --undefine=@var{name}
731 This deletes any predefined meaning @var{name} might have. Obviously,
732 only predefined macros can be deleted in this way. This option may be
733 given more than once; undefining a @var{name} that does not have a
734 definition is silently ignored. Order is significant with respect to
739 @section Command line options for limits control
741 There are some limits within @code{m4} that can be tuned. For
742 compatibility, @code{m4} also accepts some options that control limits
743 in other implementations, but which are automatically unbounded (limited
744 only by your hardware and operating system constraints) in GNU
750 Enable all the extensions in this implementation. In this release of
751 M4, this option is always on by default; it is currently only useful
752 when overriding a prior use of @option{--traditional}. However, having
753 GNU behavior as default makes it impossible to write a
754 strictly POSIX-compliant client that avoids all incompatible
755 GNU M4 extensions, since such a client would have to use the
756 non-POSIX command-line option to force full POSIX
757 behavior. Thus, a future version of M4 will be changed to implicitly
758 use the option @option{--traditional} if the environment variable
759 @env{POSIXLY_CORRECT} is set. Projects that intentionally use
760 GNU extensions should consider using @option{--gnu} to state
761 their intentions, so that the project will not mysteriously break if the
762 user upgrades to a newer M4 and has @env{POSIXLY_CORRECT} set in their
767 Suppress all the extensions made in this implementation, compared to the
768 System V version. @xref{Compatibility}, for a list of these.
771 @itemx --hashsize=@var{num}
772 Make the internal hash table for symbol lookup be @var{num} entries big.
773 For better performance, the number should be prime, but this is not
774 checked. The default is 509 entries. It should not be necessary to
775 increase this value, unless you define an excessive number of macros.
778 @itemx --nesting-limit=@var{num}
779 @cindex nesting limit
780 @cindex limit, nesting
781 Artificially limit the nesting of macro calls to @var{num} levels,
782 stopping program execution if this limit is ever exceeded. When not
783 specified, nesting defaults to unlimited on platforms that can detect
784 stack overflow, and to 1024 levels otherwise. A value of zero means
785 unlimited; but then heavily nested code could potentially cause a stack
788 The precise effect of this option is more correctly associated
789 with textual nesting than dynamic recursion. It has been useful
790 when some complex @code{m4} input was generated by mechanical means, and
791 also in diagnosing recursive algorithms that do not scale well.
792 Most users never need to change this option from its default.
795 This option does @emph{not} have the ability to break endless
796 rescanning loops, since these do not necessarily consume much memory
797 or stack space. Through clever usage of rescanning loops, one can
798 request complex, time-consuming computations from @code{m4} with useful
799 results. Putting limitations in this area would break @code{m4} power.
800 There are many pathological cases: @w{@samp{define(`a', `a')a}} is
801 only the simplest example (but @pxref{Compatibility}). Expecting GNU
802 @code{m4} to detect these would be a little like expecting a compiler
803 system to detect and diagnose endless loops: it is a quite @emph{hard}
804 problem in general, if not undecidable!
809 These options are present for compatibility with System V @code{m4}, but
810 do nothing in this implementation. They may disappear in future
811 releases, and issue a warning to that effect.
814 @itemx --diversions=@var{num}
815 These options are present only for compatibility with previous
816 versions of GNU @code{m4}, and were controlling the number of
817 possible diversions which could be used at the same time. They do nothing,
818 because there is no fixed limit anymore. They may disappear in future
819 releases, and issue a warning to that effect.
823 @section Command line options for frozen state
825 GNU @code{m4} comes with a feature of freezing internal state
826 (@pxref{Frozen files}). This can be used to speed up @code{m4}
827 execution when reusing a common initialization script.
831 @itemx --freeze-state=@var{file}
832 Once execution is finished, write out the frozen state on the specified
833 @var{file}. It is conventional, but not required, for @var{file} to end
837 @itemx --reload-state=@var{file}
838 Before execution starts, recover the internal state from the specified
839 frozen @var{file}. The options @option{-D}, @option{-U}, and
840 @option{-t} take effect after state is reloaded, but before the input
844 @node Debugging options
845 @section Command line options for debugging
847 Finally, there are several options for aiding in debugging @code{m4}
851 @item -d@r{[}@var{flags}@r{]}
852 @itemx --debug@r{[}=@var{flags}@r{]}
853 Set the debug-level according to the flags @var{flags}. The debug-level
854 controls the format and amount of information presented by the debugging
855 functions. @xref{Debug Levels}, for more details on the format and
856 meaning of @var{flags}. If omitted, @var{flags} defaults to @samp{aeq}.
858 @item --debugfile@r{[}=@var{file}@r{]}
860 @itemx --error-output=@var{file}
861 Redirect @code{dumpdef} output, debug messages, and trace output to the
862 named @var{file}. Warnings, error messages, and @code{errprint} output
863 are still printed to standard error. If these options are not used, or
864 if @var{file} is unspecified (only possible for @option{--debugfile}),
865 debug output goes to standard error; if @var{file} is the empty string,
866 debug output is discarded. @xref{Debug Output}, for more details. The
867 option @option{--debugfile} may be given more than once, and order is
868 significant with respect to file names. The spellings @option{-o} and
869 @option{--error-output} are misleading and inconsistent with other
870 GNU tools; for now they are silently accepted as synonyms of
871 @option{--debugfile} and only recognized once, but in a future version
872 of M4, using them will cause a warning to be issued.
875 @comment not worth including in the manual, but provides a good test
878 @comment options: -Dbar=hello -tbar --debugfile= foo --debugfile -
880 $ @kbd{m4 -d -Iexamples -Dbar=hello -tbar --debugfile= foo --debugfile -
886 @error{}m4trace: -1- bar -> `hello'
892 @itemx --arglength=@var{num}
893 Restrict the size of the output generated by macro tracing to @var{num}
894 characters per trace line. If unspecified or zero, output is
895 unlimited. @xref{Debug Levels}, for more details.
898 @itemx --trace=@var{name}
899 This enables tracing for the macro @var{name}, at any point where it is
900 defined. @var{name} need not be defined when this option is given.
901 This option may be given more than once, and order is significant with
902 respect to file names. @xref{Trace}, for more details.
905 @node Command line files
906 @section Specifying input files on the command line
908 @cindex command line, file names on the
909 @cindex file names, on the command line
910 The remaining arguments on the command line are taken to be input file
911 names. If no names are present, standard input is read. A file
912 name of @file{-} is taken to mean standard input. It is
913 conventional, but not required, for input files to end in @samp{.m4}.
915 The input files are read in the sequence given. Standard input can be
916 read more than once, so the file name @file{-} may appear multiple times
917 on the command line; this makes a difference when input is from a
918 terminal or other special file type. It is an error if an input file
919 ends in the middle of argument collection, a comment, or a quoted
922 The options @option{--define} (@option{-D}), @option{--undefine}
923 (@option{-U}), @option{--synclines} (@option{-s}), and @option{--trace}
924 (@option{-t}) only take effect after processing input from any file
925 names that occur earlier on the command line. For example, assume the
926 file @file{foo} contains:
934 The text @samp{bar} can then be redefined over multiple uses of
937 @comment options: -Dbar=hello foo -Dbar=world foo
939 $ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
944 If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
945 exit status of @code{m4} will be 0 for success, 1 for general failure
946 (such as problems with reading an input file), and 63 for version
947 mismatch (@pxref{Using frozen files}).
949 If you need to read a file whose name starts with a @file{-}, you can
950 specify it as @samp{./-file}, or use @option{--} to mark the end of
954 @comment Test that 'm4 file/' detects that file is not a directory; we
955 @comment can assume that the current directory contains a Makefile.
956 @comment mingw fails with EINVAL rather than ENOTDIR.
959 @comment xerr: ignore
960 @comment options: Makefile/
962 @error{}m4: cannot open `Makefile/': Not a directory
965 @comment Test that closed stderr does not cause a crash. Not all
966 @comment systems have the same message for EBADF.
968 @comment xerr: ignore
971 `errprint(` skipping: syscmd does not have unix semantics
973 changequote(`[', `]')dnl
974 syscmd([echo | ']__program__[' >&-])dnl
975 @error{}m4: write error: Bad file descriptor
982 `errprint(` skipping: syscmd does not have unix semantics
984 changequote(`[', `]')dnl
985 syscmd([echo 'esyscmd(echo hi >&2 && echo err"print(bye
986 )d"nl)dnl' > tmp.m4 \
987 && ']__program__[' tmp.m4 <&- >&- \
994 @comment Test that we obey POSIX semantics with -D interspersed with
995 @comment files, even with POSIXLY_CORRECT (BSD getopt gets it wrong).
1000 `errprint(` skipping: syscmd does not have unix semantics
1002 changequote(`[', `]')dnl
1003 syscmd([POSIXLY_CORRECT=1 ']__program__[' -Dbar=hello foo -Dbar=world foo])dnl
1012 @chapter Lexical and syntactic conventions
1014 @cindex input tokens
1016 As @code{m4} reads its input, it separates it into @dfn{tokens}. A
1017 token is either a name, a quoted string, or any single character, that
1018 is not a part of either a name or a string. Input to @code{m4} can also
1019 contain comments. GNU @code{m4} does not yet understand
1020 multibyte locales; all operations are byte-oriented rather than
1021 character-oriented (although if your locale uses a single byte
1022 encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
1023 However, @code{m4} is eight-bit clean, so you can
1024 use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
1025 comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
1026 exception of the @sc{nul} character (the zero byte @samp{'\0'}).
1029 * Names:: Macro names
1030 * Quoted strings:: Quoting input to @code{m4}
1031 * Comments:: Comments in @code{m4} input
1032 * Other tokens:: Other kinds of input tokens
1033 * Input processing:: How @code{m4} copies input to output
1037 @section Macro names
1041 A name is any sequence of letters, digits, and the character @samp{_}
1042 (underscore), where the first character is not a digit. @code{m4} will
1043 use the longest such sequence found in the input. If a name has a
1044 macro definition, it will be subject to macro expansion
1045 (@pxref{Macros}). Names are case-sensitive.
1047 Examples of legal names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
1049 @node Quoted strings
1050 @section Quoting input to @code{m4}
1052 @cindex quoted string
1053 @cindex string, quoted
1054 A quoted string is a sequence of characters surrounded by quote
1055 strings, defaulting to
1056 @samp{`} and @samp{'}, where the nested begin and end quotes within the
1057 string are balanced. The value of a string token is the text, with one
1058 level of quotes stripped off. Thus
1067 is the empty string, and double-quoting turns into single-quoting.
1075 The quote characters can be changed at any time, using the builtin macro
1076 @code{changequote}. @xref{Changequote}, for more information.
1079 @section Comments in @code{m4} input
1082 Comments in @code{m4} are normally delimited by the characters @samp{#}
1083 and newline. All characters between the comment delimiters are ignored,
1084 but the entire comment (including the delimiters) is passed through to
1085 the output---comments are @emph{not} discarded by @code{m4}.
1087 Comments cannot be nested, so the first newline after a @samp{#} ends
1088 the comment. The commenting effect of the begin-comment string
1089 can be inhibited by quoting it.
1093 `quoted text' # `commented text'
1094 @result{}quoted text # `commented text'
1095 `quoting inhibits' `#' `comments'
1096 @result{}quoting inhibits # comments
1099 The comment delimiters can be changed to any string at any time, using
1100 the builtin macro @code{changecom}. @xref{Changecom}, for more
1104 @comment Detect regression in 1.4.10b in regards to reparsing comments.
1105 @comment Not worth including in the manual.
1107 define(`e', `$@@')define(`q', ``$@@'')define(`foo', `bar')
1113 @result{}',`#two bar
1115 changecom(`<', `>')define(`n', `$#')
1125 @section Other kinds of input tokens
1127 @cindex tokens, special
1128 Any character, that is neither a part of a name, nor of a quoted string,
1129 nor a comment, is a token by itself. When not in the context of macro
1130 expansion, all of these tokens are just copied to output. However,
1131 during macro expansion, whitespace characters (space, tab, newline,
1132 formfeed, carriage return, vertical tab), parentheses (@samp{(} and
1133 @samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
1134 roles, explained later.
1136 @node Input processing
1137 @section How @code{m4} copies input to output
1139 As @code{m4} reads the input token by token, it will copy each token
1140 directly to the output immediately.
1142 The exception is when it finds a word with a macro definition. In that
1143 case @code{m4} will calculate the macro's expansion, possibly reading
1144 more input to get the arguments. It then inserts the expansion in front
1145 of the remaining input. In other words, the resulting text from a macro
1146 call will be read and parsed into tokens again.
1148 @code{m4} expands a macro as soon as possible. If it finds a macro call
1149 when collecting the arguments to another, it will expand the second call
1150 first. This process continues until there are no more macro calls to
1151 expand and all the input has been consumed.
1153 For a running example, examine how @code{m4} handles this input:
1157 format(`Result is %d', eval(`2**15'))
1161 First, @code{m4} sees that the token @samp{format} is a macro name, so
1162 it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
1163 and @samp{@w{ }}, before encountering another potential macro. Sure
1164 enough, @samp{eval} is a macro name, so the nested argument collection
1165 picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
1166 with the lone argument of @samp{2**15}. The expansion of
1167 @samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
1168 tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
1169 combined with the next @samp{)}, the format macro now has all its
1170 arguments, as if the user had typed:
1174 format(`Result is %d', 32768)
1178 The format macro expands to @samp{Result is 32768}, and we have another
1179 round of scanning for the tokens @samp{Result}, @samp{@w{ }},
1180 @samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
1181 @samp{8}. None of these are macros, so the final output is
1185 @result{}Result is 32768
1188 As a more complicated example, we will contrast an actual code
1189 example from the Gnulib project@footnote{Derived from a patch in
1190 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
1191 and a followup patch in
1192 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
1193 showing both a buggy approach and the desired results. The user desires
1194 to output a shell assignment statement that takes its argument and turns
1195 it into a shell variable by converting it to uppercase and prepending a
1196 prefix. The original attempt looks like this:
1200 define([gl_STRING_MODULE_INDICATOR],
1203 GNULIB_]translit([$1],[a-z],[A-Z])[=1
1205 gl_STRING_MODULE_INDICATOR([strcase])
1207 @result{} GNULIB_strcase=1
1211 Oops -- the argument did not get capitalized. And although the manual
1212 is not able to easily show it, both lines that appear empty actually
1213 contain two trailing spaces. By stepping through the parse, it is easy
1214 to see what happened. First, @code{m4} sees the token
1215 @samp{changequote}, which it recognizes as a macro, followed by
1216 @samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
1217 argument list. The macro expands to the empty string, but changes the
1218 quoting characters to something more useful for generating shell code
1219 (unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
1220 but unbalanced @samp{[]} tend to be rare). Also in the first line,
1221 @code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
1222 macro that consumes the rest of the line, resulting in no output for
1225 The second line starts a macro definition. @code{m4} sees the token
1226 @samp{define}, which it recognizes as a macro, followed by a @samp{(},
1227 @samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}. Because an unquoted
1228 comma was encountered, the first argument is known to be the expansion
1229 of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
1230 Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
1231 whitespace is discarded as part of argument collection. Then comes a
1232 rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
1233 comment@key{NL}@ @ @ @ GNULIB_]}. This is followed by the token
1234 @samp{translit}, which @code{m4} recognizes as a macro name, so a nested
1235 macro expansion has started.
1237 The arguments to the @code{translit} are found by the tokens @samp{(},
1238 @samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
1239 @samp{)}. All three string arguments are expanded (or in other words,
1240 the quotes are stripped), and since neither @samp{$} nor @samp{1} need
1241 capitalization, the result of the macro is @samp{$1}. This expansion is
1242 rescanned, resulting in the two literal characters @samp{$} and
1245 Scanning of the outer macro resumes, and picks up with
1246 @samp{[=1@key{NL}@ @ ]}, and finally @samp{)}. The collected pieces of
1247 expanded text are concatenated, with the end result that the macro
1248 @samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
1249 @samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
1250 Once again, @samp{dnl} is recognized and avoids a newline in the output.
1252 The final line is then parsed, beginning with @samp{ } and @samp{ }
1253 that are output literally. Then @samp{gl_STRING_MODULE_INDICATOR} is
1254 recognized as a macro name, with an argument list of @samp{(},
1255 @samp{[strcase]}, and @samp{)}. Since the definition of the macro
1256 contains the sequence @samp{$1}, that sequence is replaced with the
1257 argument @samp{strcase} prior to starting the rescan. The rescan sees
1258 @samp{@key{NL}} and four spaces, which are output literally, then
1259 @samp{dnl}, which discards the text @samp{ comment@key{NL}}. Next
1260 comes four more spaces, also output literally, and the token
1261 @samp{GNULIB_strcase}, which resulted from the earlier parameter
1262 substitution. Since that is not a macro name, it is output literally,
1263 followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
1264 two more spaces. Finally, the original @samp{@key{NL}} seen after the
1265 macro invocation is scanned and output literally.
1267 Now for a corrected approach. This rearranges the use of newlines and
1268 whitespace so that less whitespace is output (which, although harmless
1269 to shell scripts, can be visually unappealing), and fixes the quoting
1270 issues so that the capitalization occurs when the macro
1271 @samp{gl_STRING_MODULE_INDICATOR} is invoked, rather then when it is
1272 defined. It also adds another layer of quoting to the first argument of
1273 @code{translit}, to ensure that the output will be rescanned as a string
1274 rather than a potential uppercase macro name needing further expansion.
1278 define([gl_STRING_MODULE_INDICATOR],
1280 GNULIB_[]translit([[$1]], [a-z], [A-Z])=1dnl
1282 gl_STRING_MODULE_INDICATOR([strcase])
1283 @result{} GNULIB_STRCASE=1
1286 The parsing of the first line is unchanged. The second line sees the
1287 name of the macro to define, then sees the discarded @samp{@key{NL}}
1288 and two spaces, as before. But this time, the next token is
1289 @samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([[$1]], [a-z],
1290 [A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
1291 @samp{)} to end the macro definition and @samp{dnl} to skip the
1292 newline. No early expansion of @code{translit} occurs, so the entire
1293 string becomes the definition of the macro.
1295 The final line is then parsed, beginning with two spaces that are
1296 output literally, and an invocation of
1297 @code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
1298 Again, the @samp{$1} in the macro definition is substituted prior to
1299 rescanning. Rescanning first encounters @samp{dnl}, and discards
1300 @samp{ comment@key{NL}}. Then two spaces are output literally. Next
1301 comes the token @samp{GNULIB_}, but that is not a macro, so it is
1302 output literally. The token @samp{[]} is an empty string, so it does
1303 not affect output. Then the token @samp{translit} is encountered.
1305 This time, the arguments to @code{translit} are parsed as @samp{(},
1306 @samp{[[strcase]]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
1307 @samp{[A-Z]}, and @samp{)}. The two spaces are discarded, and the
1308 translit results in the desired result @samp{[STRCASE]}. This is
1309 rescanned, but since it is a string, the quotes are stripped and the
1310 only output is a literal @samp{STRCASE}.
1311 Then the scanner sees @samp{=} and @samp{1}, which are output
1312 literally, followed by @samp{dnl} which discards the rest of the
1313 definition of @code{gl_STRING_MODULE_INDICATOR}. The newline at the
1314 end of output is the literal @samp{@key{NL}} that appeared after the
1315 invocation of the macro.
1317 The order in which @code{m4} expands the macros can be further explored
1318 using the trace facilities of GNU @code{m4} (@pxref{Trace}).
1321 @chapter How to invoke macros
1323 This chapter covers macro invocation, macro arguments and how macro
1324 expansion is treated.
1327 * Invocation:: Macro invocation
1328 * Inhibiting Invocation:: Preventing macro invocation
1329 * Macro Arguments:: Macro arguments
1330 * Quoting Arguments:: On Quoting Arguments to macros
1331 * Macro expansion:: Expanding macros
1335 @section Macro invocation
1337 @cindex macro invocation
1338 @cindex invoking macros
1339 Macro invocations has one of the forms
1347 which is a macro invocation without any arguments, or
1351 name(arg1, arg2, @dots{}, arg@var{n})
1355 which is a macro invocation with @var{n} arguments. Macros can have any
1356 number of arguments. All arguments are strings, but different macros
1357 might interpret the arguments in different ways.
1359 The opening parenthesis @emph{must} follow the @var{name} directly, with
1360 no spaces in between. If it does not, the macro is called with no
1363 For a macro call to have no arguments, the parentheses @emph{must} be
1364 left out. The macro call
1372 is a macro call with one argument, which is the empty string, not a call
1375 @node Inhibiting Invocation
1376 @section Preventing macro invocation
1378 An innovation of the @code{m4} language, compared to some of its
1379 predecessors (like Strachey's @code{GPM}, for example), is the ability
1380 to recognize macro calls without resorting to any special, prefixed
1381 invocation character. While generally useful, this feature might
1382 sometimes be the source of spurious, unwanted macro calls. So, GNU
1383 @code{m4} offers several mechanisms or techniques for inhibiting the
1384 recognition of names as macro calls.
1386 @cindex GNU extensions
1388 @cindex macro, blind
1389 First of all, many builtin macros cannot meaningfully be called without
1390 arguments. As a GNU extension, for any of these macros,
1391 whenever an opening parenthesis does not immediately follow their name,
1392 the builtin macro call is not triggered. This solves the most usual
1393 cases, like for @samp{include} or @samp{eval}. Later in this document,
1394 the sentence ``This macro is recognized only with parameters'' refers to
1395 this specific provision of GNU M4, also known as a blind
1396 builtin macro. For the builtins defined by POSIX that bear
1397 this disclaimer, POSIX specifically states that invoking those
1398 builtins without arguments is unspecified, because many other
1399 implementations simply invoke the builtin as though it were given one
1400 empty argument instead.
1410 There is also a command line option (@option{--prefix-builtins}, or
1411 @option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
1412 builtin macros with a prefix of @samp{m4_} at startup. The option has
1413 no effect whatsoever on user defined macros. For example, with this option,
1414 one has to write @code{m4_dnl} and even @code{m4_m4exit}. It also has
1415 no effect on whether a macro requires parameters.
1417 @comment options: -P
1430 Another alternative is to redefine problematic macros to a name less
1431 likely to cause conflicts, @xref{Definitions}.
1433 If your version of GNU @code{m4} has the @code{changeword} feature
1434 compiled in, it offers far more flexibility in specifying the
1435 syntax of macro names, both builtin or user-defined. @xref{Changeword},
1436 for more information on this experimental feature.
1438 Of course, the simplest way to prevent a name from being interpreted
1439 as a call to an existing macro is to quote it. The remainder of
1440 this section studies a little more deeply how quoting affects macro
1441 invocation, and how quoting can be used to inhibit macro invocation.
1443 Even if quoting is usually done over the whole macro name, it can also
1444 be done over only a few characters of this name (provided, of course,
1445 that the unquoted portions are not also a macro). It is also possible
1446 to quote the empty string, but this works only @emph{inside} the name.
1461 all yield the string @samp{divert}. While in both:
1471 the @code{divert} builtin macro will be called, which expands to the
1475 The output of macro evaluations is always rescanned. In the following
1476 example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as
1478 has been given @w{@samp{substr(ab`'cde, `1', `3')}} as input:
1481 define(`cde', `CDE')
1483 define(`x', `substr(ab')
1485 define(`y', `cde, `1', `3')')
1492 @comment Similar, but with argument references, to ensure good test
1495 define(`x1', `len(`$1'')
1497 define(`y1', ``$1')')
1499 x1(`01234567890123456789')y1(`98765432109876543210')
1504 Unquoted strings on either side of a quoted string are subject to
1505 being recognized as macro names. In the following example, quoting the
1506 empty string allows for the second @code{macro} to be recognized as such:
1509 define(`macro', `m')
1517 Quoting may prevent recognizing as a macro name the concatenation of a
1518 macro expansion with the surrounding characters. In this example:
1521 define(`macro', `di$1')
1530 the input will produce the string @samp{divert}. When the quotes were
1531 removed, the @code{divert} builtin was called instead.
1533 @node Macro Arguments
1534 @section Macro arguments
1536 @cindex macros, arguments to
1537 @cindex arguments to macros
1538 When a name is seen, and it has a macro definition, it will be expanded
1541 If the name is followed by an opening parenthesis, the arguments will be
1542 collected before the macro is called. If too few arguments are
1543 supplied, the missing arguments are taken to be the empty string.
1544 However, some builtins are documented to behave differently for a
1545 missing optional argument than for an explicit empty string. If there
1546 are too many arguments, the excess arguments are ignored. Unquoted
1547 leading whitespace is stripped off all arguments, but whitespace
1548 generated by a macro expansion or occurring after a macro that expanded
1549 to an empty string remains intact. Whitespace includes space, tab,
1550 newline, carriage return, vertical tab, and formfeed.
1553 define(`macro', `$1')
1555 macro( unquoted leading space lost)
1556 @result{}unquoted leading space lost
1557 macro(` quoted leading space kept')
1558 @result{} quoted leading space kept
1560 divert `unquoted space kept after expansion')
1561 @result{} unquoted space kept after expansion
1563 ')`whitespace from expansion kept')
1565 @result{}whitespace from expansion kept
1566 macro(`unquoted trailing whitespace kept'
1568 @result{}unquoted trailing whitespace kept
1572 @cindex warnings, suppressing
1573 @cindex suppressing warnings
1574 Normally @code{m4} will issue warnings if a builtin macro is called
1575 with an inappropriate number of arguments, but it can be suppressed with
1576 the @option{--quiet} command line option (or @option{--silent}, or
1577 @option{-Q}, @pxref{Operation modes, , Invoking m4}). For user
1578 defined macros, there is no check of the number of arguments given.
1583 @error{}m4:stdin:1: Warning: too few arguments to builtin `index'
1587 index(`abc', `b', `ignored')
1588 @error{}m4:stdin:3: Warning: excess arguments to builtin `index' ignored
1592 @comment options: -Q
1599 index(`abc', `b', `ignored')
1603 Macros are expanded normally during argument collection, and whatever
1604 commas, quotes and parentheses that might show up in the resulting
1605 expanded text will serve to define the arguments as well. Thus, if
1606 @var{foo} expands to @samp{, b, c}, the macro call
1614 is a macro call with four arguments, which are @samp{a }, @samp{b},
1615 @samp{c} and @samp{d}. To understand why the first argument contains
1616 whitespace, remember that unquoted leading whitespace is never part
1617 of an argument, but trailing whitespace always is.
1619 It is possible for a macro's definition to change during argument
1620 collection, in which case the expansion uses the definition that was in
1621 effect at the time the opening @samp{(} was seen.
1632 It is an error if the end of file occurs while collecting arguments.
1637 @result{}hello world
1640 @error{}m4:stdin:2: ERROR: end of file in argument list
1643 @node Quoting Arguments
1644 @section On Quoting Arguments to macros
1646 @cindex quoted macro arguments
1647 @cindex macros, quoted arguments to
1648 @cindex arguments, quoted macro
1649 Each argument has unquoted leading whitespace removed. Within each
1650 argument, all unquoted parentheses must match. For example, if
1651 @var{foo} is a macro,
1659 is a macro call, with one argument, whose value is @samp{() (() (}.
1660 Commas separate arguments, except when they occur inside quotes,
1661 comments, or unquoted parentheses. @xref{Pseudo Arguments}, for
1664 It is common practice to quote all arguments to macros, unless you are
1665 sure you want the arguments expanded. Thus, in the above
1666 example with the parentheses, the `right' way to do it is like this:
1673 @cindex quoting rule of thumb
1674 @cindex rule of thumb, quoting
1675 It is, however, in certain cases necessary (because nested expansion
1676 must occur to create the arguments for the outer macro) or convenient
1677 (because it uses fewer characters) to leave out quotes for some
1678 arguments, and there is nothing wrong in doing it. It just makes life a
1679 bit harder, if you are not careful to follow a consistent quoting style.
1680 For consistency, this manual follows the rule of thumb that each layer
1681 of parentheses introduces another layer of single quoting, except when
1682 showing the consequences of quoting rules. This is done even when the
1683 quoted string cannot be a macro, such as with integers when you have not
1684 changed the syntax via @code{changeword} (@pxref{Changeword}).
1686 The quoting rule of thumb of one level of quoting per parentheses has a
1687 nice property: when a macro name appears inside parentheses, you can
1688 determine when it will be expanded. If it is not quoted, it will be
1689 expanded prior to the outer macro, so that its expansion becomes the
1690 argument. If it is single-quoted, it will be expanded after the outer
1691 macro. And if it is double-quoted, it will be used as literal text
1692 instead of a macro name.
1695 define(`active', `ACT, IVE')
1697 define(`show', `$1 $1')
1702 @result{}ACT, IVE ACT, IVE
1704 @result{}active active
1707 @node Macro expansion
1708 @section Macro expansion
1710 @cindex macros, expansion of
1711 @cindex expansion of macros
1712 When the arguments, if any, to a macro call have been collected, the
1713 macro is expanded, and the expansion text is pushed back onto the input
1714 (unquoted), and reread. The expansion text from one macro call might
1715 therefore result in more macros being called, if the calls are included,
1716 completely or partially, in the first macro calls' expansion.
1718 Taking a very simple example, if @var{foo} expands to @samp{bar}, and
1719 @var{bar} expands to @samp{Hello}, the input
1721 @comment options: -Dbar=Hello -Dfoo=bar
1723 $ @kbd{m4 -Dbar=Hello -Dfoo=bar}
1729 will expand first to @samp{bar}, and when this is reread and
1730 expanded, into @samp{Hello}.
1733 @comment not worth documenting, but test that the command line can
1734 @comment define macros that take parameters
1736 @comment options: -Dfoo -Decho=$@
1738 $ @kbd{m4 -Dfoo -Decho='$@'}
1741 foo(`silently ignored')
1749 @chapter How to define new macros
1751 @cindex macros, how to define new
1752 @cindex defining new macros
1753 Macros can be defined, redefined and deleted in several different ways.
1754 Also, it is possible to redefine a macro without losing a previous
1755 value, and bring back the original value at a later time.
1758 * Define:: Defining a new macro
1759 * Arguments:: Arguments to macros
1760 * Pseudo Arguments:: Special arguments to macros
1761 * Undefine:: Deleting a macro
1762 * Defn:: Renaming macros
1763 * Pushdef:: Temporarily redefining macros
1765 * Indir:: Indirect call of macros
1766 * Builtin:: Indirect call of builtins
1770 @section Defining a macro
1772 The normal way to define or redefine macros is to use the builtin
1775 @deffn Builtin define (@var{name}, @ovar{expansion})
1776 Defines @var{name} to expand to @var{expansion}. If
1777 @var{expansion} is not given, it is taken to be empty.
1779 The expansion of @code{define} is void.
1780 The macro @code{define} is recognized only with parameters.
1783 The following example defines the macro @var{foo} to expand to the text
1784 @samp{Hello World.}.
1787 define(`foo', `Hello world.')
1790 @result{}Hello world.
1793 The empty line in the output is there because the newline is not
1794 a part of the macro definition, and it is consequently copied to
1795 the output. This can be avoided by use of the macro @code{dnl}.
1796 @xref{Dnl}, for details.
1798 The first argument to @code{define} should be quoted; otherwise, if the
1799 macro is already defined, you will be defining a different macro. This
1800 example shows the problems with underquoting, since we did not want to
1801 redefine @code{one}:
1812 @cindex GNU extensions
1813 GNU @code{m4} normally replaces only the @emph{topmost}
1814 definition of a macro if it has several definitions from @code{pushdef}
1815 (@pxref{Pushdef}). Some other implementations of @code{m4} replace all
1816 definitions of a macro with @code{define}. @xref{Incompatibilities},
1819 As a GNU extension, the first argument to @code{define} does
1820 not have to be a simple word.
1821 It can be any text string, even the empty string. A macro with a
1822 non-standard name cannot be invoked in the normal way, as the name is
1823 not recognized. It can only be referenced by the builtins @code{indir}
1824 (@pxref{Indir}) and @code{defn} (@pxref{Defn}).
1827 Arrays and associative arrays can be simulated by using non-standard
1830 @deffn Composite array (@var{index})
1831 @deffnx Composite array_set (@var{index}, @ovar{value})
1832 Provide access to entries within an array. @code{array} reads the entry
1833 at location @var{index}, and @code{array_set} assigns @var{value} to
1834 location @var{index}.
1838 define(`array', `defn(format(``array[%d]'', `$1'))')
1840 define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
1842 array_set(`4', `array element no. 4')
1844 array_set(`17', `array element no. 17')
1847 @result{}array element no. 4
1848 array(eval(`10 + 7'))
1849 @result{}array element no. 17
1852 Change the @samp{%d} to @samp{%s} and it is an associative array.
1855 @section Arguments to macros
1857 @cindex macros, arguments to
1858 @cindex arguments to macros
1859 Macros can have arguments. The @var{n}th argument is denoted by
1860 @code{$n} in the expansion text, and is replaced by the @var{n}th actual
1861 argument, when the macro is expanded. Replacement of arguments happens
1862 before rescanning, regardless of how many nesting levels of quoting
1863 appear in the expansion. Here is an example of a macro with
1866 @deffn Composite exch (@var{arg1}, @var{arg2})
1867 Expands to @var{arg2} followed by @var{arg1}, effectively exchanging
1872 define(`exch', `$2, $1')
1874 exch(`arg1', `arg2')
1878 This can be used, for example, if you like the arguments to
1879 @code{define} to be reversed.
1882 define(`exch', `$2, $1')
1884 define(exch(``expansion text'', ``macro''))
1887 @result{}expansion text
1890 @xref{Quoting Arguments}, for an explanation of the double quotes.
1891 (You should try and improve this example so that clients of @code{exch}
1892 do not have to double quote; or @pxref{Improved exch, , Answers}).
1894 As a special case, the zeroth argument, @code{$0}, is always the name
1895 of the macro being expanded.
1898 define(`test', ``Macro name: $0'')
1901 @result{}Macro name: test
1904 If you want quoted text to appear as part of the expansion text,
1905 remember that quotes can be nested in quoted strings. Thus, in
1908 define(`foo', `This is macro `foo'.')
1911 @result{}This is macro foo.
1915 The @samp{foo} in the expansion text is @emph{not} expanded, since it is
1916 a quoted string, and not a name.
1918 @cindex GNU extensions
1919 @cindex nine arguments, more than
1920 @cindex more than nine arguments
1921 @cindex arguments, more than nine
1922 @cindex positional parameters, more than nine
1923 GNU @code{m4} allows the number following the @samp{$} to
1924 consist of one or more digits, allowing macros to have any number of
1925 arguments. The extension of accepting multiple digits is incompatible
1926 with POSIX, and is different than traditional implementations
1927 of @code{m4}, which only recognize one digit. Therefore, future
1928 versions of GNU M4 will phase out this feature. To portably
1929 access beyond the ninth argument, you can use the @code{argn} macro
1930 documented later (@pxref{Shift}).
1932 POSIX also states that @samp{$} followed immediately by
1933 @samp{@{} in a macro definition is implementation-defined. This version
1934 of M4 passes the literal characters @samp{$@{} through unchanged, but M4
1935 2.0 will implement an optional feature similar to @command{sh}, where
1936 @samp{$@{11@}} expands to the eleventh argument, to replace the current
1937 recognition of @samp{$11}. Meanwhile, if you want to guarantee that you
1938 will get a literal @samp{$@{} in output when expanding a macro, even
1939 when you upgrade to M4 2.0, you can use nested quoting to your
1943 define(`foo', `single quoted $`'@{1@} output')
1945 define(`bar', ``double quoted $'`@{2@} output'')
1948 @result{}single quoted $@{1@} output
1950 @result{}double quoted $@{2@} output
1953 To help you detect places in your M4 input files that might change in
1954 behavior due to the changed behavior of M4 2.0, you can use the
1955 @option{--warn-macro-sequence} command-line option (@pxref{Operation
1956 modes, , Invoking m4}) with the default regular expression. This will
1957 add a warning any time a macro definition includes @samp{$} followed by
1958 multiple digits, or by @samp{@{}. The warning is not enabled by
1959 default, because it triggers a number of warnings in Autoconf 2.61 (and
1960 Autoconf uses @option{-E} to treat warnings as errors), and because it
1961 will still be possible to restore older behavior in M4 2.0.
1963 @comment options: --warn-macro-sequence
1965 $ @kbd{m4 --warn-macro-sequence}
1966 define(`foo', `$001 $@{1@} $1')
1967 @error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$001'
1968 @error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$@{1@}'
1971 @result{}bar $@{1@} bar
1974 @node Pseudo Arguments
1975 @section Special arguments to macros
1977 @cindex special arguments to macros
1978 @cindex macros, special arguments to
1979 @cindex arguments to macros, special
1980 There is a special notation for the number of actual arguments supplied,
1981 and for all the actual arguments.
1983 The number of actual arguments in a macro call is denoted by @code{$#}
1984 in the expansion text.
1986 @deffn Composite nargs (@dots{})
1987 Expands to a count of the number of arguments supplied.
1991 define(`nargs', `$#')
1997 nargs(`arg1', `arg2', `arg3')
1999 nargs(`commas can be quoted, like this')
2001 nargs(arg1#inside comments, commas do not separate arguments
2004 nargs((unquoted parentheses, like this, group arguments))
2008 Remember that @samp{#} defaults to the comment character; if you forget
2009 quotes to inhibit the comment behavior, your macro definition may not
2010 end where you expected.
2013 dnl Attempt to define a macro to just `$#'
2014 define(underquoted, $#)
2022 The notation @code{$*} can be used in the expansion text to denote all
2023 the actual arguments, unquoted, with commas in between. For example
2026 define(`echo', `$*')
2028 echo(arg1, arg2, arg3 , arg4)
2029 @result{}arg1,arg2,arg3 ,arg4
2032 Often each argument should be quoted, and the notation @code{$@@} handles
2033 that. It is just like @code{$*}, except that it quotes each argument.
2034 A simple example of that is:
2037 define(`echo', `$@@')
2039 echo(arg1, arg2, arg3 , arg4)
2040 @result{}arg1,arg2,arg3 ,arg4
2043 Where did the quotes go? Of course, they were eaten, when the expanded
2044 text were reread by @code{m4}. To show the difference, try
2047 define(`echo1', `$*')
2049 define(`echo2', `$@@')
2051 define(`foo', `This is macro `foo'.')
2054 @result{}This is macro This is macro foo..
2056 @result{}This is macro foo.
2058 @result{}This is macro foo.
2064 @xref{Trace}, if you do not understand this. As another example of the
2065 difference, remember that comments encountered in arguments are passed
2066 untouched to the macro, and that quoting disables comments.
2069 define(`echo1', `$*')
2071 define(`echo2', `$@@')
2073 define(`foo', `bar')
2086 @comment Not worth putting in the manual, but this example is needed for
2087 @comment good test coverage of copying large strings across recursion
2091 define(`echo', `$@@')dnl
2092 echo(echo(`01234567890123456789', `01234567890123456789')
2093 echo(`98765432109876543210', `98765432109876543210'))
2094 @result{}01234567890123456789,01234567890123456789
2095 @result{}98765432109876543210,98765432109876543210
2096 len((echo(`01234567890123456789',
2097 `01234567890123456789')echo(`98765432109876543210',
2098 `98765432109876543210')))
2100 indir(`echo', indir(`echo', `01234567890123456789',
2101 `01234567890123456789')
2102 indir(`echo', `98765432109876543210', `98765432109876543210'))
2103 @result{}01234567890123456789,01234567890123456789
2104 @result{}98765432109876543210,98765432109876543210
2105 define(`argn', `$#')dnl
2106 define(`echo1', `-$@@-')define(`echo2', `,$@@,')dnl
2107 echo1(`1', `2', `3') argn(echo1(`1', `2', `3'))
2109 echo2(`1', `2', `3') argn(echo2(`1', `2', `3'))
2114 A @samp{$} sign in the expansion text, that is not followed by anything
2115 @code{m4} understands, is simply copied to the macro expansion, as any
2119 define(`foo', `$$$ hello $$$')
2122 @result{}$$$ hello $$$
2126 @cindex literal output
2127 @cindex output, literal
2128 If you want a macro to expand to something like @samp{$12}, the
2129 judicious use of nested quoting can put a safe character between the
2130 @code{$} and the next character, relying on the rescanning to remove the
2131 nested quote. This will prevent @code{m4} from interpreting the
2132 @code{$} sign as a reference to an argument.
2135 define(`foo', `no nested quote: $1')
2138 @result{}no nested quote: arg
2139 define(`foo', `nested quote around $: `$'1')
2142 @result{}nested quote around $: $1
2143 define(`foo', `nested empty quote after $: $`'1')
2146 @result{}nested empty quote after $: $1
2147 define(`foo', `nested quote around next character: $`1'')
2150 @result{}nested quote around next character: $1
2151 define(`foo', `nested quote around both: `$1'')
2154 @result{}nested quote around both: arg
2158 @section Deleting a macro
2160 @cindex macros, how to delete
2161 @cindex deleting macros
2162 @cindex undefining macros
2163 A macro definition can be removed with @code{undefine}:
2165 @deffn Builtin undefine (@var{name}@dots{})
2166 For each argument, remove the macro @var{name}. The macro names must
2167 necessarily be quoted, since they will be expanded otherwise.
2169 The expansion of @code{undefine} is void.
2170 The macro @code{undefine} is recognized only with parameters.
2175 @result{}foo bar blah
2176 define(`foo', `some')define(`bar', `other')define(`blah', `text')
2179 @result{}some other text
2183 @result{}foo other text
2184 undefine(`bar', `blah')
2187 @result{}foo bar blah
2190 Undefining a macro inside that macro's expansion is safe; the macro
2191 still expands to the definition that was in effect at the @samp{(}.
2194 define(`f', ``$0':$1')
2196 f(f(f(undefine(`f')`hello world')))
2197 @result{}f:f:f:hello world
2202 It is not an error for @var{name} to have no macro definition. In that
2203 case, @code{undefine} does nothing.
2206 @section Renaming macros
2208 @cindex macros, how to rename
2209 @cindex renaming macros
2210 @cindex macros, displaying definitions
2211 @cindex definitions, displaying macro
2212 It is possible to rename an already defined macro. To do this, you need
2213 the builtin @code{defn}:
2215 @deffn Builtin defn (@var{name}@dots{})
2216 Expands to the @emph{quoted definition} of each @var{name}. If an
2217 argument is not a defined macro, the expansion for that argument is
2220 If @var{name} is a user-defined macro, the quoted definition is simply
2221 the quoted expansion text. If, instead, there is only one @var{name}
2222 and it is a builtin, the
2223 expansion is a special token, which points to the builtin's internal
2224 definition. This token is only meaningful as the second argument to
2225 @code{define} (and @code{pushdef}), and is silently converted to an
2226 empty string in most other contexts. Combining a builtin with anything
2227 else is not supported; a warning is issued and the builtin is omitted
2228 from the final expansion.
2230 The macro @code{defn} is recognized only with parameters.
2233 Its normal use is best understood through an example, which shows how to
2234 rename @code{undefine} to @code{zap}:
2237 define(`zap', defn(`undefine'))
2242 @result{}undefine(zap)
2245 In this way, @code{defn} can be used to copy macro definitions, and also
2246 definitions of builtin macros. Even if the original macro is removed,
2247 the other name can still be used to access the definition.
2249 The fact that macro definitions can be transferred also explains why you
2250 should use @code{$0}, rather than retyping a macro's name in its
2254 define(`foo', `This is `$0'')
2256 define(`bar', defn(`foo'))
2259 @result{}This is bar
2262 Macros used as string variables should be referred through @code{defn},
2263 to avoid unwanted expansion of the text:
2266 define(`string', `The macro dnl is very useful
2270 @result{}The macro@w{ }
2272 @result{}The macro dnl is very useful
2277 However, it is important to remember that @code{m4} rescanning is purely
2278 textual. If an unbalanced end-quote string occurs in a macro
2279 definition, the rescan will see that embedded quote as the termination
2280 of the quoted string, and the remainder of the macro's definition will
2281 be rescanned unquoted. Thus it is a good idea to avoid unbalanced
2282 end-quotes in macro definitions or arguments to macros.
2289 define(`echo', `$@@')
2299 On the other hand, it is possible to exploit the fact that @code{defn}
2300 can concatenate multiple macros prior to the rescanning phase, in order
2301 to join the definitions of macros that, in isolation, have unbalanced
2302 quotes. This is particularly useful when one has used several macros to
2303 accumulate text that M4 should rescan as a whole. In the example below,
2304 note how the use of @code{defn} on @code{l} in isolation opens a string,
2305 which is not closed until the next line; but used on @code{l} and
2306 @code{r} together results in nested quoting.
2309 define(`l', `<[>')define(`r', `<]>')
2311 changequote(`[', `]')
2315 @result{}<[>]defn([r])
2321 @cindex builtins, special tokens
2322 @cindex tokens, builtin macro
2323 Using @code{defn} to generate special tokens for builtin macros outside
2324 of expected contexts can sometimes trigger warnings. But most of the
2325 time, such tokens are silently converted to the empty string.
2331 define(defn(`divnum'), `cannot redefine a builtin token')
2332 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2340 Also note that @code{defn} with multiple arguments can only join text
2341 macros, not builtins, although a future version of GNU M4 may
2342 lift this restriction.
2346 define(`a', `A')define(`AA', `b')
2348 traceon(`defn', `define')
2350 defn(`a', `divnum', `a')
2351 @error{}m4:stdin:3: Warning: cannot concatenate builtin `divnum'
2352 @error{}m4trace: -1- defn(`a', `divnum', `a') -> ``A'`A''
2354 define(`mydivnum', defn(`divnum', `divnum'))mydivnum
2355 @error{}m4:stdin:4: Warning: cannot concatenate builtin `divnum'
2356 @error{}m4:stdin:4: Warning: cannot concatenate builtin `divnum'
2357 @error{}m4trace: -2- defn(`divnum', `divnum')
2358 @error{}m4trace: -1- define(`mydivnum', `')
2360 traceoff(`defn', `define')
2365 @section Temporarily redefining macros
2367 @cindex macros, temporary redefinition of
2368 @cindex temporary redefinition of macros
2369 @cindex redefinition of macros, temporary
2370 @cindex definition stack
2371 @cindex pushdef stack
2372 @cindex stack, macro definition
2373 It is possible to redefine a macro temporarily, reverting to the
2374 previous definition at a later time. This is done with the builtins
2375 @code{pushdef} and @code{popdef}:
2377 @deffn Builtin pushdef (@var{name}, @ovar{expansion})
2378 @deffnx Builtin popdef (@var{name}@dots{})
2379 Analogous to @code{define} and @code{undefine}.
2381 These macros work in a stack-like fashion. A macro is temporarily
2382 redefined with @code{pushdef}, which replaces an existing definition of
2383 @var{name}, while saving the previous definition, before the new one is
2384 installed. If there is no previous definition, @code{pushdef} behaves
2385 exactly like @code{define}.
2387 If a macro has several definitions (of which only one is accessible),
2388 the topmost definition can be removed with @code{popdef}. If there is
2389 no previous definition, @code{popdef} behaves like @code{undefine}.
2391 The expansion of both @code{pushdef} and @code{popdef} is void.
2392 The macros @code{pushdef} and @code{popdef} are recognized only with
2397 define(`foo', `Expansion one.')
2400 @result{}Expansion one.
2401 pushdef(`foo', `Expansion two.')
2404 @result{}Expansion two.
2405 pushdef(`foo', `Expansion three.')
2407 pushdef(`foo', `Expansion four.')
2412 @result{}Expansion three.
2413 popdef(`foo', `foo')
2416 @result{}Expansion one.
2423 If a macro with several definitions is redefined with @code{define}, the
2424 topmost definition is @emph{replaced} with the new definition. If it is
2425 removed with @code{undefine}, @emph{all} the definitions are removed,
2426 and not only the topmost one. However, POSIX allows other
2427 implementations that treat @code{define} as replacing an entire stack
2428 of definitions with a single new definition, so to be portable to other
2429 implementations, it may be worth explicitly using @code{popdef} and
2430 @code{pushdef} rather than relying on the GNU behavior of
2434 define(`foo', `Expansion one.')
2437 @result{}Expansion one.
2438 pushdef(`foo', `Expansion two.')
2441 @result{}Expansion two.
2442 define(`foo', `Second expansion two.')
2445 @result{}Second expansion two.
2452 @cindex local variables
2453 @cindex variables, local
2454 Local variables within macros are made with @code{pushdef} and
2455 @code{popdef}. At the start of the macro a new definition is pushed,
2456 within the macro it is manipulated and at the end it is popped,
2457 revealing the former definition.
2459 It is possible to temporarily redefine a builtin with @code{pushdef}
2463 @section Indirect call of macros
2465 @cindex indirect call of macros
2466 @cindex call of macros, indirect
2467 @cindex macros, indirect call of
2468 @cindex GNU extensions
2469 Any macro can be called indirectly with @code{indir}:
2471 @deffn Builtin indir (@var{name}, @ovar{args@dots{}})
2472 Results in a call to the macro @var{name}, which is passed the
2473 rest of the arguments @var{args}. If @var{name} is not defined, an
2474 error message is printed, and the expansion is void.
2476 The macro @code{indir} is recognized only with parameters.
2479 This can be used to call macros with computed or ``invalid''
2480 names (@code{define} allows such names to be defined):
2483 define(`$$internal$macro', `Internal macro (name `$0')')
2486 @result{}$$internal$macro
2487 indir(`$$internal$macro')
2488 @result{}Internal macro (name $$internal$macro)
2491 The point is, here, that larger macro packages can have private macros
2492 defined, that will not be called by accident. They can @emph{only} be
2493 called through the builtin @code{indir}.
2495 One other point to observe is that argument collection occurs before
2496 @code{indir} invokes @var{name}, so if argument collection changes the
2497 value of @var{name}, that will be reflected in the final expansion.
2498 This is different than the behavior when invoking macros directly,
2499 where the definition that was in effect before argument collection is
2508 indir(`f', define(`f', `3'))
2510 indir(`f', undefine(`f'))
2511 @error{}m4:stdin:4: undefined macro `f'
2515 When handed the result of @code{defn} (@pxref{Defn}) as one of its
2516 arguments, @code{indir} defers to the invoked @var{name} for whether a
2517 token representing a builtin is recognized or flattened to the empty
2522 indir(defn(`defn'), `divnum')
2523 @error{}m4:stdin:1: Warning: indir: invalid macro name ignored
2525 indir(`define', defn(`defn'), `divnum')
2526 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2528 indir(`define', `foo', defn(`divnum'))
2532 indir(`divert', defn(`foo'))
2533 @error{}m4:stdin:5: empty string treated as 0 in builtin `divert'
2538 @section Indirect call of builtins
2540 @cindex indirect call of builtins
2541 @cindex call of builtins, indirect
2542 @cindex builtins, indirect call of
2543 @cindex GNU extensions
2544 Builtin macros can be called indirectly with @code{builtin}:
2546 @deffn Builtin builtin (@var{name}, @ovar{args@dots{}})
2547 Results in a call to the builtin @var{name}, which is passed the
2548 rest of the arguments @var{args}. If @var{name} does not name a
2549 builtin, an error message is printed, and the expansion is void.
2551 The macro @code{builtin} is recognized only with parameters.
2554 This can be used even if @var{name} has been given another definition
2555 that has covered the original, or been undefined so that no macro
2556 maps to the builtin.
2559 pushdef(`define', `hidden')
2561 undefine(`undefine')
2563 define(`foo', `bar')
2567 builtin(`define', `foo', defn(`divnum'))
2571 builtin(`define', `foo', `BAR')
2576 @result{}undefine(foo)
2579 builtin(`undefine', `foo')
2585 The @var{name} argument only matches the original name of the builtin,
2586 even when the @option{--prefix-builtins} option (or @option{-P},
2587 @pxref{Operation modes, , Invoking m4}) is in effect. This is different
2588 from @code{indir}, which only tracks current macro names.
2590 @comment options: -P
2593 m4_builtin(`divnum')
2595 m4_builtin(`m4_divnum')
2596 @error{}m4:stdin:2: undefined builtin `m4_divnum'
2599 @error{}m4:stdin:3: undefined macro `divnum'
2601 m4_indir(`m4_divnum')
2605 Note that @code{indir} and @code{builtin} can be used to invoke builtins
2606 without arguments, even when they normally require parameters to be
2607 recognized; but it will provoke a warning, and result in a void expansion.
2613 @error{}m4:stdin:2: undefined builtin `'
2616 @error{}m4:stdin:3: Warning: too few arguments to builtin `builtin'
2619 @error{}m4:stdin:4: undefined builtin `'
2621 builtin(`builtin', ``'
2623 @error{}m4:stdin:5: undefined builtin ``'
2627 @error{}m4:stdin:7: Warning: too few arguments to builtin `index'
2632 @comment This example is not worth putting in the manual, but it is
2633 @comment needed for full coverage. Autoconf's m4_include relies heavily
2634 @comment on this feature.
2637 builtin(`include', `foo')dnl
2641 @comment And this example triggers a regression present in 1.4.10b.
2644 define(`s', `builtin(`shift', $@@)')dnl
2645 define(`loop', `ifelse(`$2', `', `-', `$1$2: $0(`$1', s(s($@@)))')')dnl
2652 loop(`1', `2', `3', `4')
2653 @result{}12: 13: 14: -
2654 loop(`1', `2', `3', `4', `5')
2655 @result{}12: 13: 14: 15: -
2660 @chapter Conditionals, loops, and recursion
2662 Macros, expanding to plain text, perhaps with arguments, are not quite
2663 enough. We would like to have macros expand to different things, based
2664 on decisions taken at run-time. For that, we need some kind of conditionals.
2665 Also, we would like to have some kind of loop construct, so we could do
2666 something a number of times, or while some condition is true.
2669 * Ifdef:: Testing if a macro is defined
2670 * Ifelse:: If-else construct, or multibranch
2671 * Shift:: Recursion in @code{m4}
2672 * Forloop:: Iteration by counting
2673 * Foreach:: Iteration by list contents
2674 * Stacks:: Working with definition stacks
2675 * Composition:: Building macros with macros
2679 @section Testing if a macro is defined
2681 @cindex conditionals
2682 There are two different builtin conditionals in @code{m4}. The first is
2685 @deffn Builtin ifdef (@var{name}, @var{string-1}, @ovar{string-2})
2686 If @var{name} is defined as a macro, @code{ifdef} expands to
2687 @var{string-1}, otherwise to @var{string-2}. If @var{string-2} is
2688 omitted, it is taken to be the empty string (according to the normal
2691 The macro @code{ifdef} is recognized only with parameters.
2695 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
2696 @result{}foo is not defined
2699 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
2700 @result{}foo is defined
2701 ifdef(`no_such_macro', `yes', `no', `extra argument')
2702 @error{}m4:stdin:4: Warning: excess arguments to builtin `ifdef' ignored
2707 @section If-else construct, or multibranch
2709 @cindex comparing strings
2710 @cindex discarding input
2711 @cindex input, discarding
2712 The other conditional, @code{ifelse}, is much more powerful. It can be
2713 used as a way to introduce a long comment, as an if-else construct, or
2714 as a multibranch, depending on the number of arguments supplied:
2716 @deffn Builtin ifelse (@var{comment})
2717 @deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
2719 @deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
2720 @var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal})
2721 Used with only one argument, the @code{ifelse} simply discards it and
2724 If called with three or four arguments, @code{ifelse} expands into
2725 @var{equal}, if @var{string-1} and @var{string-2} are equal (character
2726 for character), otherwise it expands to @var{not-equal}. A final fifth
2727 argument is ignored, after triggering a warning.
2729 If called with six or more arguments, and @var{string-1} and
2730 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
2731 otherwise the first three arguments are discarded and the processing
2734 The macro @code{ifelse} is recognized only with parameters.
2737 Using only one argument is a common @code{m4} idiom for introducing a
2738 block comment, as an alternative to repeatedly using @code{dnl}. This
2739 special usage is recognized by GNU @code{m4}, so that in this
2740 case, the warning about missing arguments is never triggered.
2743 ifelse(`some comments')
2745 ifelse(`foo', `bar')
2746 @error{}m4:stdin:2: Warning: too few arguments to builtin `ifelse'
2750 Using three or four arguments provides decision points.
2753 ifelse(`foo', `bar', `true')
2755 ifelse(`foo', `foo', `true')
2757 define(`foo', `bar')
2759 ifelse(foo, `bar', `true', `false')
2761 ifelse(foo, `foo', `true', `false')
2765 @cindex macro, blind
2767 Notice how the first argument was used unquoted; it is common to compare
2768 the expansion of a macro with a string. With this macro, you can now
2769 reproduce the behavior of blind builtins, where the macro is recognized
2770 only with arguments.
2773 define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
2778 @result{}arguments:1
2780 @result{}arguments:3
2783 For an example of a way to make defining blind macros easier, see
2786 @cindex multibranches
2787 @cindex switch statement
2788 @cindex case statement
2789 The macro @code{ifelse} can take more than four arguments. If given more
2790 than four arguments, @code{ifelse} works like a @code{case} or @code{switch}
2791 statement in traditional programming languages. If @var{string-1} and
2792 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1}, otherwise
2793 the procedure is repeated with the first three arguments discarded. This
2794 calls for an example:
2797 ifelse(`foo', `bar', `third', `gnu', `gnats')
2798 @error{}m4:stdin:1: Warning: excess arguments to builtin `ifelse' ignored
2800 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
2802 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
2804 ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
2805 @error{}m4:stdin:4: Warning: excess arguments to builtin `ifelse' ignored
2810 @comment Stress tests, not worth documenting.
2812 @comment Ensure that references compared to strings work regardless of
2813 @comment similar prefixes.
2815 define(`e', `$@@')define(`long', `01234567890123456789')
2817 ifelse(long, `01234567890123456789', `yes', `no')
2819 ifelse(`01234567890123456789', long, `yes', `no')
2821 ifelse(long, `01234567890123456789-', `yes', `no')
2823 ifelse(`01234567890123456789-', long, `yes', `no')
2825 ifelse(e(long), `01234567890123456789', `yes', `no')
2827 ifelse(`01234567890123456789', e(long), `yes', `no')
2829 ifelse(e(long), `01234567890123456789-', `yes', `no')
2831 ifelse(`01234567890123456789-', e(long), `yes', `no')
2833 ifelse(-e(long), `-01234567890123456789', `yes', `no')
2835 ifelse(-`01234567890123456789', -e(long), `yes', `no')
2837 ifelse(-e(long), `-01234567890123456789-', `yes', `no')
2839 ifelse(`-01234567890123456789-', -e(long), `yes', `no')
2841 ifelse(-e(long)-, `-01234567890123456789-', `yes', `no')
2843 ifelse(-`01234567890123456789-', -e(long)-, `yes', `no')
2845 ifelse(-e(long)-, `-01234567890123456789', `yes', `no')
2847 ifelse(`-01234567890123456789', -e(long)-, `yes', `no')
2849 ifelse(`-'e(long), `-01234567890123456789', `yes', `no')
2851 ifelse(-`01234567890123456789', `-'e(long), `yes', `no')
2853 ifelse(`-'e(long), `-01234567890123456789-', `yes', `no')
2855 ifelse(`-01234567890123456789-', `-'e(long), `yes', `no')
2857 ifelse(`-'e(long)`-', `-01234567890123456789-', `yes', `no')
2859 ifelse(-`01234567890123456789-', `-'e(long)`-', `yes', `no')
2861 ifelse(`-'e(long)`-', `-01234567890123456789', `yes', `no')
2863 ifelse(`-01234567890123456789', `-'e(long)`-', `yes', `no')
2868 Naturally, the normal case will be slightly more advanced than these
2869 examples. A common use of @code{ifelse} is in macros implementing loops
2873 @section Recursion in @code{m4}
2875 @cindex recursive macros
2876 @cindex macros, recursive
2877 There is no direct support for loops in @code{m4}, but macros can be
2878 recursive. There is no limit on the number of recursion levels, other
2879 than those enforced by your hardware and operating system.
2882 Loops can be programmed using recursion and the conditionals described
2885 There is a builtin macro, @code{shift}, which can, among other things,
2886 be used for iterating through the actual arguments to a macro:
2888 @deffn Builtin shift (@var{arg1}, @dots{})
2889 Takes any number of arguments, and expands to all its arguments except
2890 @var{arg1}, separated by commas, with each argument quoted.
2892 The macro @code{shift} is recognized only with parameters.
2900 shift(`foo', `bar', `baz')
2904 An example of the use of @code{shift} is this macro:
2906 @cindex reversing arguments
2907 @cindex arguments, reversing
2908 @deffn Composite reverse (@dots{})
2909 Takes any number of arguments, and reverses their order.
2912 It is implemented as:
2915 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
2916 `reverse(shift($@@)), `$1'')')
2922 reverse(`foo', `bar', `gnats', `and gnus')
2923 @result{}and gnus, gnats, bar, foo
2926 While not a very interesting macro, it does show how simple loops can be
2927 made with @code{shift}, @code{ifelse} and recursion. It also shows
2928 that @code{shift} is usually used with @samp{$@@}. Another example of
2929 this is an implementation of a short-circuiting conditional operator.
2931 @cindex short-circuiting conditional
2932 @cindex conditional, short-circuiting
2933 @deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @
2934 @ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal})
2935 Similar to @code{ifelse}, where an equal comparison between the first
2936 two strings results in the third, otherwise the first three arguments
2937 are discarded and the process repeats. The difference is that each
2938 @var{test-<n>} is expanded only when it is encountered. This means that
2939 every third argument to @code{cond} is normally given one more level of
2940 quoting than the corresponding argument to @code{ifelse}.
2943 Here is the implementation of @code{cond}, along with a demonstration of
2944 how it can short-circuit the side effects in @code{side}. Notice how
2945 all the unquoted side effects happen regardless of how many comparisons
2946 are made with @code{ifelse}, compared with only the relevant effects
2951 `ifelse(`$#', `1', `$1',
2952 `ifelse($1, `$2', `$3',
2953 `$0(shift(shift(shift($@@))))')')')dnl
2954 define(`side', `define(`counter', incr(counter))$1')dnl
2956 `define(`counter', `0')dnl
2957 ifelse(side(`$1'), `yes', `one comparison: ',
2958 side(`$1'), `no', `two comparisons: ',
2959 side(`$1'), `maybe', `three comparisons: ',
2960 `side(`default answer: ')')counter')dnl
2962 `define(`counter', `0')dnl
2963 cond(`side(`$1')', `yes', `one comparison: ',
2964 `side(`$1')', `no', `two comparisons: ',
2965 `side(`$1')', `maybe', `three comparisons: ',
2966 `side(`default answer: ')')counter')dnl
2968 @result{}one comparison: 3
2970 @result{}two comparisons: 3
2972 @result{}three comparisons: 3
2973 example1(`feeling rather indecisive today')
2974 @result{}default answer: 4
2976 @result{}one comparison: 1
2978 @result{}two comparisons: 2
2980 @result{}three comparisons: 3
2981 example2(`feeling rather indecisive today')
2982 @result{}default answer: 4
2985 @cindex joining arguments
2986 @cindex arguments, joining
2987 @cindex concatenating arguments
2988 Another common task that requires iteration is joining a list of
2989 arguments into a single string.
2991 @deffn Composite join (@ovar{separator}, @ovar{args@dots{}})
2992 @deffnx Composite joinall (@ovar{separator}, @ovar{args@dots{}})
2993 Generate a single-quoted string, consisting of each @var{arg} separated
2994 by @var{separator}. While @code{joinall} always outputs a
2995 @var{separator} between arguments, @code{join} avoids the
2996 @var{separator} for an empty @var{arg}.
2999 Here are some examples of its usage, based on the implementation
3000 @file{m4-@value{VERSION}/@/examples/@/join.m4} distributed in this
3005 $ @kbd{m4 -I examples}
3008 join,join(`-'),join(`-', `'),join(`-', `', `')
3010 joinall,joinall(`-'),joinall(`-', `'),joinall(`-', `', `')
3014 join(`-', `1', `2', `3')
3016 join(`', `1', `2', `3')
3018 join(`-', `', `1', `', `', `2', `')
3020 joinall(`-', `', `1', `', `', `2', `')
3022 join(`,', `1', `2', `3')
3024 define(`nargs', `$#')dnl
3025 nargs(join(`,', `1', `2', `3'))
3029 Examining the implementation shows some interesting points about several
3030 m4 programming idioms.
3034 $ @kbd{m4 -I examples}
3035 undivert(`join.m4')dnl
3036 @result{}divert(`-1')
3037 @result{}# join(sep, args) - join each non-empty ARG into a single
3038 @result{}# string, with each element separated by SEP
3039 @result{}define(`join',
3040 @result{}`ifelse(`$#', `2', ``$2'',
3041 @result{} `ifelse(`$2', `', `', ``$2'_')$0(`$1', shift(shift($@@)))')')
3042 @result{}define(`_join',
3043 @result{}`ifelse(`$#$2', `2', `',
3044 @result{} `ifelse(`$2', `', `', ``$1$2'')$0(`$1', shift(shift($@@)))')')
3045 @result{}# joinall(sep, args) - join each ARG, including empty ones,
3046 @result{}# into a single string, with each element separated by SEP
3047 @result{}define(`joinall', ``$2'_$0(`$1', shift($@@))')
3048 @result{}define(`_joinall',
3049 @result{}`ifelse(`$#', `2', `', ``$1$3'$0(`$1', shift(shift($@@)))')')
3050 @result{}divert`'dnl
3053 First, notice that this implementation creates helper macros
3054 @code{_join} and @code{_joinall}. This division of labor makes it
3055 easier to output the correct number of @var{separator} instances:
3056 @code{join} and @code{joinall} are responsible for the first argument,
3057 without a separator, while @code{_join} and @code{_joinall} are
3058 responsible for all remaining arguments, always outputting a separator
3059 when outputting an argument.
3061 Next, observe how @code{join} decides to iterate to itself, because the
3062 first @var{arg} was empty, or to output the argument and swap over to
3063 @code{_join}. If the argument is non-empty, then the nested
3064 @code{ifelse} results in an unquoted @samp{_}, which is concatenated
3065 with the @samp{$0} to form the next macro name to invoke. The
3066 @code{joinall} implementation is simpler since it does not have to
3067 suppress empty @var{arg}; it always executes once then defers to
3070 Another important idiom is the idea that @var{separator} is reused for
3071 each iteration. Each iteration has one less argument, but rather than
3072 discarding @samp{$1} by iterating with @code{$0(shift($@@))}, the macro
3073 discards @samp{$2} by using @code{$0(`$1', shift(shift($@@)))}.
3075 Next, notice that it is possible to compare more than one condition in a
3076 single @code{ifelse} test. The test of @samp{$#$2} against @samp{2}
3077 allows @code{_join} to iterate for two separate reasons---either there
3078 are still more than two arguments, or there are exactly two arguments
3079 but the last argument is not empty.
3081 Finally, notice that these macros require exactly two arguments to
3082 terminate recursion, but that they still correctly result in empty
3083 output when given no @var{args} (i.e., zero or one macro argument). On
3084 the first pass when there are too few arguments, the @code{shift}
3085 results in no output, but leaves an empty string to serve as the
3086 required second argument for the second pass. Put another way,
3087 @samp{`$1', shift($@@)} is not the same as @samp{$@@}, since only the
3088 former guarantees at least two arguments.
3090 @cindex quote manipulation
3091 @cindex manipulating quotes
3092 Sometimes, a recursive algorithm requires adding quotes to each element,
3093 or treating multiple arguments as a single element:
3095 @deffn Composite quote (@dots{})
3096 @deffnx Composite dquote (@dots{})
3097 @deffnx Composite dquote_elt (@dots{})
3098 Takes any number of arguments, and adds quoting. With @code{quote},
3099 only one level of quoting is added, effectively removing whitespace
3100 after commas and turning multiple arguments into a single string. With
3101 @code{dquote}, two levels of quoting are added, one around each element,
3102 and one around the list. And with @code{dquote_elt}, two levels of
3103 quoting are added around each element.
3106 An actual implementation of these three macros is distributed as
3107 @file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package. First,
3108 let's examine their usage:
3112 $ @kbd{m4 -I examples}
3115 -quote-dquote-dquote_elt-
3117 -quote()-dquote()-dquote_elt()-
3119 -quote(`1')-dquote(`1')-dquote_elt(`1')-
3120 @result{}-1-`1'-`1'-
3121 -quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
3122 @result{}-1,2-`1',`2'-`1',`2'-
3123 define(`n', `$#')dnl
3124 -n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
3126 dquote(dquote_elt(`1', `2'))
3127 @result{}``1'',``2''
3128 dquote_elt(dquote(`1', `2'))
3132 The last two lines show that when given two arguments, @code{dquote}
3133 results in one string, while @code{dquote_elt} results in two. Now,
3134 examine the implementation. Note that @code{quote} and
3135 @code{dquote_elt} make decisions based on their number of arguments, so
3136 that when called without arguments, they result in nothing instead of a
3137 quoted empty string; this is so that it is possible to distinguish
3138 between no arguments and an empty first argument. @code{dquote}, on the
3139 other hand, results in a string no matter what, since it is still
3140 possible to tell whether it was invoked without arguments based on the
3145 $ @kbd{m4 -I examples}
3146 undivert(`quote.m4')dnl
3147 @result{}divert(`-1')
3148 @result{}# quote(args) - convert args to single-quoted string
3149 @result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
3150 @result{}# dquote(args) - convert args to quoted list of quoted strings
3151 @result{}define(`dquote', ``$@@'')
3152 @result{}# dquote_elt(args) - convert args to list of double-quoted strings
3153 @result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
3154 @result{} ```$1'',$0(shift($@@))')')
3155 @result{}divert`'dnl
3158 It is worth pointing out that @samp{quote(@var{args})} is more efficient
3159 than @samp{joinall(`,', @var{args})} for producing the same output.
3161 @cindex nine arguments, more than
3162 @cindex more than nine arguments
3163 @cindex arguments, more than nine
3164 One more useful macro based on @code{shift} allows portably selecting
3165 an arbitrary argument (usually greater than the ninth argument), without
3166 relying on the GNU extension of multi-digit arguments
3167 (@pxref{Arguments}).
3169 @deffn Composite argn (@var{n}, @dots{})
3170 Expands to argument @var{n} out of the remaining arguments. @var{n}
3171 must be a positive number. Usually invoked as
3172 @samp{argn(`@var{n}',$@@)}.
3175 It is implemented as:
3178 define(`argn', `ifelse(`$1', 1, ``$2'',
3179 `argn(decr(`$1'), shift(shift($@@)))')')
3183 define(`foo', `argn(`11', $@@)')
3185 foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
3190 @section Iteration by counting
3193 @cindex loops, counting
3194 @cindex counting loops
3195 Here is an example of a loop macro that implements a simple for loop.
3197 @deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
3198 Takes the name in @var{iterator}, which must be a valid macro name, and
3199 successively assign it each integer value from @var{start} to @var{end},
3200 inclusive. For each assignment to @var{iterator}, append @var{text} to
3201 the expansion of the @code{forloop}. @var{text} may refer to
3202 @var{iterator}. Any definition of @var{iterator} prior to this
3203 invocation is restored.
3206 It can, for example, be used for simple counting:
3210 $ @kbd{m4 -I examples}
3211 include(`forloop.m4')
3213 forloop(`i', `1', `8', `i ')
3214 @result{}1 2 3 4 5 6 7 8@w{ }
3217 For-loops can be nested, like:
3221 $ @kbd{m4 -I examples}
3222 include(`forloop.m4')
3224 forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
3226 @result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
3227 @result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
3228 @result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
3229 @result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
3233 The implementation of the @code{forloop} macro is fairly
3234 straightforward. The @code{forloop} macro itself is simply a wrapper,
3235 which saves the previous definition of the first argument, calls the
3236 internal macro @code{@w{_forloop}}, and re-establishes the saved
3237 definition of the first argument.
3239 The macro @code{@w{_forloop}} expands the fourth argument once, and
3240 tests to see if the iterator has reached the final value. If it has
3241 not finished, it increments the iterator (using the predefined macro
3242 @code{incr}, @pxref{Incr}), and recurses.
3244 Here is an actual implementation of @code{forloop}, distributed as
3245 @file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
3249 $ @kbd{m4 -I examples}
3250 undivert(`forloop.m4')dnl
3251 @result{}divert(`-1')
3252 @result{}# forloop(var, from, to, stmt) - simple version
3253 @result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
3254 @result{}define(`_forloop',
3255 @result{} `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
3256 @result{}divert`'dnl
3259 Notice the careful use of quotes. Certain macro arguments are left
3260 unquoted, each for its own reason. Try to find out @emph{why} these
3261 arguments are left unquoted, and see what happens if they are quoted.
3262 (As presented, these two macros are useful but not very robust for
3263 general use. They lack even basic error handling for cases like
3264 @var{start} less than @var{end}, @var{end} not numeric, or
3265 @var{iterator} not being a macro name. See if you can improve these
3266 macros; or @pxref{Improved forloop, , Answers}).
3269 @section Iteration by list contents
3271 @cindex for each loops
3272 @cindex loops, list iteration
3273 @cindex iterating over lists
3274 Here is an example of a loop macro that implements list iteration.
3276 @deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
3277 @deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
3278 Takes the name in @var{iterator}, which must be a valid macro name, and
3279 successively assign it each value from @var{paren-list} or
3280 @var{quote-list}. In @code{foreach}, @var{paren-list} is a
3281 comma-separated list of elements contained in parentheses. In
3282 @code{foreachq}, @var{quote-list} is a comma-separated list of elements
3283 contained in a quoted string. For each assignment to @var{iterator},
3284 append @var{text} to the overall expansion. @var{text} may refer to
3285 @var{iterator}. Any definition of @var{iterator} prior to this
3286 invocation is restored.
3289 As an example, this displays each word in a list inside of a sentence,
3290 using an implementation of @code{foreach} distributed as
3291 @file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
3292 in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
3296 $ @kbd{m4 -I examples}
3297 include(`foreach.m4')
3299 foreach(`x', (foo, bar, foobar), `Word was: x
3301 @result{}Word was: foo
3302 @result{}Word was: bar
3303 @result{}Word was: foobar
3304 include(`foreachq.m4')
3306 foreachq(`x', `foo, bar, foobar', `Word was: x
3308 @result{}Word was: foo
3309 @result{}Word was: bar
3310 @result{}Word was: foobar
3313 It is possible to be more complex; each element of the @var{paren-list}
3314 or @var{quote-list} can itself be a list, to pass as further arguments
3315 to a helper macro. This example generates a shell case statement:
3319 $ @kbd{m4 -I examples}
3320 include(`foreach.m4')
3322 define(`_case', ` $1)
3325 define(`_cat', `$1$2')dnl
3328 foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
3329 `_cat(`_case', x)')dnl
3331 @result{} vara=" a";;
3333 @result{} varb=" b";;
3335 @result{} varc=" c";;
3340 The implementation of the @code{foreach} macro is a bit more involved;
3341 it is a wrapper around two helper macros. First, @code{@w{_arg1}} is
3342 needed to grab the first element of a list. Second,
3343 @code{@w{_foreach}} implements the recursion, successively walking
3344 through the original list. Here is a simple implementation of
3349 $ @kbd{m4 -I examples}
3350 undivert(`foreach.m4')dnl
3351 @result{}divert(`-1')
3352 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
3353 @result{}# parenthesized list, simple version
3354 @result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
3355 @result{}define(`_arg1', `$1')
3356 @result{}define(`_foreach', `ifelse(`$2', `()', `',
3357 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
3358 @result{}divert`'dnl
3361 Unfortunately, that implementation is not robust to macro names as list
3362 elements. Each iteration of @code{@w{_foreach}} is stripping another
3363 layer of quotes, leading to erratic results if list elements are not
3364 already fully expanded. The first cut at implementing @code{foreachq}
3365 takes this into account. Also, when using quoted elements in a
3366 @var{paren-list}, the overall list must be quoted. A @var{quote-list}
3367 has the nice property of requiring fewer characters to create a list
3368 containing the same quoted elements. To see the difference between the
3369 two macros, we attempt to pass double-quoted macro names in a list,
3370 expecting the macro name on output after one layer of quotes is removed
3371 during list iteration and the final layer removed during the final
3376 $ @kbd{m4 -I examples}
3377 define(`a', `1')define(`b', `2')define(`c', `3')
3379 include(`foreach.m4')
3381 include(`foreachq.m4')
3383 foreach(`x', `(``a'', ``(b'', ``c)'')', `x
3390 foreachq(`x', ```a'', ``(b'', ``c)''', `x
3397 Obviously, @code{foreachq} did a better job; here is its implementation:
3401 $ @kbd{m4 -I examples}
3402 undivert(`foreachq.m4')dnl
3403 @result{}include(`quote.m4')dnl
3404 @result{}divert(`-1')
3405 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
3406 @result{}# quoted list, simple version
3407 @result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
3408 @result{}define(`_arg1', `$1')
3409 @result{}define(`_foreachq', `ifelse(quote($2), `', `',
3410 @result{} `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
3411 @result{}divert`'dnl
3414 Notice that @code{@w{_foreachq}} had to use the helper macro
3415 @code{quote} defined earlier (@pxref{Shift}), to ensure that the
3416 embedded @code{ifelse} call does not go haywire if a list element
3417 contains a comma. Unfortunately, this implementation of @code{foreachq}
3418 has its own severe flaw. Whereas the @code{foreach} implementation was
3419 linear, this macro is quadratic in the number of list elements, and is
3420 much more likely to trip up the limit set by the command line option
3421 @option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
3422 Invoking m4}). Additionally, this implementation does not expand
3423 @samp{defn(`@var{iterator}')} very well, when compared with
3428 $ @kbd{m4 -I examples}
3429 include(`foreach.m4')include(`foreachq.m4')
3431 foreach(`name', `(`a', `b')', ` defn(`name')')
3433 foreachq(`name', ``a', `b'', ` defn(`name')')
3434 @result{} _arg1(`a', `b') _arg1(shift(`a', `b'))
3437 It is possible to have robust iteration with linear behavior and sane
3438 @var{iterator} contents for either list style. See if you can learn
3439 from the best elements of both of these implementations to create robust
3440 macros (or @pxref{Improved foreach, , Answers}).
3443 @section Working with definition stacks
3445 @cindex definition stack
3446 @cindex pushdef stack
3447 @cindex stack, macro definition
3448 Thanks to @code{pushdef}, manipulation of a stack is an intrinsic
3449 operation in @code{m4}. Normally, only the topmost definition in a
3450 stack is important, but sometimes, it is desirable to manipulate the
3451 entire definition stack.
3453 @deffn Composite stack_foreach (@var{macro}, @var{action})
3454 @deffnx Composite stack_foreach_lifo (@var{macro}, @var{action})
3455 For each of the @code{pushdef} definitions associated with @var{macro},
3456 invoke the macro @var{action} with a single argument of that definition.
3457 @code{stack_foreach} visits the oldest definition first, while
3458 @code{stack_foreach_lifo} visits the current definition first.
3459 @var{action} should not modify or dereference @var{macro}. There are a
3460 few special macros, such as @code{defn}, which cannot be used as the
3461 @var{macro} parameter.
3464 A sample implementation of these macros is distributed in the file
3465 @file{m4-@value{VERSION}/@/examples/@/stack.m4}.
3469 $ @kbd{m4 -I examples}
3472 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3474 define(`show', ``$1'
3477 stack_foreach(`a', `show')dnl
3481 stack_foreach_lifo(`a', `show')dnl
3487 Now for the implementation. Note the definition of a helper macro,
3488 @code{_stack_reverse}, which destructively swaps the contents of one
3489 stack of definitions into the reverse order in the temporary macro
3490 @samp{tmp-$1}. By calling the helper twice, the original order is
3491 restored back into the macro @samp{$1}; since the operation is
3492 destructive, this explains why @samp{$1} must not be modified or
3493 dereferenced during the traversal. The caller can then inject
3494 additional code to pass the definition currently being visited to
3495 @samp{$2}. The choice of helper names is intentional; since @samp{-} is
3496 not valid as part of a macro name, there is no risk of conflict with a
3497 valid macro name, and the code is guaranteed to use @code{defn} where
3498 necessary. Finally, note that any macro used in the traversal of a
3499 @code{pushdef} stack, such as @code{pushdef} or @code{defn}, cannot be
3500 handled by @code{stack_foreach}, since the macro would temporarily be
3501 undefined during the algorithm.
3505 $ @kbd{m4 -I examples}
3506 undivert(`stack.m4')dnl
3507 @result{}divert(`-1')
3508 @result{}# stack_foreach(macro, action)
3509 @result{}# Invoke ACTION with a single argument of each definition
3510 @result{}# from the definition stack of MACRO, starting with the oldest.
3511 @result{}define(`stack_foreach',
3512 @result{}`_stack_reverse(`$1', `tmp-$1')'dnl
3513 @result{}`_stack_reverse(`tmp-$1', `$1', `$2(defn(`$1'))')')
3514 @result{}# stack_foreach_lifo(macro, action)
3515 @result{}# Invoke ACTION with a single argument of each definition
3516 @result{}# from the definition stack of MACRO, starting with the newest.
3517 @result{}define(`stack_foreach_lifo',
3518 @result{}`_stack_reverse(`$1', `tmp-$1', `$2(defn(`$1'))')'dnl
3519 @result{}`_stack_reverse(`tmp-$1', `$1')')
3520 @result{}define(`_stack_reverse',
3521 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0($@@)')')
3522 @result{}divert`'dnl
3526 @section Building macros with macros
3528 @cindex macro composition
3529 @cindex composing macros
3530 Since m4 is a macro language, it is possible to write macros that
3531 can build other macros. First on the list is a way to automate the
3532 creation of blind macros.
3534 @cindex macro, blind
3536 @deffn Composite define_blind (@var{name}, @ovar{value})
3537 Defines @var{name} as a blind macro, such that @var{name} will expand to
3538 @var{value} only when given explicit arguments. @var{value} should not
3539 be the result of @code{defn} (@pxref{Defn}). This macro is only
3540 recognized with parameters, and results in an empty string.
3543 Defining a macro to define another macro can be a bit tricky. We want
3544 to use a literal @samp{$#} in the argument to the nested @code{define}.
3545 However, if @samp{$} and @samp{#} are adjacent in the definition of
3546 @code{define_blind}, then it would be expanded as the number of
3547 arguments to @code{define_blind} rather than the intended number of
3548 arguments to @var{name}. The solution is to pass the difficult
3549 characters through extra arguments to a helper macro
3550 @code{_define_blind}. When composing macros, it is a common idiom to
3551 need a helper macro to concatenate text that forms parameters in the
3552 composed macro, rather than interpreting the text as a parameter of the
3555 As for the limitation against using @code{defn}, there are two reasons.
3556 If a macro was previously defined with @code{define_blind}, then it can
3557 safely be renamed to a new blind macro using plain @code{define}; using
3558 @code{define_blind} to rename it just adds another layer of
3559 @code{ifelse}, occupying memory and slowing down execution. And if a
3560 macro is a builtin, then it would result in an attempt to define a macro
3561 consisting of both text and a builtin token; this is not supported, and
3562 the builtin token is flattened to an empty string.
3564 With that explanation, here's the definition, and some sample usage.
3565 Notice that @code{define_blind} is itself a blind macro.
3569 define(`define_blind', `ifelse(`$#', `0', ``$0'',
3570 `_$0(`$1', `$2', `$'`#', `$'`0')')')
3572 define(`_define_blind', `define(`$1',
3573 `ifelse(`$3', `0', ``$4'', `$2')')')
3576 @result{}define_blind
3577 define_blind(`foo', `arguments were $*')
3582 @result{}arguments were bar
3583 define(`blah', defn(`foo'))
3588 @result{}arguments were a,b
3590 @result{}ifelse(`$#', `0', ``$0'', `arguments were $*')
3593 @cindex currying arguments
3594 @cindex argument currying
3595 Another interesting composition tactic is argument @dfn{currying}, or
3596 factoring a macro that takes multiple arguments for use in a context
3597 that provides exactly one argument.
3599 @deffn Composite curry (@var{macro}, @dots{})
3600 Expand to a macro call that takes exactly one argument, then appends
3601 that argument to the original arguments and invokes @var{macro} with the
3602 resulting list of arguments.
3605 A demonstration of currying makes the intent of this macro a little more
3606 obvious. The macro @code{stack_foreach} mentioned earlier is an example
3607 of a context that provides exactly one argument to a macro name. But
3608 coupled with currying, we can invoke @code{reverse} with two arguments
3609 for each definition of a macro stack. This example uses the file
3610 @file{m4-@value{VERSION}/@/examples/@/curry.m4} included in the
3615 $ @kbd{m4 -I examples}
3616 include(`curry.m4')include(`stack.m4')
3618 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
3619 `reverse(shift($@@)), `$1'')')
3621 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3623 stack_foreach(`a', `:curry(`reverse', `4')')
3624 @result{}:1, 4:2, 4:3, 4
3625 curry(`curry', `reverse', `1')(`2')(`3')
3629 Now for the implementation. Notice how @code{curry} leaves off with a
3630 macro name but no open parenthesis, while still in the middle of
3631 collecting arguments for @samp{$1}. The macro @code{_curry} is the
3632 helper macro that takes one argument, then adds it to the list and
3633 finally supplies the closing parenthesis. The use of a comma inside the
3634 @code{shift} call allows currying to also work for a macro that takes
3635 one argument, although it often makes more sense to invoke that macro
3636 directly rather than going through @code{curry}.
3640 $ @kbd{m4 -I examples}
3641 undivert(`curry.m4')dnl
3642 @result{}divert(`-1')
3643 @result{}# curry(macro, args)
3644 @result{}# Expand to a macro call that takes one argument, then invoke
3645 @result{}# macro(args, extra).
3646 @result{}define(`curry', `$1(shift($@@,)_$0')
3647 @result{}define(`_curry', ``$1')')
3648 @result{}divert`'dnl
3651 Unfortunately, with M4 1.4.x, @code{curry} is unable to handle builtin
3652 tokens, which are silently flattened to the empty string when passed
3653 through another text macro. This limitation will be lifted in a future
3656 @cindex renaming macros
3657 @cindex copying macros
3658 @cindex macros, copying
3659 Putting the last few concepts together, it is possible to copy or rename
3660 an entire stack of macro definitions.
3662 @deffn Composite copy (@var{source}, @var{dest})
3663 @deffnx Composite rename (@var{source}, @var{dest})
3664 Ensure that @var{dest} is undefined, then define it to the same stack of
3665 definitions currently in @var{source}. @code{copy} leaves @var{source}
3666 unchanged, while @code{rename} undefines @var{source}. There are only a
3667 few macros, such as @code{copy} or @code{defn}, which cannot be copied
3671 The implementation is relatively straightforward (although since it uses
3672 @code{curry}, it is unable to copy builtin macros, such as the second
3673 definition of @code{a} as a synonym for @code{divnum}. See if you can
3674 design a version that works around this limitation, or @pxref{Improved
3679 $ @kbd{m4 -I examples}
3680 include(`curry.m4')include(`stack.m4')
3682 define(`rename', `copy($@@)undefine(`$1')')dnl
3683 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
3685 `stack_foreach(`$1', `curry(`pushdef', `$2')')')')dnl
3686 pushdef(`a', `1')pushdef(`a', defn(`divnum'))pushdef(`a', `2')
3701 @chapter How to debug macros and input
3703 @cindex debugging macros
3704 @cindex macros, debugging
3705 When writing macros for @code{m4}, they often do not work as intended on
3706 the first try (as is the case with most programming languages).
3707 Fortunately, there is support for macro debugging in @code{m4}.
3710 * Dumpdef:: Displaying macro definitions
3711 * Trace:: Tracing macro calls
3712 * Debug Levels:: Controlling debugging output
3713 * Debug Output:: Saving debugging output
3717 @section Displaying macro definitions
3719 @cindex displaying macro definitions
3720 @cindex macros, displaying definitions
3721 @cindex definitions, displaying macro
3722 @cindex standard error, output to
3723 If you want to see what a name expands into, you can use the builtin
3726 @deffn Builtin dumpdef (@ovar{names@dots{}})
3727 Accepts any number of arguments. If called without any arguments,
3728 it displays the definitions of all known names, otherwise it displays
3729 the definitions of the @var{names} given. The output is printed to the
3730 current debug file (usually standard error), and is sorted by name. If
3731 an unknown name is encountered, a warning is printed.
3733 The expansion of @code{dumpdef} is void.
3738 define(`foo', `Hello world.')
3741 @error{}foo:@tabchar{}`Hello world.'
3744 @error{}define:@tabchar{}<define>
3748 The last example shows how builtin macros definitions are displayed.
3749 The definition that is dumped corresponds to what would occur if the
3750 macro were to be called at that point, even if other definitions are
3751 still live due to redefining a macro during argument collection.
3755 pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
3757 f(popdef(`f')dumpdef(`f'))
3758 @error{}f:@tabchar{}``$0'1'
3760 f(popdef(`f')dumpdef(`f'))
3761 @error{}m4:stdin:3: undefined macro `f'
3765 @xref{Debug Levels}, for information on controlling the details of the
3769 @section Tracing macro calls
3771 @cindex tracing macro expansion
3772 @cindex macro expansion, tracing
3773 @cindex expansion, tracing macro
3774 @cindex standard error, output to
3775 It is possible to trace macro calls and expansions through the builtins
3776 @code{traceon} and @code{traceoff}:
3778 @deffn Builtin traceon (@ovar{names@dots{}})
3779 @deffnx Builtin traceoff (@ovar{names@dots{}})
3780 When called without any arguments, @code{traceon} and @code{traceoff}
3781 will turn tracing on and off, respectively, for all currently defined
3784 When called with arguments, only the macros listed in @var{names} are
3785 affected, whether or not they are currently defined.
3787 The expansion of @code{traceon} and @code{traceoff} is void.
3790 Whenever a traced macro is called and the arguments have been collected,
3791 the call is displayed. If the expansion of the macro call is not void,
3792 the expansion can be displayed after the call. The output is printed
3793 to the current debug file (defaulting to standard error, @pxref{Debug
3798 define(`foo', `Hello World.')
3800 define(`echo', `$@@')
3802 traceon(`foo', `echo')
3805 @error{}m4trace: -1- foo -> `Hello World.'
3806 @result{}Hello World.
3807 echo(`gnus', `and gnats')
3808 @error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
3809 @result{}gnus,and gnats
3812 The number between dashes is the depth of the expansion. It is one most
3813 of the time, signifying an expansion at the outermost level, but it
3814 increases when macro arguments contain unquoted macro calls. The
3815 maximum number that will appear between dashes is controlled by the
3816 option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
3817 , Invoking m4}). Additionally, the option @option{--trace} (or
3818 @option{-t}) can be used to invoke @code{traceon(@var{name})} before
3821 @comment The explicit -dp neutralizes the testsuite default of -d.
3822 @comment options: -dp -L3 -tifelse
3825 $ @kbd{m4 -L 3 -t ifelse}
3827 @error{}m4trace: -1- ifelse
3829 ifelse(ifelse(ifelse(`three levels')))
3830 @error{}m4trace: -3- ifelse
3831 @error{}m4trace: -2- ifelse
3832 @error{}m4trace: -1- ifelse
3834 ifelse(ifelse(ifelse(ifelse(`four levels'))))
3835 @error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
3838 Tracing by name is an attribute that is preserved whether the macro is
3839 defined or not. This allows the selection of macros to trace before
3840 those macros are defined.
3852 define(`foo', `bar')
3855 @error{}m4trace: -1- foo -> `bar'
3859 ifdef(`foo', `yes', `no')
3862 @error{}m4:stdin:9: undefined macro `foo'
3864 define(`foo', `blah')
3867 @error{}m4trace: -1- foo -> `blah'
3875 Tracing even works on builtins. However, @code{defn} (@pxref{Defn})
3876 does not transfer tracing status.
3883 @error{}m4trace: -1- traceon(`traceoff')
3885 traceoff(`traceoff')
3886 @error{}m4trace: -1- traceoff(`traceoff')
3890 traceon(`eval', `m4_divnum')
3892 define(`m4_eval', defn(`eval'))
3894 define(`m4_divnum', defn(`divnum'))
3897 @error{}m4trace: -1- eval(`0') -> `0'
3900 @error{}m4trace: -2- m4_divnum -> `0'
3904 @xref{Debug Levels}, for information on controlling the details of the
3905 display. The format of the trace output is not specified by
3906 POSIX, and varies between implementations of @code{m4}.
3909 @comment not worth including in the manual, but this tests a trace code
3910 @comment path that was temporarily broken
3911 @comment options: -de --trace ifelse
3913 $ @kbd{m4 -de --trace ifelse}
3914 define(`e', `ifelse(`$1', `$2', `ifelse(`$1', `$2', `e(shift($@@))')')')
3917 @error{}m4trace: -1- ifelse -> ifelse(`1', `1', `e(shift(`1',`1'))')
3918 @error{}m4trace: -1- ifelse -> e(shift(`1',`1'))
3919 @error{}m4trace: -1- ifelse
3925 @section Controlling debugging output
3927 @cindex controlling debugging output
3928 @cindex debugging output, controlling
3929 The @option{-d} option to @code{m4} (or @option{--debug},
3930 @pxref{Debugging options, , Invoking m4}) controls the amount of details
3932 categories of output. Trace output is requested by @code{traceon}
3933 (@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in
3934 relation to a macro invocation. Debug output tracks useful events not
3935 associated with a macro invocation, and each line is prefixed by
3936 @samp{m4debug:}. Finally, @code{dumpdef} (@pxref{Dumpdef}) output is
3937 affected, with no prefix added to the output lines.
3939 The @var{flags} following the option can be one or more of the
3944 In trace output, show the actual arguments that were collected before
3945 invoking the macro. This applies to all macro calls if the @samp{t}
3946 flag is used, otherwise only the macros covered by calls of
3947 @code{traceon}. Arguments are subject to length truncation specified by
3948 the command line option @option{--arglength} (or @option{-l}).
3951 In trace output, show several trace lines for each macro call. A line
3952 is shown when the macro is seen, but before the arguments are collected;
3953 a second line when the arguments have been collected and a third line
3954 after the call has completed.
3957 In trace output, show the expansion of each macro call, if it is not
3958 void. This applies to all macro calls if the @samp{t} flag is used,
3959 otherwise only the macros covered by calls of @code{traceon}. The
3960 expansion is subject to length truncation specified by the command line
3961 option @option{--arglength} (or @option{-l}).
3964 In debug and trace output, include the name of the current input file in
3968 In debug output, print a message each time the current input file is
3972 In debug and trace output, include the current input line number in the
3976 In debug output, print a message when a named file is found through the
3977 path search mechanism (@pxref{Search Path}), giving the actual file name
3981 In trace and dumpdef output, quote actual arguments and macro expansions
3982 in the display with the current quotes. This is useful in connection
3983 with the @samp{a} and @samp{e} flags above.
3986 In trace output, trace all macro calls made in this invocation of
3987 @code{m4}, regardless of the settings of @code{traceon}.
3990 In trace output, add a unique `macro call id' to each line of the trace
3991 output. This is useful in connection with the @samp{c} flag above.
3994 A shorthand for all of the above flags.
3997 If no flags are specified with the @option{-d} option, the default is
3998 @samp{aeq}. The examples throughout this manual assume the default
4001 @cindex GNU extensions
4002 There is a builtin macro @code{debugmode}, which allows on-the-fly control of
4003 the debugging output format:
4005 @deffn Builtin debugmode (@ovar{flags})
4006 The argument @var{flags} should be a subset of the letters listed above.
4007 As special cases, if the argument starts with a @samp{+}, the flags are
4008 added to the current debug flags, and if it starts with a @samp{-}, they
4009 are removed. If no argument is present, all debugging flags are cleared
4010 (as if no @option{-d} was given), and with an empty argument the flags
4011 are reset to the default of @samp{aeq}.
4013 The expansion of @code{debugmode} is void.
4016 @comment The explicit -dp neutralizes the testsuite default of -d.
4017 @comment options: -dp
4020 define(`foo', `FOO')
4027 @error{}m4trace: -1- foo -> `FOO'
4032 @error{}m4trace: -1- foo
4037 @error{}m4trace:8: -1- foo
4041 The following example demonstrates the behavior of length truncation,
4042 when specified on the command line. Note that each argument and the
4043 final result are individually truncated. Also, the special tokens for
4044 builtin functions are not truncated.
4046 @comment options: -l6
4049 define(`echo', `$@@')debugmode(`+t')
4051 echo(`1', `long string')
4052 @error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
4053 @result{}1,long string
4054 indir(`echo', defn(`changequote'))
4055 @error{}m4trace: -2- defn(`change...')
4056 @error{}m4trace: -1- indir(`echo', <changequote>) -> ``''
4060 This example shows the effects of the debug flags that are not related
4064 @comment options: -dip
4066 $ @kbd{m4 -dip -I examples}
4067 @error{}m4debug: input read from stdin
4069 @error{}m4debug: path search for `foo' found `examples/foo'
4070 @error{}m4debug: input read from examples/foo
4072 @error{}m4debug: input reverted to stdin, line 1
4074 @error{}m4debug: input exhausted
4078 @section Saving debugging output
4080 @cindex saving debugging output
4081 @cindex debugging output, saving
4082 @cindex output, saving debugging
4083 @cindex GNU extensions
4084 Debug and tracing output can be redirected to files using either the
4085 @option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
4086 Invoking m4}), or with the builtin macro @code{debugfile}:
4088 @deffn Builtin debugfile (@ovar{file})
4089 Sends all further debug and trace output to @var{file}, opened in append
4090 mode. If @var{file} is the empty string, debug and trace output are
4091 discarded. If @code{debugfile} is called without any arguments, debug
4092 and trace output are sent to standard error. This does not affect
4093 warnings, error messages, or @code{errprint} output, which are
4094 always sent to standard error. If @var{file} cannot be opened, the
4095 current debug file is unchanged, and an error is issued.
4097 The expansion of @code{debugfile} is void.
4105 @error{}m4:stdin:2: Warning: excess arguments to builtin `divnum' ignored
4106 @error{}m4trace: -1- divnum(`extra') -> `0'
4111 @error{}m4:stdin:4: Warning: excess arguments to builtin `divnum' ignored
4116 @error{}m4trace: -1- divnum -> `0'
4121 @chapter Input control
4123 This chapter describes various builtin macros for controlling the input
4127 * Dnl:: Deleting whitespace in input
4128 * Changequote:: Changing the quote characters
4129 * Changecom:: Changing the comment delimiters
4130 * Changeword:: Changing the lexical structure of words
4131 * M4wrap:: Saving text until end of input
4135 @section Deleting whitespace in input
4137 @cindex deleting whitespace in input
4138 @cindex discarding input
4139 @cindex input, discarding
4140 The builtin @code{dnl} stands for ``Discard to Next Line'':
4143 All characters, up to and including the next newline, are discarded
4144 without performing any macro expansion. A warning is issued if the end
4145 of the file is encountered without a newline.
4147 The expansion of @code{dnl} is void.
4150 It is often used in connection with @code{define}, to remove the
4151 newline that follows the call to @code{define}. Thus
4154 define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
4159 The input up to and including the next newline is discarded, as opposed
4160 to the way comments are treated (@pxref{Comments}).
4162 Usually, @code{dnl} is immediately followed by an end of line or some
4163 other whitespace. GNU @code{m4} will produce a warning diagnostic if
4164 @code{dnl} is followed by an open parenthesis. In this case, @code{dnl}
4165 will collect and process all arguments, looking for a matching close
4166 parenthesis. All predictable side effects resulting from this
4167 collection will take place. @code{dnl} will return no output. The
4168 input following the matching close parenthesis up to and including the
4169 next newline, on whatever line containing it, will still be discarded.
4172 dnl(`args are ignored, but side effects occur',
4173 define(`foo', `like this')) while this text is ignored: undefine(`foo')
4174 @error{}m4:stdin:1: Warning: excess arguments to builtin `dnl' ignored
4175 See how `foo' was defined, foo?
4176 @result{}See how foo was defined, like this?
4179 If the end of file is encountered without a newline character, a
4180 warning is issued and dnl stops consuming input.
4183 m4wrap(`m4wrap(`2 hi
4189 @error{}m4:stdin:1: Warning: end of file treated as newline
4194 @section Changing the quote characters
4196 @cindex changing quote delimiters
4197 @cindex quote delimiters, changing
4198 @cindex delimiters, changing
4199 The default quote delimiters can be changed with the builtin
4202 @deffn Builtin changequote (@dvar{start, `}, @dvar{end, '})
4203 This sets @var{start} as the new begin-quote delimiter and @var{end} as
4204 the new end-quote delimiter. If both arguments are missing, the default
4205 quotes (@code{`} and @code{'}) are used. If @var{start} is void, then
4206 quoting is disabled. Otherwise, if @var{end} is missing or void, the
4207 default end-quote delimiter (@code{'}) is used. The quote delimiters
4208 can be of any length.
4210 The expansion of @code{changequote} is void.
4214 changequote(`[', `]')
4216 define([foo], [Macro [foo].])
4222 The quotation strings can safely contain eight-bit characters.
4224 @comment Yuck. I know of no clean way to render an 8-bit character in
4225 @comment both info and dvi. This example uses the `open-guillemot' and
4226 @comment `close-guillemot' characters of the Latin-1 character set.
4233 changequote(`«', `»')
4239 If no single character is appropriate, @var{start} and @var{end} can be
4240 of any length. Other implementations cap the delimiter length to five
4241 characters, but GNU has no inherent limit.
4244 changequote(`[[[', `]]]')
4246 define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
4249 @result{}Macro [[foo]].
4252 Calling @code{changequote} with @var{start} as the empty string will
4253 effectively disable the quoting mechanism, leaving no way to quote text.
4254 However, using an empty string is not portable, as some other
4255 implementations of @code{m4} revert to the default quoting, while others
4256 preserve the prior non-empty delimiter. If @var{start} is not empty,
4257 then an empty @var{end} will use the default end-quote delimiter of
4258 @samp{'}, as otherwise, it would be impossible to end a quoted string.
4259 Again, this is not portable, as some other @code{m4} implementations
4260 reuse @var{start} as the end-quote delimiter, while others preserve the
4261 previous non-empty value. Omitting both arguments restores the default
4262 begin-quote and end-quote delimiters; fortunately this behavior is
4263 portable to all implementations of @code{m4}.
4266 define(`foo', `Macro `FOO'.')
4271 @result{}Macro `FOO'.
4273 @result{}`Macro `FOO'.'
4280 There is no way in @code{m4} to quote a string containing an unmatched
4281 begin-quote, except using @code{changequote} to change the current
4284 If the quotes should be changed from, say, @samp{[} to @samp{[[},
4285 temporary quote characters have to be defined. To achieve this, two
4286 calls of @code{changequote} must be made, one for the temporary quotes
4287 and one for the new quotes.
4289 Macros are recognized in preference to the begin-quote string, so if a
4290 prefix of @var{start} can be recognized as part of a potential macro
4291 name, the quoting mechanism is effectively disabled. Unless you use
4292 @code{changeword} (@pxref{Changeword}), this means that @var{start}
4293 should not begin with a letter, digit, or @samp{_} (underscore).
4294 However, even though quoted strings are not recognized, the quote
4295 characters can still be discerned in macro expansion and in trace
4299 define(`echo', `$@@')
4303 changequote(`q', `Q')
4311 changequote(`-', `EOF')
4317 changequote(`1', `2')
4325 Quotes are recognized in preference to argument collection. In
4326 particular, if @var{start} is a single @samp{(}, then argument
4327 collection is effectively disabled. For portability with other
4328 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
4329 @samp{)} as the first character in @var{start}.
4332 define(`echo', `$#:$@@:')
4336 changequote(`(',`)')
4342 changequote(`((', `))')
4350 changequote(`,', `)')
4356 However, if you are not worried about portability, using @samp{(} and
4357 @samp{)} as quoting characters has an interesting property---you can use
4358 it to compute a quoted string containing the expansion of any quoted
4359 text, as long as the expansion results in both balanced quotes and
4360 balanced parentheses. The trick is realizing @code{expand} uses
4361 @samp{$1} unquoted, to trigger its expansion using the normal quoting
4362 characters, but uses extra parentheses to group unquoted commas that
4363 occur in the expansion without consuming whitespace following those
4364 commas. Then @code{_expand} uses @code{changequote} to convert the
4365 extra parentheses back into quoting characters. Note that it takes two
4366 more @code{changequote} invocations to restore the original quotes.
4367 Contrast the behavior on whitespace when using @samp{$*}, via
4368 @code{quote}, to attempt the same task.
4371 changequote(`[', `]')dnl
4372 define([a], [1, (b)])dnl
4374 define([quote], [[$*]])dnl
4375 define([expand], [_$0(($1))])dnl
4377 [changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl
4378 expand([a, a, [a, a], [[a, a]]])
4379 @result{}1, (2), 1, (2), a, a, [a, a]
4380 quote(a, a, [a, a], [[a, a]])
4381 @result{}1,(2),1,(2),a, a,[a, a]
4384 If @var{end} is a prefix of @var{start}, the end-quote will be
4385 recognized in preference to a nested begin-quote. In particular,
4386 changing the quotes to have the same string for @var{start} and
4387 @var{end} disables nesting of quotes. When quote nesting is disabled,
4388 it is impossible to double-quote strings across macro expansions, so
4389 using the same string is not done very often.
4394 changequote(`""', `"')
4406 changequote(`"', `"')
4413 @comment And another stress test, not worth documenting in the manual.
4415 define(`aaaaaaaaaaaaaaaaaaaa', `A')define(`q', `"$@@"')
4417 changequote(`"', `"')
4419 q(q("aaaaaaaaaaaaaaaaaaaa", "a"))
4424 It is an error if the end of file occurs within a quoted string.
4429 @result{}hello world
4432 @error{}m4:stdin:2: ERROR: end of file in string
4437 ifelse(`dangling quote
4439 @error{}m4:stdin:1: ERROR: end of file in string
4443 @section Changing the comment delimiters
4445 @cindex changing comment delimiters
4446 @cindex comment delimiters, changing
4447 @cindex delimiters, changing
4448 The default comment delimiters can be changed with the builtin
4449 macro @code{changecom}:
4451 @deffn Builtin changecom (@ovar{start}, @dvar{end, @key{NL}})
4452 This sets @var{start} as the new begin-comment delimiter and @var{end}
4453 as the new end-comment delimiter. If both arguments are missing, or
4454 @var{start} is void, then comments are disabled. Otherwise, if
4455 @var{end} is missing or void, the default end-comment delimiter of
4456 newline is used. The comment delimiters can be of any length.
4458 The expansion of @code{changecom} is void.
4462 define(`comment', `COMMENT')
4465 @result{}# A normal comment
4466 changecom(`/*', `*/')
4468 # Not a comment anymore
4469 @result{}# Not a COMMENT anymore
4470 But: /* this is a comment now */ while this is not a comment
4471 @result{}But: /* this is a comment now */ while this is not a COMMENT
4474 @cindex comments, copied to output
4475 Note how comments are copied to the output, much as if they were quoted
4476 strings. If you want the text inside a comment expanded, quote the
4477 begin-comment delimiter.
4479 Calling @code{changecom} without any arguments, or with @var{start} as
4480 the empty string, will effectively disable the commenting mechanism. To
4481 restore the original comment start of @samp{#}, you must explicitly ask
4482 for it. If @var{start} is not empty, then an empty @var{end} will use
4483 the default end-comment delimiter of newline, as otherwise, it would be
4484 impossible to end a comment. However, this is not portable, as some
4485 other @code{m4} implementations preserve the previous non-empty
4489 define(`comment', `COMMENT')
4493 # Not a comment anymore
4494 @result{}# Not a COMMENT anymore
4498 @result{}# comment again
4501 The comment strings can safely contain eight-bit characters.
4503 @comment Yuck. I know of no clean way to render an 8-bit character in
4504 @comment both info and dvi. This example uses the `open-guillemot' and
4505 @comment `close-guillemot' characters of the Latin-1 character set.
4512 changecom(`«', `»')
4518 If no single character is appropriate, @var{start} and @var{end} can be
4519 of any length. Other implementations cap the delimiter length to five
4520 characters, but GNU has no inherent limit.
4522 Comments are recognized in preference to macros. However, this is not
4523 compatible with other implementations, where macros and even quoting
4524 takes precedence over comments, so it may change in a future release.
4525 For portability, this means that @var{start} should not begin with a
4526 letter, digit, or @samp{_} (underscore), and that neither the
4527 start-quote nor the start-comment string should be a prefix of the
4533 define(`hi1hi2', `hello')
4547 Comments are recognized in preference to argument collection. In
4548 particular, if @var{start} is a single @samp{(}, then argument
4549 collection is effectively disabled. For portability with other
4550 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
4551 @samp{)} as the first character in @var{start}.
4554 define(`echo', `$#:$*:$@@:')
4564 changecom(`((', `))')
4573 @result{}1:HI,hi)bye:HI,hi)bye:
4577 @result{}3:HI,,HI,HI:HI,,`'hi,HI:
4578 echo(hi,`,`'hi',hi`'changecom(`,,', `hi'))
4579 @result{}3:HI,,`'hi,HI:HI,,`'hi,HI:
4582 It is an error if the end of file occurs within a comment.
4586 changecom(`/*', `*/')
4590 @error{}m4:stdin:2: ERROR: end of file in comment
4594 @section Changing the lexical structure of words
4596 @cindex lexical structure of words
4597 @cindex words, lexical structure of
4598 @cindex syntax, changing
4599 @cindex changing syntax
4600 @cindex regular expressions
4602 The macro @code{changeword} and all associated functionality is
4603 experimental. It is only available if the @option{--enable-changeword}
4604 option was given to @command{configure}, at GNU @code{m4}
4606 time. The functionality will go away in the future, to be replaced by
4607 other new features that are more efficient at providing the same
4608 capabilities. @emph{Do not rely on it}. Please direct your comments
4609 about it the same way you would do for bugs.
4612 A file being processed by @code{m4} is split into quoted strings, words
4613 (potential macro names) and simple tokens (any other single character).
4614 Initially a word is defined by the following regular expression:
4618 [_a-zA-Z][_a-zA-Z0-9]*
4621 Using @code{changeword}, you can change this regular expression:
4623 @deffn {Optional builtin} changeword (@var{regex})
4624 Changes the regular expression for recognizing macro names to be
4625 @var{regex}. If @var{regex} is empty, use
4626 @samp{[_a-zA-Z][_a-zA-Z0-9]*}. @var{regex} must obey the constraint
4627 that every prefix of the desired final pattern is also accepted by the
4628 regular expression. If @var{regex} contains grouping parentheses, the
4629 macro invoked is the portion that matched the first group, rather than
4630 the entire matching string.
4632 The expansion of @code{changeword} is void.
4633 The macro @code{changeword} is recognized only with parameters.
4636 Relaxing the lexical rules of @code{m4} might be useful (for example) if
4637 you wanted to apply translations to a file of numbers:
4640 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4642 changeword(`[_a-zA-Z0-9]+')
4648 Tightening the lexical rules is less useful, because it will generally
4649 make some of the builtins unavailable. You could use it to prevent
4650 accidental call of builtins, for example:
4653 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4655 define(`_indir', defn(`indir'))
4657 changeword(`_[_a-zA-Z0-9]*')
4660 @result{}esyscmd(foo)
4661 _indir(`esyscmd', `echo hi')
4666 Because @code{m4} constructs its words a character at a time, there
4667 is a restriction on the regular expressions that may be passed to
4668 @code{changeword}. This is that if your regular expression accepts
4669 @samp{foo}, it must also accept @samp{f} and @samp{fo}.
4672 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4678 dnl This example wants to recognize changeword, dnl, and `foo\n'.
4679 dnl First, we check that our regexp will match.
4680 regexp(`changeword', `[cd][a-z]*\|foo[
4684 ', `[cd][a-z]*\|foo[
4687 regexp(`f', `[cd][a-z]*\|foo[
4692 changeword(`[cd][a-z]*\|foo[
4695 dnl Even though `foo\n' matches, we forgot to allow `f'.
4698 changeword(`[cd][a-z]*\|fo*[
4701 dnl Now we can call `foo\n'.
4707 @comment One more test of including newline in a macro name; but this
4708 @comment does not need to be displayed in the manual. This ensures
4709 @comment that line numbering is correct when dnl cuts across include
4710 @comment file boundaries, and when __file__ or __line__ is the last
4711 @comment token in an include file.
4714 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4719 include(`foo') ignored
4721 changeword(`\([_a-zA-Z][_a-zA-Z0-9]*\|bar
4726 include(`foo') ignored
4733 ', defn(`__file__'))
4736 @result{}examples/foo
4738 ', defn(`__line__'))
4747 @code{changeword} has another function. If the regular expression
4748 supplied contains any grouped subexpressions, then text outside
4749 the first of these is discarded before symbol lookup. So:
4752 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4755 `errprint(` skipping: syscmd does not have unix semantics
4757 changecom(`/*', `*/')dnl
4758 define(`foo', `bar')dnl
4759 changeword(`#\([_a-zA-Z0-9]*\)')
4761 #esyscmd(`echo foo \#foo')
4766 @code{m4} now requires a @samp{#} mark at the beginning of every
4767 macro invocation, so one can use @code{m4} to preprocess plain
4768 text without losing various words like @samp{divert}.
4770 In @code{m4}, macro substitution is based on text, while in @TeX{}, it
4771 is based on tokens. @code{changeword} can throw this difference into
4772 relief. For example, here is the same idea represented in @TeX{} and
4773 @code{m4}. First, the @TeX{} version:
4777 \def\a@{\message@{Hello@}@}
4786 Then, the @code{m4} version:
4789 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4791 define(`a', `errprint(`Hello')')dnl
4792 changeword(`@@\([_a-zA-Z0-9]*\)')
4795 @result{}errprint(Hello)
4798 In the @TeX{} example, the first line defines a macro @code{a} to
4799 print the message @samp{Hello}. The second line defines @key{@@} to
4800 be usable instead of @key{\} as an escape character. The third line
4801 defines @key{\} to be a normal printing character, not an escape.
4802 The fourth line invokes the macro @code{a}. So, when @TeX{} is run
4803 on this file, it displays the message @samp{Hello}.
4805 When the @code{m4} example is passed through @code{m4}, it outputs
4806 @samp{errprint(Hello)}. The reason for this is that @TeX{} does
4807 lexical analysis of macro definition when the macro is @emph{defined}.
4808 @code{m4} just stores the text, postponing the lexical analysis until
4809 the macro is @emph{used}.
4811 You should note that using @code{changeword} will slow @code{m4} down
4812 by a factor of about seven, once it is changed to something other
4813 than the default regular expression. You can invoke @code{changeword}
4814 with the empty string to restore the default word definition, and regain
4818 @section Saving text until end of input
4820 @cindex saving input
4821 @cindex input, saving
4822 @cindex deferring expansion
4823 @cindex expansion, deferring
4824 It is possible to `save' some text until the end of the normal input has
4825 been seen. Text can be saved, to be read again by @code{m4} when the
4826 normal input has been exhausted. This feature is normally used to
4827 initiate cleanup actions before normal exit, e.g., deleting temporary
4830 To save input text, use the builtin @code{m4wrap}:
4832 @deffn Builtin m4wrap (@var{string}, @dots{})
4833 Stores @var{string} in a safe place, to be reread when end of input is
4834 reached. As a GNU extension, additional arguments are
4835 concatenated with a space to the @var{string}.
4837 The expansion of @code{m4wrap} is void.
4838 The macro @code{m4wrap} is recognized only with parameters.
4842 define(`cleanup', `This is the `cleanup' action.
4847 This is the first and last normal input line.
4848 @result{}This is the first and last normal input line.
4850 @result{}This is the cleanup action.
4853 The saved input is only reread when the end of normal input is seen, and
4854 not if @code{m4exit} is used to exit @code{m4}.
4856 @comment FIXME: this contradicts POSIX, which requires that "If the
4857 @comment m4wrap macro is used multiple times, the arguments specified
4858 @comment shall be processed in the order in which the m4wrap macros were
4859 @comment processed."
4860 It is safe to call @code{m4wrap} from saved text, but then the order in
4861 which the saved text is reread is undefined. If @code{m4wrap} is not used
4862 recursively, the saved pieces of text are reread in the opposite order
4863 in which they were saved (LIFO---last in, first out). However, this
4864 behavior is likely to change in a future release, to match
4865 POSIX, so you should not depend on this order.
4867 It is possible to emulate POSIX behavior even
4868 with older versions of GNU M4 by including the file
4869 @file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4} from the
4874 $ @kbd{m4 -I examples}
4875 undivert(`wrapfifo.m4')dnl
4876 @result{}dnl Redefine m4wrap to have FIFO semantics.
4877 @result{}define(`_m4wrap_level', `0')dnl
4878 @result{}define(`m4wrap',
4879 @result{}`ifdef(`m4wrap'_m4wrap_level,
4880 @result{} `define(`m4wrap'_m4wrap_level,
4881 @result{} defn(`m4wrap'_m4wrap_level)`$1')',
4882 @result{} `builtin(`m4wrap', `define(`_m4wrap_level',
4883 @result{} incr(_m4wrap_level))dnl
4884 @result{}m4wrap'_m4wrap_level)dnl
4885 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
4886 include(`wrapfifo.m4')
4888 m4wrap(`a`'m4wrap(`c
4889 ', `d')')m4wrap(`b')
4895 It is likewise possible to emulate LIFO behavior without resorting to
4896 the GNU M4 extension of @code{builtin}, by including the file
4897 @file{m4-@value{VERSION}/@/examples/@/wraplifo.m4} from the
4898 distribution. (Unfortunately, both examples shown here share some
4899 subtle bugs. See if you can find and correct them; or @pxref{Improved
4900 m4wrap, , Answers}).
4904 $ @kbd{m4 -I examples}
4905 undivert(`wraplifo.m4')dnl
4906 @result{}dnl Redefine m4wrap to have LIFO semantics.
4907 @result{}define(`_m4wrap_level', `0')dnl
4908 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
4909 @result{}define(`m4wrap',
4910 @result{}`ifdef(`m4wrap'_m4wrap_level,
4911 @result{} `define(`m4wrap'_m4wrap_level,
4912 @result{} `$1'defn(`m4wrap'_m4wrap_level))',
4913 @result{} `_m4wrap(`define(`_m4wrap_level', incr(_m4wrap_level))dnl
4914 @result{}m4wrap'_m4wrap_level)dnl
4915 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
4916 include(`wraplifo.m4')
4918 m4wrap(`a`'m4wrap(`c
4919 ', `d')')m4wrap(`b')
4925 Here is an example of implementing a factorial function using
4929 define(`f', `ifelse(`$1', `0', `Answer: 0!=1
4930 ', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
4931 ', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
4936 @result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
4939 Invocations of @code{m4wrap} at the same recursion level are
4940 concatenated and rescanned as usual:
4946 m4wrap(`a')m4wrap(`a')
4953 however, the transition between recursion levels behaves like an end of
4954 file condition between two input files.
4958 m4wrap(`m4wrap(`)')len(abc')
4961 @error{}m4:stdin:1: ERROR: end of file in argument list
4964 @node File Inclusion
4965 @chapter File inclusion
4967 @cindex file inclusion
4968 @cindex inclusion, of files
4969 @code{m4} allows you to include named files at any point in the input.
4972 * Include:: Including named files
4973 * Search Path:: Searching for include files
4977 @section Including named files
4979 There are two builtin macros in @code{m4} for including files:
4981 @deffn Builtin include (@var{file})
4982 @deffnx Builtin sinclude (@var{file})
4983 Both macros cause the file named @var{file} to be read by
4984 @code{m4}. When the end of the file is reached, input is resumed from
4985 the previous input file.
4987 The expansion of @code{include} and @code{sinclude} is therefore the
4988 contents of @var{file}.
4990 If @var{file} does not exist, is a directory, or cannot otherwise be
4991 read, the expansion is void,
4992 and @code{include} will fail with an error while @code{sinclude} is
4993 silent. The empty string counts as a file that does not exist.
4995 The macros @code{include} and @code{sinclude} are recognized only with
5002 @error{}m4:stdin:1: cannot open `none': No such file or directory
5005 @error{}m4:stdin:2: cannot open `': No such file or directory
5013 The rest of this section assumes that @code{m4} is invoked with the
5014 @option{-I} option (@pxref{Preprocessor features, , Invoking m4})
5015 pointing to the @file{m4-@value{VERSION}/@/examples}
5016 directory shipped as part of the GNU @code{m4} package. The
5017 file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution
5022 $ @kbd{cat examples/incl.m4}
5023 @result{}Include file start
5025 @result{}Include file end
5028 Normally file inclusion is used to insert the contents of a file
5029 into the input stream. The contents of the file will be read by
5030 @code{m4} and macro calls in the file will be expanded:
5034 $ @kbd{m4 -I examples}
5035 define(`foo', `FOO')
5038 @result{}Include file start
5040 @result{}Include file end
5044 The fact that @code{include} and @code{sinclude} expand to the contents
5045 of the file can be used to define macros that operate on entire files.
5046 Here is an example, which defines @samp{bar} to expand to the contents
5051 $ @kbd{m4 -I examples}
5052 define(`bar', include(`incl.m4'))
5054 This is `bar': >>bar<<
5055 @result{}This is bar: >>Include file start
5057 @result{}Include file end
5061 This use of @code{include} is not trivial, though, as files can contain
5062 quotes, commas, and parentheses, which can interfere with the way the
5063 @code{m4} parser works. GNU @code{m4} seamlessly concatenates
5064 the file contents with the next character, even if the included file
5065 ended in the middle of a comment, string, or macro call. These
5066 conditions are only treated as end of file errors if specified as input
5067 files on the command line.
5069 In GNU @code{m4}, an alternative method of reading files is
5070 using @code{undivert} (@pxref{Undivert}) on a named file.
5073 @comment Test that include(`file/') detects that file is not a
5074 @comment directory; we can assume that the current directory contains a
5075 @comment Makefile. mingw fails with EINVAL rather than ENOTDIR.
5078 @comment xerr: ignore
5080 include(`Makefile/')
5081 @error{}m4:stdin:1: cannot open `Makefile/': Not a directory
5085 @comment POSIX allows, but doesn't require, failure on reading
5086 @comment directories. But since they aren't text files, it never makes
5087 @comment sense, so we globally forbid it even if fopen doesn't. mingw
5088 @comment fails with EACCES rather than EISDIR.
5091 @comment xerr: ignore
5094 @error{}m4:stdin:1: cannot open `.': Is a directory
5098 @comment Meanwhile, ignore errors with sinclude.
5101 sinclude(`Makefile/')
5109 @section Searching for include files
5111 @cindex search path for included files
5112 @cindex included files, search path for
5113 @cindex GNU extensions
5114 GNU @code{m4} allows included files to be found in other directories
5115 than the current working directory.
5117 @cindex @env{M4PATH}
5118 If the @option{--prepend-include} or @option{-B} command-line option was
5119 provided (@pxref{Preprocessor features, , Invoking m4}), those
5120 directories are searched first, in reverse order that those options were
5121 listed on the command line. Then @code{m4} looks in the current working
5122 directory. Next comes the directories specified with the
5123 @option{--include} or @option{-I} option, in the order found on the
5124 command line. Finally, if the @env{M4PATH} environment variable is set,
5125 it is expected to contain a colon-separated list of directories, which
5126 will be searched in order.
5128 If the automatic search for include-files causes trouble, the @samp{p}
5129 debug flag (@pxref{Debug Levels}) can help isolate the problem.
5132 @chapter Diverting and undiverting output
5134 @cindex deferring output
5135 Diversions are a way of temporarily saving output. The output of
5136 @code{m4} can at any time be diverted to a temporary file, and be
5137 reinserted into the output stream, @dfn{undiverted}, again at a later
5140 @cindex @env{TMPDIR}
5141 Numbered diversions are counted from 0 upwards, diversion number 0
5142 being the normal output stream. GNU
5143 @code{m4} tries to keep diversions in memory. However, there is a
5144 limit to the overall memory usable by all diversions taken together
5145 (512K, currently). When this maximum is about to be exceeded,
5146 a temporary file is opened to receive the contents of the biggest
5147 diversion still in memory, freeing this memory for other diversions.
5148 When creating the temporary file, @code{m4} honors the value of the
5149 environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
5150 Thus, the amount of available disk space provides the only real limit on
5151 the number and aggregate size of diversions.
5154 @comment We need to test spilled diversions, but don't need to expose
5155 @comment this highly repetitive test in the manual.
5158 divert(`-1')define(`f', `.')
5159 define(`f', defn(`f')defn(`f'))
5160 define(`f', defn(`f')defn(`f'))
5161 define(`f', defn(`f')defn(`f'))
5162 define(`f', defn(`f')defn(`f'))
5163 define(`f', defn(`f')defn(`f'))
5164 define(`f', defn(`f')defn(`f'))
5165 define(`f', defn(`f')defn(`f'))
5166 define(`f', defn(`f')defn(`f'))
5167 define(`f', defn(`f')defn(`f'))
5168 define(`f', defn(`f')defn(`f'))
5169 define(`f', defn(`f')defn(`f'))
5170 define(`f', defn(`f')defn(`f'))
5171 define(`f', defn(`f')defn(`f'))
5172 define(`f', defn(`f')defn(`f'))
5173 define(`f', defn(`f')defn(`f'))
5174 define(`f', defn(`f')defn(`f'))
5175 define(`f', defn(`f')defn(`f'))
5176 define(`f', defn(`f')defn(`f'))
5177 define(`f', defn(`f')defn(`f'))
5178 define(`f', defn(`f')defn(`f'))
5186 divert(`-1')undivert
5192 @comment Another test of spilled diversions.
5195 divert(`-1')define(`f', `.')
5196 define(`f', defn(`f')defn(`f'))
5197 define(`f', defn(`f')defn(`f'))
5198 define(`f', defn(`f')defn(`f'))
5199 define(`f', defn(`f')defn(`f'))
5200 define(`f', defn(`f')defn(`f'))
5201 define(`f', defn(`f')defn(`f'))
5202 define(`f', defn(`f')defn(`f'))
5203 define(`f', defn(`f')defn(`f'))
5204 define(`f', defn(`f')defn(`f'))
5205 define(`f', defn(`f')defn(`f'))
5206 define(`f', defn(`f')defn(`f'))
5207 define(`f', defn(`f')defn(`f'))
5208 define(`f', defn(`f')defn(`f'))
5209 define(`f', defn(`f')defn(`f'))
5210 define(`f', defn(`f')defn(`f'))
5211 define(`f', defn(`f')defn(`f'))
5212 define(`f', defn(`f')defn(`f'))
5213 define(`f', defn(`f')defn(`f'))
5214 define(`f', defn(`f')defn(`f'))
5215 define(`f', defn(`f')defn(`f'))
5224 @comment Catch regression in 1.4.10 with spilled diversions.
5228 `errprint(` skipping: syscmd does not have unix semantics
5230 changequote(`[', `]')dnl
5231 syscmd([echo 'divert(1)hi
5232 format(%1000000d, 1)' | ']__program__[' | sed -n 1p])dnl
5238 @comment Avoid quadratic copying time when transferring diversions;
5239 @comment test both in-memory and spilled to file.
5243 $ @kbd{m4 -I examples}
5244 include(`forloop2.m4')dnl
5245 divert(`1')format(`%10000s', `')dnl
5246 forloop(`i', `1', `10000',
5247 `divert(incr(i))undivert(i)')dnl
5248 divert(`9001')format(`%1000000s', `')dnl
5249 forloop(`i', `9001', `10000',
5250 `divert(incr(i))undivert(i)')dnl
5251 divert(`-1')undivert
5255 Diversions make it possible to generate output in a different order than
5256 the input was read. It is possible to implement topological sorting
5257 dependencies. For example, GNU Autoconf makes use of
5258 diversions under the hood to ensure that the expansion of a prerequisite
5259 macro appears in the output prior to the expansion of a dependent macro,
5260 regardless of which order the two macros were invoked in the user's
5264 * Divert:: Diverting output
5265 * Undivert:: Undiverting output
5266 * Divnum:: Diversion numbers
5267 * Cleardivert:: Discarding diverted text
5271 @section Diverting output
5273 @cindex diverting output to files
5274 @cindex output, diverting to files
5275 @cindex files, diverting output to
5276 Output is diverted using @code{divert}:
5278 @deffn Builtin divert (@dvar{number, 0})
5279 The current diversion is changed to @var{number}. If @var{number} is left
5280 out or empty, it is assumed to be zero. If @var{number} cannot be
5281 parsed, the diversion is unchanged.
5283 The expansion of @code{divert} is void.
5286 When all the @code{m4} input will have been processed, all existing
5287 diversions are automatically undiverted, in numerical order.
5291 This text is diverted.
5294 This text is not diverted.
5295 @result{}This text is not diverted.
5298 @result{}This text is diverted.
5301 Several calls of @code{divert} with the same argument do not overwrite
5302 the previous diverted text, but append to it. Diversions are printed
5303 after any wrapped text is expanded.
5306 define(`text', `TEXT')
5308 divert(`1')`diverted text.'
5311 m4wrap(`Wrapped text precedes ')
5314 @result{}Wrapped TEXT precedes diverted text.
5317 @cindex discarding input
5318 @cindex input, discarding
5319 If output is diverted to a negative diversion, it is simply discarded.
5320 This can be used to suppress unwanted output. A common example of
5321 unwanted output is the trailing newlines after macro definitions. Here
5322 is a common programming idiom in @code{m4} for avoiding them.
5326 define(`foo', `Macro `foo'.')
5327 define(`bar', `Macro `bar'.')
5332 @cindex GNU extensions
5333 Traditional implementations only supported ten diversions. But as a
5334 GNU extension, diversion numbers can be as large as positive
5335 integers will allow, rather than treating a multi-digit diversion number
5336 as a request to discard text.
5339 divert(eval(`1<<28'))world
5346 Note that @code{divert} is an English word, but also an active macro
5347 without arguments. When processing plain text, the word might appear in
5348 normal text and be unintentionally swallowed as a macro invocation. One
5349 way to avoid this is to use the @option{-P} option to rename all
5350 builtins (@pxref{Operation modes, , Invoking m4}). Another is to write
5351 a wrapper that requires a parameter to be recognized.
5354 We decided to divert the stream for irrigation.
5355 @result{}We decided to the stream for irrigation.
5356 define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
5362 We decided to divert the stream for irrigation.
5363 @result{}We decided to divert the stream for irrigation.
5367 @section Undiverting output
5369 Diverted text can be undiverted explicitly using the builtin
5372 @deffn Builtin undivert (@ovar{diversions@dots{}})
5373 Undiverts the numeric @var{diversions} given by the arguments, in the
5374 order given. If no arguments are supplied, all diversions are
5375 undiverted, in numerical order.
5377 @cindex file inclusion
5378 @cindex inclusion, of files
5379 @cindex GNU extensions
5380 As a GNU extension, @var{diversions} may contain non-numeric
5381 strings, which are treated as the names of files to copy into the output
5382 without expansion. A warning is issued if a file could not be opened.
5384 The expansion of @code{undivert} is void.
5389 This text is diverted.
5392 This text is not diverted.
5393 @result{}This text is not diverted.
5396 @result{}This text is diverted.
5400 Notice the last two blank lines. One of them comes from the newline
5401 following @code{undivert}, the other from the newline that followed the
5402 @code{divert}! A diversion often starts with a blank line like this.
5404 When diverted text is undiverted, it is @emph{not} reread by @code{m4},
5405 but rather copied directly to the current output, and it is therefore
5406 not an error to undivert into a diversion. Undiverting the empty string
5407 is the same as specifying diversion 0; in either case nothing happens
5408 since the output has already been flushed.
5411 divert(`1')diverted text
5419 @result{}diverted text
5422 divert(`2')undivert(`1')diverted text`'divert
5428 @result{}diverted text
5431 When a diversion has been undiverted, the diverted text is discarded,
5432 and it is not possible to bring back diverted text more than once.
5436 This text is diverted first.
5437 divert(`0')undivert(`1')dnl
5439 @result{}This text is diverted first.
5443 This text is also diverted but not appended.
5444 divert(`0')undivert(`1')dnl
5446 @result{}This text is also diverted but not appended.
5449 Attempts to undivert the current diversion are silently ignored. Thus,
5450 when the current diversion is not 0, the current diversion does not get
5451 rearranged among the other diversions.
5457 divert(`2')undivert`'dnl
5458 divert`'undivert`'dnl
5464 @cindex GNU extensions
5465 @cindex file inclusion
5466 @cindex inclusion, of files
5467 GNU @code{m4} allows named files to be undiverted. Given a
5468 non-numeric argument, the contents of the file named will be copied,
5469 uninterpreted, to the current output. This complements the builtin
5470 @code{include} (@pxref{Include}). To illustrate the difference, assume
5471 the file @file{foo} contains:
5483 define(`bar', `BAR')
5493 If the file is not found (or cannot be read), an error message is
5494 issued, and the expansion is void. It is possible to intermix files
5495 and diversion numbers.
5498 divert(`1')diversion one
5499 divert(`2')undivert(`foo')dnl
5500 divert(`3')diversion three
5502 undivert(`1', `2', `foo', `3')dnl
5503 @result{}diversion one
5506 @result{}diversion three
5510 @section Diversion numbers
5512 @cindex diversion numbers
5513 The current diversion is tracked by the builtin @code{divnum}:
5515 @deffn Builtin divnum
5516 Expands to the number of the current diversion.
5523 Diversion one: divnum
5525 Diversion two: divnum
5528 @result{}Diversion one: 1
5530 @result{}Diversion two: 2
5534 @section Discarding diverted text
5536 @cindex discarding diverted text
5537 @cindex diverted text, discarding
5538 Often it is not known, when output is diverted, whether the diverted
5539 text is actually needed. Since all non-empty diversion are brought back
5540 on the main output stream when the end of input is seen, a method of
5541 discarding a diversion is needed. If all diversions should be
5542 discarded, the easiest is to end the input to @code{m4} with
5543 @samp{divert(`-1')} followed by an explicit @samp{undivert}:
5547 Diversion one: divnum
5549 Diversion two: divnum
5556 No output is produced at all.
5558 Clearing selected diversions can be done with the following macro:
5560 @deffn Composite cleardivert (@ovar{diversions@dots{}})
5561 Discard the contents of each of the listed numeric @var{diversions}.
5565 define(`cleardivert',
5566 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
5570 It is called just like @code{undivert}, but the effect is to clear the
5571 diversions, given by the arguments. (This macro has a nasty bug! You
5572 should try to see if you can find it and correct it; or @pxref{Improved
5573 cleardivert, , Answers}).
5576 @chapter Macros for text handling
5578 There are a number of builtins in @code{m4} for manipulating text in
5579 various ways, extracting substrings, searching, substituting, and so on.
5582 * Len:: Calculating length of strings
5583 * Index macro:: Searching for substrings
5584 * Regexp:: Searching for regular expressions
5585 * Substr:: Extracting substrings
5586 * Translit:: Translating characters
5587 * Patsubst:: Substituting text by regular expression
5588 * Format:: Formatting strings (printf-like)
5592 @section Calculating length of strings
5594 @cindex length of strings
5595 @cindex strings, length of
5596 The length of a string can be calculated by @code{len}:
5598 @deffn Builtin len (@var{string})
5599 Expands to the length of @var{string}, as a decimal number.
5601 The macro @code{len} is recognized only with parameters.
5612 @section Searching for substrings
5614 @cindex substrings, locating
5615 Searching for substrings is done with @code{index}:
5617 @deffn Builtin index (@var{string}, @var{substring})
5618 Expands to the index of the first occurrence of @var{substring} in
5619 @var{string}. The first character in @var{string} has index 0. If
5620 @var{substring} does not occur in @var{string}, @code{index} expands to
5623 The macro @code{index} is recognized only with parameters.
5627 index(`gnus, gnats, and armadillos', `nat')
5629 index(`gnus, gnats, and armadillos', `dag')
5633 Omitting @var{substring} evokes a warning, but still produces output;
5634 contrast this with an empty @var{substring}.
5638 @error{}m4:stdin:1: Warning: too few arguments to builtin `index'
5647 @comment Expose a bug in the strstr() algorithm present in glibc
5648 @comment 2.9 through 2.12 and in gnulib up to Sep 2010.
5651 index(`;:11-:12-:12-:12-:12-:12-:12-:12-:12.:12.:12.:12.:12.:12.:12.:12.:12-',
5652 `:12-:12-:12-:12-:12-:12-:12-:12-')
5656 @comment Expose a bug in the gnulib replacement strstr() algorithm
5657 @comment present from Jun 2010 to Feb 2011, including m4 1.4.15.
5660 index(`..wi.d.', `.d.')
5666 @section Searching for regular expressions
5668 @cindex basic regular expressions
5669 @cindex regular expressions
5670 @cindex expressions, regular
5671 @cindex GNU extensions
5672 Searching for regular expressions is done with the builtin
5675 @deffn Builtin regexp (@var{string}, @var{regexp}, @ovar{replacement})
5676 Searches for @var{regexp} in @var{string}. The syntax for regular
5677 expressions is the same as in GNU Emacs, which is similar to
5678 BRE, Basic Regular Expressions in POSIX.
5680 @xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs
5685 @uref{http://www.gnu.org/@/software/@/emacs/@/manual/@/emacs.html#Regexps,
5686 Syntax of Regular Expressions} in the GNU Emacs Manual.
5688 Support for ERE, Extended Regular Expressions is not
5689 available, but will be added in GNU M4 2.0.
5691 If @var{replacement} is omitted, @code{regexp} expands to the index of
5692 the first match of @var{regexp} in @var{string}. If @var{regexp} does
5693 not match anywhere in @var{string}, it expands to -1.
5695 If @var{replacement} is supplied, and there was a match, @code{regexp}
5696 changes the expansion to this argument, with @samp{\@var{n}} substituted
5697 by the text matched by the @var{n}th parenthesized sub-expression of
5698 @var{regexp}, up to nine sub-expressions. The escape @samp{\&} is
5699 replaced by the text of the entire regular expression matched. For
5700 all other characters, @samp{\} treats the next character literally. A
5701 warning is issued if there were fewer sub-expressions than the
5702 @samp{\@var{n}} requested, or if there is a trailing @samp{\}. If there
5703 was no match, @code{regexp} expands to the empty string.
5705 The macro @code{regexp} is recognized only with parameters.
5709 regexp(`GNUs not Unix', `\<[a-z]\w+')
5711 regexp(`GNUs not Unix', `\<Q\w*')
5713 regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
5714 @result{}*** Unix *** nix ***
5715 regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
5719 Here are some more examples on the handling of backslash:
5722 regexp(`abc', `\(b\)', `\\\10\a')
5724 regexp(`abc', `b', `\1\')
5725 @error{}m4:stdin:2: Warning: sub-expression 1 not present
5726 @error{}m4:stdin:2: Warning: trailing \ ignored in replacement
5728 regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
5729 @error{}m4:stdin:3: Warning: sub-expression 4 not present
5730 @error{}m4:stdin:3: Warning: sub-expression 5 not present
5731 @error{}m4:stdin:3: Warning: sub-expression 6 not present
5735 Omitting @var{regexp} evokes a warning, but still produces output;
5736 contrast this with an empty @var{regexp} argument.
5740 @error{}m4:stdin:1: Warning: too few arguments to builtin `regexp'
5744 regexp(`abc', `', `\\def')
5749 @section Extracting substrings
5751 @cindex extracting substrings
5752 @cindex substrings, extracting
5753 Substrings are extracted with @code{substr}:
5755 @deffn Builtin substr (@var{string}, @var{from}, @ovar{length})
5756 Expands to the substring of @var{string}, which starts at index
5757 @var{from}, and extends for @var{length} characters, or to the end of
5758 @var{string}, if @var{length} is omitted. The starting index of a string
5759 is always 0. The expansion is empty if there is an error parsing
5760 @var{from} or @var{length}, if @var{from} is beyond the end of
5761 @var{string}, or if @var{length} is negative.
5763 The macro @code{substr} is recognized only with parameters.
5767 substr(`gnus, gnats, and armadillos', `6')
5768 @result{}gnats, and armadillos
5769 substr(`gnus, gnats, and armadillos', `6', `5')
5773 Omitting @var{from} evokes a warning, but still produces output.
5777 @error{}m4:stdin:1: Warning: too few arguments to builtin `substr'
5780 @error{}m4:stdin:2: empty string treated as 0 in builtin `substr'
5785 @section Translating characters
5787 @cindex translating characters
5788 @cindex characters, translating
5789 Character translation is done with @code{translit}:
5791 @deffn Builtin translit (@var{string}, @var{chars}, @ovar{replacement})
5792 Expands to @var{string}, with each character that occurs in
5793 @var{chars} translated into the character from @var{replacement} with
5796 If @var{replacement} is shorter than @var{chars}, the excess characters
5797 of @var{chars} are deleted from the expansion; if @var{chars} is
5798 shorter, the excess characters in @var{replacement} are silently
5799 ignored. If @var{replacement} is omitted, all characters in
5800 @var{string} that are present in @var{chars} are deleted from the
5801 expansion. If a character appears more than once in @var{chars}, only
5802 the first instance is used in making the translation. Only a single
5803 translation pass is made, even if characters in @var{replacement} also
5804 appear in @var{chars}.
5806 As a GNU extension, both @var{chars} and @var{replacement} can
5807 contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
5808 letters) or @samp{0-9} (meaning all digits). To include a dash @samp{-}
5809 in @var{chars} or @var{replacement}, place it first or last in the
5810 entire string, or as the last character of a range. Back-to-back ranges
5811 can share a common endpoint. It is not an error for the last character
5812 in the range to be `larger' than the first. In that case, the range
5813 runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
5814 The expansion of a range is dependent on the underlying encoding of
5815 characters, so using ranges is not always portable between machines.
5817 The macro @code{translit} is recognized only with parameters.
5821 translit(`GNUs not Unix', `A-Z')
5823 translit(`GNUs not Unix', `a-z', `A-Z')
5824 @result{}GNUS NOT UNIX
5825 translit(`GNUs not Unix', `A-Z', `z-a')
5826 @result{}tmfs not fnix
5827 translit(`+,-12345', `+--1-5', `<;>a-c-a')
5829 translit(`abcdef', `aabdef', `bcged')
5833 In the @sc{ascii} encoding, the first example deletes all uppercase
5834 letters, the second converts lowercase to uppercase, and the third
5835 `mirrors' all uppercase letters, while converting them to lowercase.
5836 The two first cases are by far the most common, even though they are not
5837 portable to @sc{ebcdic} or other encodings. The fourth example shows a
5838 range ending in @samp{-}, as well as back-to-back ranges. The final
5839 example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
5840 resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
5841 @samp{e} are swapped, and the @samp{f} is discarded.
5844 @comment No need to fight 8-bit characters, as it is difficult to get
5845 @comment rendering right in both info and dvi.
5848 translit(`«abc~', `~-»')
5852 @comment Stress test short arguments, since they use a different code
5855 translit(`abcdeabcde', `a')
5857 translit(`abcdeabcde', `ab')
5859 translit(`abcdeabcde', `a', `f')
5861 translit(`abcdeabcde', `a', `f')
5863 translit(`abcdeabcde', `a', `fg')
5865 translit(`abcdeabcde', `ab', `f')
5867 translit(`abcdeabcde', `ab', `fg')
5869 translit(`abcdeabcde', `ab', `ba')
5871 translit(`abcdeabcde', `e', `f')
5873 translit(`abc', `', `cde')
5875 translit(`', `a', `bc')
5880 Omitting @var{chars} evokes a warning, but still produces output.
5884 @error{}m4:stdin:1: Warning: too few arguments to builtin `translit'
5889 @section Substituting text by regular expression
5891 @cindex basic regular expressions
5892 @cindex regular expressions
5893 @cindex expressions, regular
5894 @cindex pattern substitution
5895 @cindex substitution by regular expression
5896 @cindex GNU extensions
5897 Global substitution in a string is done by @code{patsubst}:
5899 @deffn Builtin patsubst (@var{string}, @var{regexp}, @ovar{replacement})
5900 Searches @var{string} for matches of @var{regexp}, and substitutes
5901 @var{replacement} for each match. The syntax for regular expressions
5902 is the same as in GNU Emacs (@pxref{Regexp}).
5904 The parts of @var{string} that are not covered by any match of
5905 @var{regexp} are copied to the expansion. Whenever a match is found, the
5906 search proceeds from the end of the match, so a character from
5907 @var{string} will never be substituted twice. If @var{regexp} matches a
5908 string of zero length, the start position for the search is incremented,
5909 to avoid infinite loops.
5911 When a replacement is to be made, @var{replacement} is inserted into
5912 the expansion, with @samp{\@var{n}} substituted by the text matched by
5913 the @var{n}th parenthesized sub-expression of @var{patsubst}, for up to
5914 nine sub-expressions. The escape @samp{\&} is replaced by the text of
5915 the entire regular expression matched. For all other characters,
5916 @samp{\} treats the next character literally. A warning is issued if
5917 there were fewer sub-expressions than the @samp{\@var{n}} requested, or
5918 if there is a trailing @samp{\}.
5920 The @var{replacement} argument can be omitted, in which case the text
5921 matched by @var{regexp} is deleted.
5923 The macro @code{patsubst} is recognized only with parameters.
5927 patsubst(`GNUs not Unix', `^', `OBS: ')
5928 @result{}OBS: GNUs not Unix
5929 patsubst(`GNUs not Unix', `\<', `OBS: ')
5930 @result{}OBS: GNUs OBS: not OBS: Unix
5931 patsubst(`GNUs not Unix', `\w*', `(\&)')
5932 @result{}(GNUs)() (not)() (Unix)()
5933 patsubst(`GNUs not Unix', `\w+', `(\&)')
5934 @result{}(GNUs) (not) (Unix)
5935 patsubst(`GNUs not Unix', `[A-Z][a-z]+')
5936 @result{}GN not@w{ }
5937 patsubst(`GNUs not Unix', `not', `NOT\')
5938 @error{}m4:stdin:6: Warning: trailing \ ignored in replacement
5939 @result{}GNUs NOT Unix
5942 Here is a slightly more realistic example, which capitalizes individual
5943 words or whole sentences, by substituting calls of the macros
5944 @code{upcase} and @code{downcase} into the strings.
5946 @deffn Composite upcase (@var{text})
5947 @deffnx Composite downcase (@var{text})
5948 @deffnx Composite capitalize (@var{text})
5949 Expand to @var{text}, but with capitalization changed: @code{upcase}
5950 changes all letters to upper case, @code{downcase} changes all letters
5951 to lower case, and @code{capitalize} changes the first character of each
5952 word to upper case and the remaining characters to lower case.
5955 First, an example of their usage, using implementations distributed in
5956 @file{m4-@value{VERSION}/@/examples/@/capitalize.m4}.
5960 $ @kbd{m4 -I examples}
5961 include(`capitalize.m4')
5963 upcase(`GNUs not Unix')
5964 @result{}GNUS NOT UNIX
5965 downcase(`GNUs not Unix')
5966 @result{}gnus not unix
5967 capitalize(`GNUs not Unix')
5968 @result{}Gnus Not Unix
5971 Now for the implementation. There is a helper macro @code{_capitalize}
5972 which puts only its first word in mixed case. Then @code{capitalize}
5973 merely parses out the words, and replaces them with an invocation of
5974 @code{_capitalize}. (As presented here, the @code{capitalize} macro has
5975 some subtle flaws. You should try to see if you can find and correct
5976 them; or @pxref{Improved capitalize, , Answers}).
5980 $ @kbd{m4 -I examples}
5981 undivert(`capitalize.m4')dnl
5982 @result{}divert(`-1')
5983 @result{}# upcase(text)
5984 @result{}# downcase(text)
5985 @result{}# capitalize(text)
5986 @result{}# change case of text, simple version
5987 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
5988 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
5989 @result{}define(`_capitalize',
5990 @result{} `regexp(`$1', `^\(\w\)\(\w*\)',
5991 @result{} `upcase(`\1')`'downcase(`\2')')')
5992 @result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
5993 @result{}divert`'dnl
5996 While @code{regexp} replaces the whole input with the replacement as
5997 soon as there is a match, @code{patsubst} replaces each
5998 @emph{occurrence} of a match and preserves non-matching pieces:
6004 patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
6005 @result{}bar FOO baz FOO
6007 patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
6008 @result{}bab abb 212
6012 Omitting @var{regexp} evokes a warning, but still produces output;
6013 contrast this with an empty @var{regexp} argument.
6017 @error{}m4:stdin:1: Warning: too few arguments to builtin `patsubst'
6021 patsubst(`abc', `', `\\-')
6022 @result{}\-a\-b\-c\-
6026 @section Formatting strings (printf-like)
6028 @cindex formatted output
6029 @cindex output, formatted
6030 @cindex GNU extensions
6031 Formatted output can be made with @code{format}:
6033 @deffn Builtin format (@var{format-string}, @dots{})
6034 Works much like the C function @code{printf}. The first argument
6035 @var{format-string} can contain @samp{%} specifications which are
6036 satisfied by additional arguments, and the expansion of @code{format} is
6037 the formatted string.
6039 The macro @code{format} is recognized only with parameters.
6042 Its use is best described by a few examples:
6044 @comment This test is a bit fragile, if someone tries to port to a
6045 @comment platform without infinity.
6047 define(`foo', `The brown fox jumped over the lazy dog')
6049 format(`The string "%s" uses %d characters', foo, len(foo))
6050 @result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
6051 format(`%*.*d', `-1', `-1', `1')
6053 format(`%.0f', `56789.9876')
6055 len(format(`%-*X', `5000', `1'))
6057 ifelse(format(`%010F', `infinity'), ` INF', `success',
6058 format(`%010F', `infinity'), ` INFINITY', `success',
6059 format(`%010F', `infinity'))
6061 ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success',
6062 format(`%.1A', `1.999'), `0X2.0P+0', `success',
6063 format(`%.1A', `1.999'))
6065 format(`%g', `0xa.P+1')
6069 Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
6070 example shows how @code{format} can be used to produce tabular output.
6074 $ @kbd{m4 -I examples}
6075 include(`forloop.m4')
6077 forloop(`i', `1', `10', `format(`%6d squared is %10d
6079 @result{} 1 squared is 1
6080 @result{} 2 squared is 4
6081 @result{} 3 squared is 9
6082 @result{} 4 squared is 16
6083 @result{} 5 squared is 25
6084 @result{} 6 squared is 36
6085 @result{} 7 squared is 49
6086 @result{} 8 squared is 64
6087 @result{} 9 squared is 81
6088 @result{} 10 squared is 100
6092 The builtin @code{format} is modeled after the ANSI C @samp{printf}
6093 function, and supports these @samp{%} specifiers: @samp{c}, @samp{s},
6094 @samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A},
6095 @samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and
6096 @samp{%}; it supports field widths and precisions, and the flags
6097 @samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}. For
6098 integer specifiers, the width modifiers @samp{hh}, @samp{h}, and
6099 @samp{l} are recognized, and for floating point specifiers, the width
6100 modifier @samp{l} is recognized. Items not yet supported include
6101 positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C}
6102 specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L} and @samp{ll}
6103 modifiers, and any platform extensions available in the native
6104 @code{printf}. For more details on the functioning of @code{printf},
6105 see the C Library Manual, or the POSIX specification (for
6106 example, @samp{%a} is supported even on platforms that haven't yet
6107 implemented C99 hexadecimal floating point output natively).
6109 Unrecognized specifiers result in a warning. It is anticipated that a
6110 future release of GNU @code{m4} will support more specifiers,
6111 and give better warnings when various problems such as overflow are
6112 encountered. Likewise, escape sequences are not yet recognized.
6116 @error{}m4:stdin:1: Warning: unrecognized specifier in `%p'
6121 @comment Expose a crash with a bad format string fixed in 1.4.15.
6122 @comment Unfortuntely, 8-bit bytes are hard to check for; but the
6123 @comment exit status is enough to sniff the crash in broken versions.
6125 @comment xerr: ignore
6127 format(`%'format(`%c', `128'))
6133 @chapter Macros for doing arithmetic
6136 @cindex integer arithmetic
6137 Integer arithmetic is included in @code{m4}, with a C-like syntax. As
6138 convenient shorthands, there are builtins for simple increment and
6139 decrement operations.
6142 * Incr:: Decrement and increment operators
6143 * Eval:: Evaluating integer expressions
6147 @section Decrement and increment operators
6149 @cindex decrement operator
6150 @cindex increment operator
6151 Increment and decrement of integers are supported using the builtins
6152 @code{incr} and @code{decr}:
6154 @deffn Builtin incr (@var{number})
6155 @deffnx Builtin decr (@var{number})
6156 Expand to the numerical value of @var{number}, incremented
6157 or decremented, respectively, by one. Except for the empty string, the
6158 expansion is empty if @var{number} could not be parsed.
6160 The macros @code{incr} and @code{decr} are recognized only with
6170 @error{}m4:stdin:3: empty string treated as 0 in builtin `incr'
6173 @error{}m4:stdin:4: empty string treated as 0 in builtin `decr'
6178 @section Evaluating integer expressions
6180 @cindex integer expression evaluation
6181 @cindex evaluation, of integer expressions
6182 @cindex expressions, evaluation of integer
6183 Integer expressions are evaluated with @code{eval}:
6185 @deffn Builtin eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
6186 Expands to the value of @var{expression}. The expansion is empty
6187 if a problem is encountered while parsing the arguments. If specified,
6188 @var{radix} and @var{width} control the format of the output.
6190 Calculations are done with 32-bit signed numbers. Overflow silently
6191 results in wraparound. A warning is issued if division by zero is
6192 attempted, or if @var{expression} could not be parsed.
6194 Expressions can contain the following operators, listed in order of
6195 decreasing precedence.
6201 Unary plus and minus, and bitwise and logical negation
6205 Multiplication, division, and modulo
6207 Addition and subtraction
6211 Relational operators
6217 Bitwise exclusive-or
6226 The macro @code{eval} is recognized only with parameters.
6229 All binary operators, except exponentiation, are left associative. C
6230 operators that perform variable assignment, such as @samp{+=} or
6231 @samp{--}, are not implemented, since @code{eval} only operates on
6232 constants, not variables. Attempting to use them results in an error.
6233 However, since traditional implementations treated @samp{=} as an
6234 undocumented alias for @samp{==} as opposed to an assignment operator,
6235 this usage is supported as a special case. Be aware that a future
6236 version of GNU M4 may support assignment semantics as an
6237 extension when POSIX mode is not requested, and that using
6238 @samp{=} to check equality is not portable.
6243 @error{}m4:stdin:1: Warning: recommend ==, not =, for equality operator
6246 @error{}m4:stdin:2: invalid operator in eval: ++0
6249 @error{}m4:stdin:3: invalid operator in eval: 0 |= 1
6253 Note that some older @code{m4} implementations use @samp{^} as an
6254 alternate operator for the exponentiation, although POSIX
6255 requires the C behavior of bitwise exclusive-or. The precedence of the
6256 negation operators, @samp{~} and @samp{!}, was traditionally lower than
6257 equality. The unary operators could not be used reliably more than once
6258 on the same term without intervening parentheses. The traditional
6259 precedence of the equality operators @samp{==} and @samp{!=} was
6260 identical instead of lower than the relational operators such as
6261 @samp{<}, even through GNU M4 1.4.8. Starting with version
6262 1.4.9, GNU M4 correctly follows POSIX precedence
6263 rules. M4 scripts designed to be portable between releases must be
6264 aware that parentheses may be required to enforce C precedence rules.
6265 Likewise, division by zero, even in the unused branch of a
6266 short-circuiting operator, is not always well-defined in other
6269 Following are some examples where the current version of M4 follows C
6270 precedence rules, but where older versions and some other
6271 implementations of @code{m4} require explicit parentheses to get the
6277 eval(`(1 == 2) > 0')
6287 eval(`+ + - ~ ! ~ 0')
6292 @error{}m4:stdin:9: divide by zero in eval: 0 || 1 / 0
6297 @error{}m4:stdin:11: modulo by zero in eval: 2 && 1 % 0
6301 @cindex GNU extensions
6302 As a GNU extension, the operator @samp{**} performs integral
6303 exponentiation. The operator is right-associative, and if evaluated,
6304 the exponent must be non-negative, and at least one of the arguments
6305 must be non-zero, or a warning is issued.
6310 eval(`(2 ** 3) ** 2')
6318 @error{}m4:stdin:5: divide by zero in eval: 0 ** 0
6320 @error{}m4:stdin:6: negative exponent in eval: 4 ** -2
6324 Within @var{expression}, (but not @var{radix} or @var{width}), numbers
6325 without a special prefix are decimal. A simple @samp{0} prefix
6326 introduces an octal number. @samp{0x} introduces a hexadecimal number.
6327 As GNU extensions, @samp{0b} introduces a binary number.
6328 @samp{0r} introduces a number expressed in any radix between 1 and 36:
6329 the prefix should be immediately followed by the decimal expression of
6330 the radix, a colon, then the digits making the number. For radix 1,
6331 leading zeros are ignored, and all remaining digits must be @samp{1};
6332 for all other radices, the digits are @samp{0}, @samp{1}, @samp{2},
6333 @dots{}. Beyond @samp{9}, the digits are @samp{a}, @samp{b} @dots{} up
6334 to @samp{z}. Lower and upper case letters can be used interchangeably
6335 in numbers prefixes and as number digits.
6337 Parentheses may be used to group subexpressions whenever needed. For the
6338 relational operators, a true relation returns @code{1}, and a false
6339 relation return @code{0}.
6341 Here are a few examples of use of @code{eval}.
6352 eval(index(`Hello world', `llo') >= 0)
6354 eval(`0r1:0111 + 0b100 + 0r3:12')
6356 define(`square', `eval(`($1) ** 2')')
6360 square(square(`5')` + 1')
6362 define(`foo', `666')
6365 @error{}m4:stdin:11: bad expression in eval: foo / 6
6371 As the last two lines show, @code{eval} does not handle macro
6372 names, even if they expand to a valid expression (or part of a valid
6373 expression). Therefore all macros must be expanded before they are
6374 passed to @code{eval}.
6376 Some calculations are not portable to other implementations, since they
6377 have undefined semantics in C, but GNU @code{m4} has
6378 well-defined behavior on overflow. When shifting, an out-of-range shift
6379 amount is implicitly brought into the range of 32-bit signed integers
6380 using an implicit bit-wise and with 0x1f).
6383 define(`max_int', eval(`0x7fffffff'))
6385 define(`min_int', incr(max_int))
6391 ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
6392 @result{}overflow occurred
6394 @result{}-2147483648
6395 eval(`0x80000000 % -1')
6403 If @var{radix} is specified, it specifies the radix to be used in the
6404 expansion. The default radix is 10; this is also the case if
6405 @var{radix} is the empty string. A warning results if the radix is
6406 outside the range of 1 through 36, inclusive. The result of @code{eval}
6407 is always taken to be signed. No radix prefix is output, and for
6408 radices greater than 10, the digits are lower case. The @var{width}
6409 argument specifies the minimum output width, excluding any negative
6410 sign. The result is zero-padded to extend the expansion to the
6411 requested width. A warning results if the width is negative. If
6412 @var{radix} or @var{width} is out of bounds, the expansion of
6413 @code{eval} is empty.
6422 eval(`666', `6', `10')
6424 eval(`-666', `6', `10')
6425 @result{}-0000003030
6428 `0r1:'eval(`10', `1', `11')
6429 @result{}0r1:01111111111
6433 @error{}m4:stdin:9: radix 37 in builtin `eval' out of range
6436 @error{}m4:stdin:10: negative width to builtin `eval'
6439 @error{}m4:stdin:11: empty string treated as 0 in builtin `eval'
6443 @node Shell commands
6444 @chapter Macros for running shell commands
6446 @cindex UNIX commands, running
6447 @cindex executing shell commands
6448 @cindex running shell commands
6449 @cindex shell commands, running
6450 @cindex commands, running shell
6451 There are a few builtin macros in @code{m4} that allow you to run shell
6452 commands from within @code{m4}.
6454 Note that the definition of a valid shell command is system dependent.
6455 On UNIX systems, this is the typical @command{/bin/sh}. But on other
6456 systems, such as native Windows, the shell has a different syntax of
6457 commands that it understands. Some examples in this chapter assume
6458 @command{/bin/sh}, and also demonstrate how to quit early with a known
6459 exit value if this is not the case.
6462 * Platform macros:: Determining the platform
6463 * Syscmd:: Executing simple commands
6464 * Esyscmd:: Reading the output of commands
6465 * Sysval:: Exit status
6466 * Mkstemp:: Making temporary files
6469 @node Platform macros
6470 @section Determining the platform
6472 @cindex platform macros
6473 Sometimes it is desirable for an input file to know which platform
6474 @code{m4} is running on. GNU @code{m4} provides several
6475 macros that are predefined to expand to the empty string; checking for
6476 their existence will confirm platform details.
6478 @deffn {Optional builtin} __gnu__
6479 @deffnx {Optional builtin} __os2__
6480 @deffnx {Optional builtin} os2
6481 @deffnx {Optional builtin} __unix__
6482 @deffnx {Optional builtin} unix
6483 @deffnx {Optional builtin} __windows__
6484 @deffnx {Optional builtin} windows
6485 Each of these macros is conditionally defined as needed to describe the
6486 environment of @code{m4}. If defined, each macro expands to the empty
6487 string. For now, these macros silently ignore all arguments, but in a
6488 future release of M4, they might warn if arguments are present.
6491 When GNU extensions are in effect (that is, when you did not
6492 use the @option{-G} option, @pxref{Limits control, , Invoking m4}),
6493 GNU @code{m4} will define the macro @code{@w{__gnu__}} to
6494 expand to the empty string.
6502 Extensions are ifdef(`__gnu__', `active', `inactive')
6503 @result{}Extensions are active
6506 @comment options: -G
6512 @result{}__gnu__(ignored)
6513 Extensions are ifdef(`__gnu__', `active', `inactive')
6514 @result{}Extensions are inactive
6517 On UNIX systems, GNU @code{m4} will define @code{@w{__unix__}}
6518 by default, or @code{unix} when the @option{-G} option is specified.
6520 On native Windows systems, GNU @code{m4} will define
6521 @code{@w{__windows__}} by default, or @code{windows} when the
6522 @option{-G} option is specified.
6524 On OS/2 systems, GNU @code{m4} will define @code{@w{__os2__}}
6525 by default, or @code{os2} when the @option{-G} option is specified.
6527 If GNU @code{m4} does not provide a platform macro for your system,
6528 please report that as a bug.
6531 define(`provided', `0')
6533 ifdef(`__unix__', `define(`provided', incr(provided))')
6535 ifdef(`__windows__', `define(`provided', incr(provided))')
6537 ifdef(`__os2__', `define(`provided', incr(provided))')
6544 @section Executing simple commands
6546 Any shell command can be executed, using @code{syscmd}:
6548 @deffn Builtin syscmd (@var{shell-command})
6549 Executes @var{shell-command} as a shell command.
6551 The expansion of @code{syscmd} is void, @emph{not} the output from
6552 @var{shell-command}! Output or error messages from @var{shell-command}
6553 are not read by @code{m4}. @xref{Esyscmd}, if you need to process the
6556 Prior to executing the command, @code{m4} flushes its buffers.
6557 The default standard input, output and error of @var{shell-command} are
6558 the same as those of @code{m4}.
6560 By default, the @var{shell-command} will be used as the argument to the
6561 @option{-c} option of the @command{/bin/sh} shell (or the version of
6562 @command{sh} specified by @samp{command -p getconf PATH}, if your system
6563 supports that). If you prefer a different shell, the
6564 @command{configure} script can be given the option
6565 @option{--with-syscmd-shell=@var{location}} to set the location of an
6566 alternative shell at GNU @code{m4} installation; the
6567 alternative shell must still support @option{-c}.
6569 The macro @code{syscmd} is recognized only with parameters.
6573 define(`foo', `FOO')
6580 Note how the expansion of @code{syscmd} keeps the trailing newline of
6581 the command, as well as using the newline that appeared after the macro.
6583 The following is an example of @var{shell-command} using the same
6584 standard input as @code{m4}:
6588 $ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
6593 @comment If the user types the example below with stdin being an
6594 @comment interactive terminal, then cat will hang waiting for additional
6595 @comment input after m4 has exited. But the testsuite is using a pipe
6596 @comment for stdin. Hence, we have two versions - the one we feed the
6597 @comment testsuite below, and the one we display to the user above that
6598 @comment more accurately shows what the testsuite is really doing but
6599 @comment which the testsuite cannot parse.
6602 m4wrap(`syscmd(`cat')')
6608 It tells @code{m4} to read all of its input before executing the wrapped
6609 text, then hand a valid (albeit emptied) pipe as standard input for the
6610 @code{cat} subcommand. Therefore, you should be careful when using
6611 standard input (either by specifying no files, or by passing @samp{-} as
6612 a file name on the command line, @pxref{Command line files, , Invoking
6613 m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd}
6614 that consume data from standard input. When standard input is a
6615 seekable file, the subprocess will pick up with the next character not
6616 yet processed by @code{m4}; when it is a pipe or other non-seekable
6617 file, there is no guarantee how much data will already be buffered by
6618 @code{m4} and thus unavailable to the child.
6621 @section Reading the output of commands
6623 @cindex GNU extensions
6624 If you want @code{m4} to read the output of a shell command, use
6627 @deffn Builtin esyscmd (@var{shell-command})
6628 Expands to the standard output of the shell command
6629 @var{shell-command}.
6631 Prior to executing the command, @code{m4} flushes its buffers.
6632 The default standard input and standard error of @var{shell-command} are
6633 the same as those of @code{m4}. The error output of @var{shell-command}
6634 is not a part of the expansion: it will appear along with the error
6635 output of @code{m4}.
6637 By default, the @var{shell-command} will be used as the argument to the
6638 @option{-c} option of the @command{/bin/sh} shell (or the version of
6639 @command{sh} specified by @samp{command -p getconf PATH}, if your system
6640 supports that). If you prefer a different shell, the
6641 @command{configure} script can be given the option
6642 @option{--with-syscmd-shell=@var{location}} to set the location of an
6643 alternative shell at GNU @code{m4} installation; the
6644 alternative shell must still support @option{-c}.
6646 The macro @code{esyscmd} is recognized only with parameters.
6650 define(`foo', `FOO')
6657 Note how the expansion of @code{esyscmd} keeps the trailing newline of
6658 the command, as well as using the newline that appeared after the macro.
6660 Just as with @code{syscmd}, care must be exercised when sharing standard
6661 input between @code{m4} and the child process of @code{esyscmd}.
6664 @section Exit status
6666 @cindex UNIX commands, exit status from
6667 @cindex exit status from shell commands
6668 @cindex shell commands, exit status from
6669 @cindex commands, exit status from shell
6670 @cindex status of shell commands
6671 To see whether a shell command succeeded, use @code{sysval}:
6673 @deffn Builtin sysval
6674 Expands to the exit status of the last shell command run with
6675 @code{syscmd} or @code{esyscmd}. Expands to 0 if no command has been
6684 ifelse(sysval, `0', `zero', `non-zero')
6696 ifelse(sysval, `0', `zero', `non-zero')
6698 esyscmd(`echo dnl && exit 127')
6708 @code{sysval} results in 127 if there was a problem executing the
6709 command, for example, if the system-imposed argument length is exceeded,
6710 or if there were not enough resources to fork. It is not possible to
6711 distinguish between failed execution and successful execution that had
6712 an exit status of 127, unless there was output from the child process.
6714 On UNIX platforms, where it is possible to detect when command execution
6715 is terminated by a signal, rather than a normal exit, the result is the
6716 signal number shifted left by eight bits.
6718 @comment This test has difficulties being portable, even on platforms
6719 @comment where syscmd invokes /bin/sh. Kill is not portable with signal
6720 @comment names. According to autoconf, the only portable signal numbers
6721 @comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM). But
6722 @comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
6723 @comment exits normally rather than letting the signal terminate it).
6724 @comment Also, TERM is flaky, as it can also kill the running m4 on
6725 @comment systems where /bin/sh does not create its own process group.
6726 @comment And PIPE is unreliable, since people tend to run with it
6727 @comment ignored, with m4 inheriting that choice. That leaves KILL as
6728 @comment the only signal we can reliably test.
6730 dnl This test assumes kill is a shell builtin, and that signals are
6733 `errprint(` skipping: syscmd does not have unix semantics
6735 syscmd(`kill -9 $$')
6743 esyscmd(`kill -9 $$')
6750 @section Making temporary files
6752 @cindex temporary file names
6753 @cindex files, names of temporary
6754 Commands specified to @code{syscmd} or @code{esyscmd} might need a
6755 temporary file, for output or for some other purpose. There is a
6756 builtin macro, @code{mkstemp}, for making a temporary file:
6758 @deffn Builtin mkstemp (@var{template})
6759 @deffnx Builtin maketemp (@var{template})
6760 Expands to the quoted name of a new, empty file, made from the string
6761 @var{template}, which should end with the string @samp{XXXXXX}. The six
6762 @samp{X} characters are then replaced with random characters matching
6763 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
6764 name unique. If fewer than six @samp{X} characters are found at the end
6765 of @code{template}, the result will be longer than the template. The
6766 created file will have access permissions as if by @kbd{chmod =rw,go=},
6767 meaning that the current umask of the @code{m4} process is taken into
6768 account, and at most only the current user can read and write the file.
6770 The traditional behavior, standardized by POSIX, is that
6771 @code{maketemp} merely replaces the trailing @samp{X} with the process
6772 id, without creating a file or quoting the expansion, and without
6773 ensuring that the resulting
6774 string is a unique file name. In part, this means that using the same
6775 @var{template} twice in the same input file will result in the same
6776 expansion. This behavior is a security hole, as it is very easy for
6777 another process to guess the name that will be generated, and thus
6778 interfere with a subsequent use of @code{syscmd} trying to manipulate
6779 that file name. Hence, POSIX has recommended that all new
6780 implementations of @code{m4} provide the secure @code{mkstemp} builtin,
6781 and that users of @code{m4} check for its existence.
6783 The expansion is void and an error issued if a temporary file could
6786 The macros @code{mkstemp} and @code{maketemp} are recognized only with
6790 If you try this next example, you will most likely get different output
6791 for the two file names, since the replacement characters are randomly
6797 define(`tmp', `oops')
6799 maketemp(`/tmp/fooXXXXXX')
6800 @result{}/tmp/fooa07346
6801 ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
6802 `define(`mkstemp', defn(`maketemp'))dnl
6803 errprint(`warning: potentially insecure maketemp implementation
6810 @cindex GNU extensions
6811 Unless you use the @option{--traditional} command line option (or
6812 @option{-G}, @pxref{Limits control, , Invoking m4}), the GNU
6813 version of @code{maketemp} is secure. This means that using the same
6814 template to multiple calls will generate multiple files. However, we
6815 recommend that you use the new @code{mkstemp} macro, introduced in
6816 GNU M4 1.4.8, which is secure even in traditional mode. Also,
6817 as of M4 1.4.11, the secure implementation quotes the resulting file
6818 name, so that you are guaranteed to know what file was created even if
6819 the random file name happens to match an existing macro. Notice that
6820 this example is careful to use @code{defn} to avoid unintended expansion
6825 define(`foo', `errprint(`oops')')
6827 syscmd(`rm -f foo-??????')sysval
6829 define(`file1', maketemp(`foo-XXXXXX'))dnl
6830 ifelse(esyscmd(`echo \` foo-?????? \''), ` foo-?????? ',
6831 `no file', `created')
6833 define(`file2', maketemp(`foo-XX'))dnl
6834 define(`file3', mkstemp(`foo-XXXXXX'))dnl
6835 ifelse(len(defn(`file1')), len(defn(`file2')),
6836 `same length', `different')
6837 @result{}same length
6838 ifelse(defn(`file1'), defn(`file2'), `same', `different file')
6839 @result{}different file
6840 ifelse(defn(`file2'), defn(`file3'), `same', `different file')
6841 @result{}different file
6842 ifelse(defn(`file1'), defn(`file3'), `same', `different file')
6843 @result{}different file
6844 syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3'))
6851 @c Not worth documenting, but make sure we don't leave trailing NUL in
6855 syscmd(`rm -rf foodir')sysval
6857 syscmd(`mkdir foodir')sysval
6859 len(mkstemp(`foodir/fooXXXXX'))
6861 syscmd(`rm -r foodir')sysval
6865 @c Likewise, and ensure that traditional mode leaves the result unquoted
6866 @c without creating a file.
6868 @comment options: -G
6870 syscmd(`rm -f foo-*')sysval
6872 len(maketemp(`foo-XXXXX'))
6873 @error{}m4:stdin:2: recommend using mkstemp instead
6875 define(`abc', `def')
6879 @error{}m4:stdin:4: recommend using mkstemp instead
6880 syscmd(`test -f foo-*')ifelse(sysval, `0', `0', `1')
6886 @chapter Miscellaneous builtin macros
6888 This chapter describes various builtins, that do not really belong in
6889 any of the previous chapters.
6892 * Errprint:: Printing error messages
6893 * Location:: Printing current location
6894 * M4exit:: Exiting from @code{m4}
6898 @section Printing error messages
6900 @cindex printing error messages
6901 @cindex error messages, printing
6902 @cindex messages, printing error
6903 @cindex standard error, output to
6904 You can print error messages using @code{errprint}:
6906 @deffn Builtin errprint (@var{message}, @dots{})
6907 Prints @var{message} and the rest of the arguments to standard error,
6908 separated by spaces. Standard error is used, regardless of the
6909 @option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
6911 The expansion of @code{errprint} is void.
6912 The macro @code{errprint} is recognized only with parameters.
6916 errprint(`Invalid arguments to forloop
6918 @error{}Invalid arguments to forloop
6920 errprint(`1')errprint(`2',`3
6926 A trailing newline is @emph{not} printed automatically, so it should be
6927 supplied as part of the argument, as in the example. Unfortunately, the
6928 exact output of @code{errprint} is not very portable to other @code{m4}
6929 implementations: POSIX requires that all arguments be printed,
6930 but some implementations of @code{m4} only print the first.
6931 Furthermore, some BSD implementations always append a newline
6932 for each @code{errprint} call, regardless of whether the last argument
6933 already had one, and POSIX is silent on whether this is
6937 @section Printing current location
6939 @cindex location, input
6940 @cindex input location
6941 To make it possible to specify the location of an error, three
6942 utility builtins exist:
6944 @deffn Builtin __file__
6945 @deffnx Builtin __line__
6946 @deffnx Builtin __program__
6947 Expand to the quoted name of the current input file, the
6948 current input line number in that file, and the quoted name of the
6949 current invocation of @code{m4}.
6953 errprint(__program__:__file__:__line__: `input error
6955 @error{}m4:stdin:1: input error
6959 Line numbers start at 1 for each file. If the file was found due to the
6960 @option{-I} option or @env{M4PATH} environment variable, that is
6961 reflected in the file name. The syncline option (@option{-s},
6962 @pxref{Preprocessor features, , Invoking m4}), and the
6963 @samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debug Levels}),
6964 also use this notion of current file and line. Redefining the three
6965 location macros has no effect on syncline, debug, warning, or error
6968 This example reuses the file @file{incl.m4} mentioned earlier
6973 $ @kbd{m4 -I examples}
6974 define(`foo', ``$0' called at __file__:__line__')
6977 @result{}foo called at stdin:2
6979 @result{}Include file start
6980 @result{}foo called at examples/incl.m4:2
6981 @result{}Include file end
6985 The location of macros invoked during the rescanning of macro expansion
6986 text corresponds to the location in the file where the expansion was
6987 triggered, regardless of how many newline characters the expansion text
6988 contains. As of GNU M4 1.4.8, the location of text wrapped
6989 with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
6990 @code{m4wrap} was invoked. Previous versions, however, behaved as
6991 though wrapped text came from line 0 of the file ``''.
6994 define(`echo', `$@@')
6996 define(`foo', `echo(__line__
7006 foo(errprint(__line__
7024 The @code{@w{__program__}} macro behaves like @samp{$0} in shell
7025 terminology. If you invoke @code{m4} through an absolute path or a link
7026 with a different spelling, rather than by relying on a @env{PATH} search
7027 for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
7028 The intent is that you can use it to produce error messages with the
7029 same formatting that @code{m4} produces internally. It can also be used
7030 within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
7031 @code{m4} that is currently running, rather than whatever version of
7032 @code{m4} happens to be first in @env{PATH}. It was first introduced in
7036 @section Exiting from @code{m4}
7038 @cindex exiting from @code{m4}
7039 @cindex status, setting @code{m4} exit
7040 If you need to exit from @code{m4} before the entire input has been
7041 read, you can use @code{m4exit}:
7043 @deffn Builtin m4exit (@dvar{code, 0})
7044 Causes @code{m4} to exit, with exit status @var{code}. If @var{code} is
7045 left out, the exit status is zero. If @var{code} cannot be parsed, or
7046 is outside the range of 0 to 255, the exit status is one. No further
7047 input is read, and all wrapped and diverted text is discarded.
7051 m4wrap(`This text is lost due to `m4exit'.')
7053 divert(`1') So is this.
7056 m4exit And this is never read.
7059 A common use of this is to abort processing:
7061 @deffn Composite fatal_error (@var{message})
7062 Abort processing with an error message and non-zero status. Prefix
7063 @var{message} with details about where the error occurred, and print the
7064 resulting string to standard error.
7069 define(`fatal_error',
7070 `errprint(__program__:__file__:__line__`: fatal error: $*
7073 fatal_error(`this is a BAD one, buster')
7074 @error{}m4:stdin:4: fatal error: this is a BAD one, buster
7077 After this macro call, @code{m4} will exit with exit status 1. This macro
7078 is only intended for error exits, since the normal exit procedures are
7079 not followed, i.e., diverted text is not undiverted, and saved text
7080 (@pxref{M4wrap}) is not reread. (This macro could be made more robust
7081 to earlier versions of @code{m4}. You should try to see if you can find
7082 weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
7084 Note that it is still possible for the exit status to be different than
7085 what was requested by @code{m4exit}. If @code{m4} detects some other
7086 error, such as a write error on standard output, the exit status will be
7087 non-zero even if @code{m4exit} requested zero.
7089 If standard input is seekable, then the file will be positioned at the
7090 next unread character. If it is a pipe or other non-seekable file,
7091 then there are no guarantees how much data @code{m4} might have read
7092 into buffers, and thus discarded.
7095 @chapter Fast loading of frozen state
7097 Some bigger @code{m4} applications may be built over a common base
7098 containing hundreds of definitions and other costly initializations.
7099 Usually, the common base is kept in one or more declarative files,
7100 which files are listed on each @code{m4} invocation prior to the
7101 user's input file, or else each input file uses @code{include}.
7103 Reading the common base of a big application, over and over again, may
7104 be time consuming. GNU @code{m4} offers some machinery to
7105 speed up the start of an application using lengthy common bases.
7108 * Using frozen files:: Using frozen files
7109 * Frozen file format:: Frozen file format
7112 @node Using frozen files
7113 @section Using frozen files
7115 @cindex fast loading of frozen files
7116 @cindex frozen files for fast loading
7117 @cindex initialization, frozen state
7118 @cindex dumping into frozen file
7119 @cindex reloading a frozen file
7120 @cindex GNU extensions
7121 Suppose a user has a library of @code{m4} initializations in
7122 @file{base.m4}, which is then used with multiple input files:
7126 $ @kbd{m4 base.m4 input1.m4}
7127 $ @kbd{m4 base.m4 input2.m4}
7128 $ @kbd{m4 base.m4 input3.m4}
7131 Rather than spending time parsing the fixed contents of @file{base.m4}
7132 every time, the user might rather execute:
7136 $ @kbd{m4 -F base.m4f base.m4}
7140 once, and further execute, as often as needed:
7144 $ @kbd{m4 -R base.m4f input1.m4}
7145 $ @kbd{m4 -R base.m4f input2.m4}
7146 $ @kbd{m4 -R base.m4f input3.m4}
7150 with the varying input. The first call, containing the @option{-F}
7151 option, only reads and executes file @file{base.m4}, defining
7152 various application macros and computing other initializations.
7153 Once the input file @file{base.m4} has been completely processed, GNU
7154 @code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
7155 file which contains a kind of snapshot of the @code{m4} internal state.
7157 Later calls, containing the @option{-R} option, are able to reload
7158 the internal state of @code{m4}, from @file{base.m4f},
7159 @emph{prior} to reading any other input files. This means
7160 instead of starting with a virgin copy of @code{m4}, input will be
7161 read after having effectively recovered the effect of a prior run.
7162 In our example, the effect is the same as if file @file{base.m4} has
7163 been read anew. However, this effect is achieved a lot faster.
7165 Only one frozen file may be created or read in any one @code{m4}
7166 invocation. It is not possible to recover two frozen files at once.
7167 However, frozen files may be updated incrementally, through using
7168 @option{-R} and @option{-F} options simultaneously. For example, if
7169 some care is taken, the command:
7173 $ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
7177 could be broken down in the following sequence, accumulating the same
7182 $ @kbd{m4 -F file1.m4f file1.m4}
7183 $ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
7184 $ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
7185 $ @kbd{m4 -R file3.m4f file4.m4}
7188 Some care is necessary because not every effort has been made for
7189 this to work in all cases. In particular, the trace attribute of
7190 macros is not handled, nor the current setting of @code{changeword}.
7191 Currently, @code{m4wrap} and @code{sysval} also have problems.
7192 Also, interactions for some options of @code{m4}, being used in one call
7193 and not in the next, have not been fully analyzed yet. On the other
7194 end, you may be confident that stacks of @code{pushdef} definitions
7195 are handled correctly, as well as undefined or renamed builtins, and
7196 changed strings for quotes or comments. And future releases of
7197 GNU M4 will improve on the utility of frozen files.
7200 @c This example is not worth putting in the manual, but caused core
7201 @c dumps in all versions prior to 1.4.11.
7203 @comment options: -F /dev/null
7205 traceon(`undefined')dnl
7208 @c Make sure freezing is successful.
7212 `errprint(` skipping: syscmd does not have unix semantics
7214 changequote(`[', `]')dnl
7215 syscmd([echo 'changequote([,])pushdef([divnum],[hi])dnl' \
7216 | ']__program__[' -F in.m4f \
7217 && echo 'divnum popdef([divnum])divnum' \
7218 | ']__program__[' -R in.m4f \
7219 && rm in.m4f])status sysval
7224 @c Detect inability to freeze.
7225 @c Some systems harden /, and fail with EACCES rather than ENOENT.
7227 @comment options: -F /none/such
7228 @comment xerr: ignore
7231 $ @kbd{m4 -F /none/such}
7233 @error{}m4: cannot open `/none/such': No such file or directory
7237 When an @code{m4} run is to be frozen, the automatic undiversion
7238 which takes place at end of execution is inhibited. Instead, all
7239 positively numbered diversions are saved into the frozen file.
7240 The active diversion number is also transmitted.
7242 A frozen file to be reloaded need not reside in the current directory.
7243 It is looked up the same way as an @code{include} file (@pxref{Search
7246 If the frozen file was generated with a newer version of @code{m4}, and
7247 contains directives that an older @code{m4} cannot parse, attempting to
7248 load the frozen file with option @option{-R} will cause @code{m4} to
7249 exit with status 63 to indicate version mismatch.
7251 @node Frozen file format
7252 @section Frozen file format
7254 @cindex frozen file format
7255 @cindex file format, frozen file
7256 Frozen files are sharable across architectures. It is safe to write
7257 a frozen file on one machine and read it on another, given that the
7258 second machine uses the same or newer version of GNU @code{m4}.
7259 It is conventional, but not required, to give a frozen file the suffix
7262 These are simple (editable) text files, made up of directives,
7263 each starting with a capital letter and ending with a newline
7264 (@key{NL}). Wherever a directive is expected, the character
7265 @samp{#} introduces a comment line; empty lines are also ignored if they
7266 are not part of an embedded string.
7267 In the following descriptions, each @var{len} refers to the length of
7268 the corresponding strings @var{str} in the next line of input. Numbers
7269 are always expressed in decimal. There are no escape characters. The
7273 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
7274 Uses @var{str1} and @var{str2} as the begin-comment and
7275 end-comment strings. If omitted, then @samp{#} and @key{NL} are the
7278 @item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
7279 Selects diversion @var{number}, making it current, then copy
7280 @var{str} in the current diversion. @var{number} may be a negative
7281 number for a non-existing diversion. To merely specify an active
7282 selection, use this command with an empty @var{str}. With 0 as the
7283 diversion @var{number}, @var{str} will be issued on standard output
7284 at reload time. GNU @code{m4} will not produce the @samp{D}
7285 directive with non-zero length for diversion 0, but this can be done
7286 with manual edits. This directive may
7287 appear more than once for the same diversion, in which case the
7288 diversion is the concatenation of the various uses. If omitted, then
7289 diversion 0 is current.
7291 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
7292 Defines, through @code{pushdef}, a definition for @var{str1}
7293 expanding to the function whose builtin name is @var{str2}. If the
7294 builtin does not exist (for example, if the frozen file was produced by
7295 a copy of @code{m4} compiled with changeword support, but the version
7296 of @code{m4} reloading was compiled without it), the reload is silent,
7297 but any subsequent use of the definition of @var{str1} will result in
7298 a warning. This directive may appear more than once for the same name,
7299 and its order, along with @samp{T}, is important. If omitted, you will
7300 have no access to any builtins.
7302 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
7303 Uses @var{str1} and @var{str2} as the begin-quote and end-quote
7304 strings. If omitted, then @samp{`} and @samp{'} are the quote
7307 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
7308 Defines, though @code{pushdef}, a definition for @var{str1}
7309 expanding to the text given by @var{str2}. This directive may appear
7310 more than once for the same name, and its order, along with @samp{F}, is
7313 @item V @var{number} @key{NL}
7314 Confirms the format of the file. @code{m4} @value{VERSION} only creates
7315 and understands frozen files where @var{number} is 1. This directive
7316 must be the first non-comment in the file, and may not appear more than
7321 @chapter Compatibility with other versions of @code{m4}
7323 @cindex compatibility
7324 This chapter describes the many of the differences between this
7325 implementation of @code{m4}, and of other implementations found under
7326 UNIX, such as System V Release 3, Solaris, and BSD flavors.
7327 In particular, it lists the known differences and extensions to
7328 POSIX. However, the list is not necessarily comprehensive.
7330 At the time of this writing, POSIX 2001 (also known as IEEE
7331 Std 1003.1-2001) is the latest standard, although a new version of
7332 POSIX is under development and includes several proposals for
7333 modifying what @code{m4} is required to do. The requirements for
7334 @code{m4} are shared between SUSv3 and POSIX, and
7336 @uref{http://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}.
7339 * Extensions:: Extensions in GNU M4
7340 * Incompatibilities:: Facilities in System V m4 not in GNU M4
7341 * Other Incompatibilities:: Other incompatibilities
7345 @section Extensions in GNU M4
7347 @cindex GNU extensions
7349 This version of @code{m4} contains a few facilities that do not exist
7350 in System V @code{m4}. These extra facilities are all suppressed by
7351 using the @option{-G} command line option (@pxref{Limits control, ,
7352 Invoking m4}), unless overridden by other command line options.
7356 In the @code{$@var{n}} notation for macro arguments, @var{n} can contain
7357 several digits, while the System V @code{m4} only accepts one digit.
7358 This allows macros in GNU @code{m4} to take any number of
7359 arguments, and not only nine (@pxref{Arguments}).
7361 This means that @code{define(`foo', `$11')} is ambiguous between
7362 implementations. To portably choose between grabbing the first
7363 parameter and appending 1 to the expansion, or grabbing the eleventh
7364 parameter, you can do the following:
7369 dnl First argument, concatenated with 1
7370 define(`_1', `$1')define(`first1', `_1($@@)1')
7372 dnl Eleventh argument, portable
7373 define(`_9', `$9')define(`eleventh', `_9(shift(shift($@@)))')
7375 dnl Eleventh argument, GNU style
7376 define(`Eleventh', `$11')
7378 first1(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
7380 eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
7382 Eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
7387 Also see the @code{argn} macro (@pxref{Shift}).
7390 The @code{divert} (@pxref{Divert}) macro can manage more than 9
7391 diversions. GNU @code{m4} treats all positive numbers as valid
7392 diversions, rather than discarding diversions greater than 9.
7395 Files included with @code{include} and @code{sinclude} are sought in a
7396 user specified search path, if they are not found in the working
7397 directory. The search path is specified by the @option{-I} option and the
7398 @env{M4PATH} environment variable (@pxref{Search Path}).
7401 Arguments to @code{undivert} can be non-numeric, in which case the named
7402 file will be included uninterpreted in the output (@pxref{Undivert}).
7405 Formatted output is supported through the @code{format} builtin, which
7406 is modeled after the C library function @code{printf} (@pxref{Format}).
7409 Searches and text substitution through basic regular expressions are
7410 supported by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
7411 (@pxref{Patsubst}) builtins. Some BSD implementations use
7412 extended regular expressions instead.
7415 The output of shell commands can be read into @code{m4} with
7416 @code{esyscmd} (@pxref{Esyscmd}).
7419 There is indirect access to any builtin macro with @code{builtin}
7423 Macros can be called indirectly through @code{indir} (@pxref{Indir}).
7426 The name of the program, the current input file, and the current input
7427 line number are accessible through the builtins @code{@w{__program__}},
7428 @code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
7431 The format of the output from @code{dumpdef} and macro tracing can be
7432 controlled with @code{debugmode} (@pxref{Debug Levels}).
7435 The destination of trace and debug output can be controlled with
7436 @code{debugfile} (@pxref{Debug Output}).
7439 The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
7440 creating a new file with a unique name on every invocation, rather than
7441 following the insecure behavior of replacing the trailing @samp{X}
7442 characters with the @code{m4} process id.
7445 POSIX only requires support for the command line options
7446 @option{-s}, @option{-D}, and @option{-U}, so all other options accepted
7447 by GNU M4 are extensions. @xref{Invoking m4}, for a
7448 description of these options.
7450 The debugging and tracing facilities in GNU @code{m4} are much
7451 more extensive than in most other versions of @code{m4}.
7454 @node Incompatibilities
7455 @section Facilities in System V @code{m4} not in GNU @code{m4}
7457 The version of @code{m4} from System V contains a few facilities that
7458 have not been implemented in GNU @code{m4} yet. Additionally,
7459 POSIX requires some behaviors that GNU @code{m4} has not
7460 implemented yet. Relying on these behaviors is non-portable, as a
7461 future release of GNU @code{m4} may change.
7465 POSIX requires support for multiple arguments to @code{defn},
7466 without any clarification on how @code{defn} behaves when one of the
7467 multiple arguments names a builtin. System V @code{m4} and some other
7468 implementations allow mixing builtins and text macros into a single
7469 macro. GNU @code{m4} only supports joining multiple text
7470 arguments, although a future implementation may lift this restriction to
7471 behave more like System V@. The only portable way to join text macros
7472 with builtins is via helper macros and implicit concatenation of macro
7476 POSIX requires an application to exit with non-zero status if
7477 it wrote an error message to stderr. This has not yet been consistently
7478 implemented for the various builtins that are required to issue an error
7479 (such as @code{eval} (@pxref{Eval}) when an argument cannot be parsed).
7482 Some traditional implementations only allow reading standard input
7483 once, but GNU @code{m4} correctly handles multiple instances
7484 of @samp{-} on the command line.
7487 POSIX requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
7488 (first-in, first-out) order, but GNU @code{m4} currently uses
7489 LIFO order. Furthermore, POSIX states that only the first
7490 argument to @code{m4wrap} is saved for later evaluation, but
7491 GNU @code{m4} saves and processes all arguments, with output
7492 separated by spaces.
7495 POSIX states that builtins that require arguments, but are
7496 called without arguments, have undefined behavior. Traditional
7497 implementations simply behave as though empty strings had been passed.
7498 For example, @code{a`'define`'b} would expand to @code{ab}. But
7499 GNU @code{m4} ignores certain builtins if they have missing
7500 arguments, giving @code{adefineb} for the above example.
7503 Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
7504 by undefining the entire stack of previous definitions, and if doing
7505 @code{undefine(`f')} first. GNU @code{m4} replaces just the top
7506 definition on the stack, as if doing @code{popdef(`f')} followed by
7507 @code{pushdef(`f',`1')}. POSIX allows either behavior.
7510 POSIX 2001 requires @code{syscmd} (@pxref{Syscmd}) to evaluate
7511 command output for macro expansion, but this was a mistake that is
7512 anticipated to be corrected in the next version of POSIX.
7513 GNU @code{m4} follows traditional behavior in @code{syscmd}
7514 where output is not rescanned, and provides the extension @code{esyscmd}
7515 that does scan the output.
7518 At one point, POSIX required @code{changequote(@var{arg})}
7519 (@pxref{Changequote}) to use newline as the close quote, but this was a
7520 bug, and the next version of POSIX is anticipated to state
7521 that using empty strings or just one argument is unspecified.
7522 Meanwhile, the GNU @code{m4} behavior of treating an empty
7523 end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
7524 repeating the start-quote delimiter, and BSD treats it as leaving the
7525 previous end-quote delimiter unchanged. For predictable results, never
7526 call changequote with just one argument, or with empty strings for
7530 At one point, POSIX required @code{changecom(@var{arg},)}
7531 (@pxref{Changecom}) to make it impossible to end a comment, but this is
7532 a bug, and the next version of POSIX is anticipated to state
7533 that using empty strings is unspecified. Meanwhile, the GNU
7534 @code{m4} behavior of treating an empty end-comment delimiter as newline
7535 is not portable, as BSD treats it as leaving the previous end-comment
7536 delimiter unchanged. It is also impossible in BSD implementations to
7537 disable comments, even though that is required by POSIX. For
7538 predictable results, never call changecom with empty strings for
7542 Most implementations of @code{m4} give macros a higher precedence than
7543 comments when parsing, meaning that if the start delimiter given to
7544 @code{changecom} (@pxref{Changecom}) starts with a macro name, comments
7545 are effectively disabled. POSIX does not specify what the
7546 precedence is, so this version of GNU @code{m4} parser
7547 recognizes comments, then macros, then quoted strings.
7550 Traditional implementations allow argument collection, but not string
7551 and comment processing, to span file boundaries. Thus, if @file{a.m4}
7552 contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
7553 @kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
7554 gives an error message that the end of file was encountered inside a
7555 macro with GNU @code{m4}. On the other hand, traditional
7556 implementations do end of file processing for files included with
7557 @code{include} or @code{sinclude} (@pxref{Include}), while GNU
7558 @code{m4} seamlessly integrates the content of those files. Thus
7559 @code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of
7563 Traditional @code{m4} treats @code{traceon} (@pxref{Trace}) without
7564 arguments as a global variable, independent of named macro tracing.
7565 Also, once a macro is undefined, named tracing of that macro is lost.
7566 On the other hand, when GNU @code{m4} encounters
7567 @code{traceon} without
7568 arguments, it turns tracing on for all existing definitions at the time,
7569 but does not trace future definitions; @code{traceoff} without arguments
7570 turns tracing off for all definitions regardless of whether they were
7571 also traced by name; and tracing by name, such as with @option{-tfoo} at
7572 the command line or @code{traceon(`foo')} in the input, is an attribute
7573 that is preserved even if the macro is currently undefined.
7575 Additionally, while POSIX requires trace output, it makes no
7576 demands on the formatting of that output. Parsing trace output is not
7577 guaranteed to be reliable, even between different releases of
7578 GNU M4; however, the intent is that any future changes in
7579 trace output will only occur under the direction of additional
7580 @code{debugmode} flags (@pxref{Debug Levels}).
7583 POSIX requires @code{eval} (@pxref{Eval}) to treat all
7584 operators with the same precedence as C@. However, earlier versions of
7585 GNU @code{m4} followed the traditional behavior of other
7586 @code{m4} implementations, where bitwise and logical negation (@samp{~}
7587 and @samp{!}) have lower precedence than equality operators; and where
7588 equality operators (@samp{==} and @samp{!=}) had the same precedence as
7589 relational operators (such as @samp{<}). Use explicit parentheses to
7590 ensure proper precedence. As extensions to POSIX,
7591 GNU @code{m4} gives well-defined semantics to operations that
7592 C leaves undefined, such as when overflow occurs, when shifting negative
7593 numbers, or when performing division by zero. POSIX also
7594 requires @samp{=} to cause an error, but many traditional
7595 implementations allowed it as an alias for @samp{==}.
7598 POSIX 2001 requires @code{translit} (@pxref{Translit}) to
7599 treat each character of the second and third arguments literally.
7600 However, it is anticipated that the next version of POSIX will
7601 allow the GNU @code{m4} behavior of treating @samp{-} as a
7605 POSIX requires @code{m4} to honor the locale environment
7606 variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
7607 @env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
7608 implemented in GNU @code{m4}.
7611 POSIX states that only unquoted leading newlines and blanks
7612 (that is, space and tab) are ignored when collecting macro arguments.
7613 However, this appears to be a bug in POSIX, since most
7614 traditional implementations also ignore all whitespace (formfeed,
7615 carriage return, and vertical tab). GNU @code{m4} follows
7616 tradition and ignores all leading unquoted whitespace.
7619 @cindex @env{POSIXLY_CORRECT}
7620 A strictly-compliant POSIX client is not allowed to use
7621 command-line arguments not specified by POSIX. However, since
7622 this version of M4 ignores @env{POSIXLY_CORRECT} and enables the option
7623 @code{--gnu} by default (@pxref{Limits control, , Invoking m4}), a
7624 client desiring to be strictly compliant has no way to disable
7625 GNU extensions that conflict with POSIX when
7626 directly invoking the compiled @code{m4}. A future version of
7627 @code{GNU} M4 will honor the environment variable @env{POSIXLY_CORRECT},
7628 implicitly enabling @option{--traditional} if it is set, in order to
7629 allow a strictly-compliant client. In the meantime, a client needing
7630 strict POSIX compliance can use the workaround of invoking a
7631 shell script wrapper, where the wrapper then adds @option{--traditional}
7632 to the arguments passed to the compiled @code{m4}.
7635 @node Other Incompatibilities
7636 @section Other incompatibilities
7638 There are a few other incompatibilities between this implementation of
7639 @code{m4}, and the System V version.
7643 GNU @code{m4} implements sync lines differently from System V
7644 @code{m4}, when text is being diverted. GNU @code{m4} outputs
7645 the sync lines when the text is being diverted, and System V @code{m4}
7646 when the diverted text is being brought back.
7648 The problem is which lines and file names should be attached to text
7649 that is being, or has been, diverted. System V @code{m4} regards all
7650 the diverted text as being generated by the source line containing the
7651 @code{undivert} call, whereas GNU @code{m4} regards the
7652 diverted text as being generated at the time it is diverted.
7654 The sync line option is used mostly when using @code{m4} as
7655 a front end to a compiler. If a diverted line causes a compiler error,
7656 the error messages should most probably refer to the place where the
7657 diversion was made, and not where it was inserted again.
7659 @comment options: -s
7664 @result{}#line 3 "stdin"
7667 @result{}#line 2 "stdin"
7669 @result{}#line 1 "stdin"
7673 The current @code{m4} implementation has a limitation that the syncline
7674 output at the start of each diversion occurs no matter what, even if the
7675 previous diversion did not end with a newline. This goes contrary to
7676 the claim that synclines appear on a line by themselves, so this
7677 limitation may be corrected in a future version of @code{m4}. In the
7678 meantime, when using @option{-s}, it is wisest to make sure all
7679 diversions end with newline.
7682 GNU @code{m4} makes no attempt at prohibiting self-referential
7693 There is nothing inherently wrong with defining @samp{x} to
7694 return @samp{x}. The wrong thing is to expand @samp{x} unquoted,
7695 because that would cause an infinite rescan loop.
7696 In @code{m4}, one might use macros to hold strings, as we do for
7697 variables in other programming languages, further checking them with:
7701 ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
7705 In cases like this one, an interdiction for a macro to hold its own name
7706 would be a useless limitation. Of course, this leaves more rope for the
7707 GNU @code{m4} user to hang himself! Rescanning hangs may be
7708 avoided through careful programming, a little like for endless loops in
7709 traditional programming languages.
7713 @chapter Correct version of some examples
7715 Some of the examples in this manuals are buggy or not very robust, for
7716 demonstration purposes. Improved versions of these composite macros are
7720 * Improved exch:: Solution for @code{exch}
7721 * Improved forloop:: Solution for @code{forloop}
7722 * Improved foreach:: Solution for @code{foreach}
7723 * Improved copy:: Solution for @code{copy}
7724 * Improved m4wrap:: Solution for @code{m4wrap}
7725 * Improved cleardivert:: Solution for @code{cleardivert}
7726 * Improved capitalize:: Solution for @code{capitalize}
7727 * Improved fatal_error:: Solution for @code{fatal_error}
7731 @section Solution for @code{exch}
7733 The @code{exch} macro (@pxref{Arguments}) as presented requires clients
7734 to double quote their arguments. A nicer definition, which lets
7735 clients follow the rule of thumb of one level of quoting per level of
7736 parentheses, involves adding quotes in the definition of @code{exch}, as
7740 define(`exch', ``$2', `$1'')
7742 define(exch(`expansion text', `macro'))
7745 @result{}expansion text
7748 @node Improved forloop
7749 @section Solution for @code{forloop}
7751 The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
7752 into an infinite loop if given an iterator that is not parsed as a macro
7753 name. It does not do any sanity checking on its numeric bounds, and
7754 only permits decimal numbers for bounds. Here is an improved version,
7755 shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
7756 version also optimizes overhead by calling four macros instead of six
7757 per iteration (excluding those in @var{text}), by not dereferencing the
7758 @var{iterator} in the helper @code{@w{_forloop}}.
7762 $ @kbd{m4 -d -I examples}
7763 undivert(`forloop2.m4')dnl
7764 @result{}divert(`-1')
7765 @result{}# forloop(var, from, to, stmt) - improved version:
7766 @result{}# works even if VAR is not a strict macro name
7767 @result{}# performs sanity check that FROM is larger than TO
7768 @result{}# allows complex numerical expressions in TO and FROM
7769 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
7770 @result{} `pushdef(`$1')_$0(`$1', eval(`$2'),
7771 @result{} eval(`$3'), `$4')popdef(`$1')')')
7772 @result{}define(`_forloop',
7773 @result{} `define(`$1', `$2')$4`'ifelse(`$2', `$3', `',
7774 @result{} `$0(`$1', incr(`$2'), `$3', `$4')')')
7775 @result{}divert`'dnl
7776 include(`forloop2.m4')
7778 forloop(`i', `2', `1', `no iteration occurs')
7780 forloop(`', `1', `2', ` odd iterator name')
7781 @result{} odd iterator name odd iterator name
7782 forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
7783 @result{} 0xa 0xb 0xc
7784 forloop(`i', `a', `b', `non-numeric bounds')
7785 @error{}m4:stdin:6: bad expression in eval (bad input): (a) <= (b)
7789 One other change to notice is that the improved version used @samp{_$0}
7790 rather than @samp{_foreach} to invoke the helper routine. In general,
7791 this is a good practice to follow, because then the set of macros can be
7792 uniformly transformed. The following example shows a transformation
7793 that doubles the current quoting and appends a suffix @samp{2} to each
7794 transformed macro. If @code{foreach} refers to the literal
7795 @samp{_foreach}, then @code{foreach2} invokes @code{_foreach} instead of
7796 the intended @code{_foreach2}, and the mixing of quoting paradigms leads
7797 to an infinite recursion loop in this example.
7799 @comment options: -L9
7803 $ @kbd{m4 -d -L 9 -I examples}
7804 define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
7806 define(`double', `define(`$1'`2',
7807 arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
7809 double(`forloop')double(`_forloop')defn(`forloop2')
7810 @result{}ifelse(eval(``($2) <= ($3)''), ``1'',
7811 @result{} ``pushdef(``$1'')_$0(``$1'', eval(``$2''),
7812 @result{} eval(``$3''), ``$4'')popdef(``$1'')'')
7813 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
7815 changequote(`[', `]')changequote([``], [''])
7817 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
7819 changequote`'include(`forloop.m4')
7821 double(`forloop')double(`_forloop')defn(`forloop2')
7822 @result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'')
7823 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
7825 changequote(`[', `]')changequote([``], [''])
7827 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
7828 @error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
7831 One more optimization is still possible. Instead of repeatedly
7832 assigning a variable then invoking or dereferencing it, it is possible
7833 to pass the current iterator value as a single argument. Coupled with
7834 @code{curry} if other arguments are needed (@pxref{Composition}), or
7835 with helper macros if the argument is needed in more than one place in
7836 the expansion, the output can be generated with three, rather than four,
7837 macros of overhead per iteration. Notice how the file
7838 @file{m4-@value{VERSION}/@/examples/@/forloop3.m4} rearranges the
7839 arguments of the helper @code{_forloop} to take two arguments that are
7840 placed around the current value. By splitting a balanced set of
7841 parantheses across multiple arguments, the helper macro can now be
7842 shared by @code{forloop} and the new @code{forloop_arg}.
7846 $ @kbd{m4 -I examples}
7847 include(`forloop3.m4')
7849 undivert(`forloop3.m4')dnl
7850 @result{}divert(`-1')
7851 @result{}# forloop_arg(from, to, macro) - invoke MACRO(value) for
7852 @result{}# each value between FROM and TO, without define overhead
7853 @result{}define(`forloop_arg', `ifelse(eval(`($1) <= ($2)'), `1',
7854 @result{} `_forloop(`$1', eval(`$2'), `$3(', `)')')')
7855 @result{}# forloop(var, from, to, stmt) - refactored to share code
7856 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
7857 @result{} `pushdef(`$1')_forloop(eval(`$2'), eval(`$3'),
7858 @result{} `define(`$1',', `)$4')popdef(`$1')')')
7859 @result{}define(`_forloop',
7860 @result{} `$3`$1'$4`'ifelse(`$1', `$2', `',
7861 @result{} `$0(incr(`$1'), `$2', `$3', `$4')')')
7862 @result{}divert`'dnl
7863 forloop(`i', `1', `3', ` i')
7865 define(`echo', `$@@')
7867 forloop_arg(`1', `3', ` echo')
7871 forloop_arg(`1', `3', `curry(`pushdef', `a')')
7883 Of course, it is possible to make even more improvements, such as
7884 adding an optional step argument, or allowing iteration through
7885 descending sequences. GNU Autoconf provides some of these
7886 additional bells and whistles in its @code{m4_for} macro.
7888 @node Improved foreach
7889 @section Solution for @code{foreach}
7891 The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
7892 presented earlier each have flaws. First, we will examine and fix the
7893 quadratic behavior of @code{foreachq}:
7897 $ @kbd{m4 -I examples}
7898 include(`foreachq.m4')
7900 traceon(`shift')debugmode(`aq')
7902 foreachq(`x', ``1', `2', `3', `4'', `x
7905 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7906 @error{}m4trace: -2- shift(`1', `2', `3', `4')
7908 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7909 @error{}m4trace: -3- shift(`2', `3', `4')
7910 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7911 @error{}m4trace: -2- shift(`2', `3', `4')
7913 @error{}m4trace: -5- shift(`1', `2', `3', `4')
7914 @error{}m4trace: -4- shift(`2', `3', `4')
7915 @error{}m4trace: -3- shift(`3', `4')
7916 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7917 @error{}m4trace: -3- shift(`2', `3', `4')
7918 @error{}m4trace: -2- shift(`3', `4')
7920 @error{}m4trace: -6- shift(`1', `2', `3', `4')
7921 @error{}m4trace: -5- shift(`2', `3', `4')
7922 @error{}m4trace: -4- shift(`3', `4')
7923 @error{}m4trace: -3- shift(`4')
7926 @cindex quadratic behavior, avoiding
7927 @cindex avoiding quadratic behavior
7928 Each successive iteration was adding more quoted @code{shift}
7929 invocations, and the entire list contents were passing through every
7930 iteration. In general, when recursing, it is a good idea to make the
7931 recursion use fewer arguments, rather than adding additional quoted
7932 uses of @code{shift}. By doing so, @code{m4} uses less memory, invokes
7933 fewer macros, is less likely to run into machine limits, and most
7934 importantly, performs faster. The fixed version of @code{foreachq} can
7935 be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
7939 $ @kbd{m4 -I examples}
7940 include(`foreachq2.m4')
7942 undivert(`foreachq2.m4')dnl
7943 @result{}include(`quote.m4')dnl
7944 @result{}divert(`-1')
7945 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
7946 @result{}# quoted list, improved version
7947 @result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')')
7948 @result{}define(`_arg1q', ``$1'')
7949 @result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
7950 @result{}define(`_foreachq', `ifelse(`$2', `', `',
7951 @result{} `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
7952 @result{}divert`'dnl
7953 traceon(`shift')debugmode(`aq')
7955 foreachq(`x', ``1', `2', `3', `4'', `x
7958 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7960 @error{}m4trace: -3- shift(`2', `3', `4')
7962 @error{}m4trace: -3- shift(`3', `4')
7966 Note that the fixed version calls unquoted helper macros in
7967 @code{@w{_foreachq}} to trim elements immediately; those helper macros
7968 in turn must re-supply the layer of quotes lost in the macro invocation.
7969 Contrast the use of @code{@w{_arg1q}}, which quotes the first list
7970 element, with @code{@w{_arg1}} of the earlier implementation that
7971 returned the first list element directly. Additionally, by calling the
7972 helper method immediately, the @samp{defn(`@var{iterator}')} no longer
7973 contains unexpanded macros.
7975 The astute m4 programmer might notice that the solution above still uses
7976 more memory and macro invocations, and thus more time, than strictly
7977 necessary. Note that @samp{$2}, which contains an arbitrarily long
7978 quoted list, is expanded and rescanned three times per iteration of
7979 @code{_foreachq}. Furthermore, every iteration of the algorithm
7980 effectively unboxes then reboxes the list, which costs a couple of macro
7981 invocations. It is possible to rewrite the algorithm for a bit more
7982 speed by swapping the order of the arguments to @code{_foreachq} in
7983 order to operate on an unboxed list in the first place, and by using the
7984 fixed-length @samp{$#} instead of an arbitrary length list as the key to
7985 end recursion. The result is an overhead of six macro invocations per
7986 loop (excluding any macros in @var{text}), instead of eight. This
7987 alternative approach is available as
7988 @file{m4-@value{VERSION}/@/examples/@/foreach3.m4}:
7992 $ @kbd{m4 -I examples}
7993 include(`foreachq3.m4')
7995 undivert(`foreachq3.m4')dnl
7996 @result{}divert(`-1')
7997 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
7998 @result{}# quoted list, alternate improved version
7999 @result{}define(`foreachq', `ifelse(`$2', `', `',
8000 @result{} `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
8001 @result{}define(`_foreachq', `ifelse(`$#', `3', `',
8002 @result{} `define(`$1', `$4')$2`'$0(`$1', `$2',
8003 @result{} shift(shift(shift($@@))))')')
8004 @result{}divert`'dnl
8005 traceon(`shift')debugmode(`aq')
8007 foreachq(`x', ``1', `2', `3', `4'', `x
8010 @error{}m4trace: -4- shift(`x', `x
8011 @error{}', `', `1', `2', `3', `4')
8012 @error{}m4trace: -3- shift(`x
8013 @error{}', `', `1', `2', `3', `4')
8014 @error{}m4trace: -2- shift(`', `1', `2', `3', `4')
8016 @error{}m4trace: -4- shift(`x', `x
8017 @error{}', `1', `2', `3', `4')
8018 @error{}m4trace: -3- shift(`x
8019 @error{}', `1', `2', `3', `4')
8020 @error{}m4trace: -2- shift(`1', `2', `3', `4')
8022 @error{}m4trace: -4- shift(`x', `x
8023 @error{}', `2', `3', `4')
8024 @error{}m4trace: -3- shift(`x
8025 @error{}', `2', `3', `4')
8026 @error{}m4trace: -2- shift(`2', `3', `4')
8028 @error{}m4trace: -4- shift(`x', `x
8029 @error{}', `3', `4')
8030 @error{}m4trace: -3- shift(`x
8031 @error{}', `3', `4')
8032 @error{}m4trace: -2- shift(`3', `4')
8035 In the current version of M4, every instance of @samp{$@@} is rescanned
8036 as it is encountered. Thus, the @file{foreachq3.m4} alternative uses
8037 much less memory than @file{foreachq2.m4}, and executes as much as 10%
8038 faster, since each iteration encounters fewer @samp{$@@}. However, the
8039 implementation of rescanning every byte in @samp{$@@} is quadratic in
8040 the number of bytes scanned (for example, making the broken version in
8041 @file{foreachq.m4} cubic, rather than quadratic, in behavior). A future
8042 release of M4 will improve the underlying implementation by reusing
8043 results of previous scans, so that both styles of @code{foreachq} can
8044 become linear in the number of bytes scanned. Notice how the
8045 implementation injects an empty argument prior to expanding @samp{$2}
8046 within @code{foreachq}; the helper macro @code{_foreachq} then ignores
8047 the third argument altogether, and ends recursion when there are three
8048 arguments left because there was nothing left to pass through
8049 @code{shift}. Thus, each iteration only needs one @code{ifelse}, rather
8050 than the two conditionals used in the version from @file{foreachq2.m4}.
8052 @cindex nine arguments, more than
8053 @cindex more than nine arguments
8054 @cindex arguments, more than nine
8055 So far, all of the implementations of @code{foreachq} presented have
8056 been quadratic with M4 1.4.x. But @code{forloop} is linear, because
8057 each iteration parses a constant amount of arguments. So, it is
8058 possible to design a variant that uses @code{forloop} to do the
8059 iteration, then uses @samp{$@@} only once at the end, giving a linear
8060 result even with older M4 implementations. This implementation relies
8061 on the GNU extension that @samp{$10} expands to the tenth
8062 argument rather than the first argument concatenated with @samp{0}. The
8063 trick is to define an intermediate macro that repeats the text
8064 @code{m4_define(`$1', `$@var{n}')$2`'}, with @samp{n} set to successive
8065 integers corresponding to each argument. The helper macro
8066 @code{_foreachq_} is needed in order to generate the literal sequences
8067 such as @samp{$1} into the intermediate macro, rather than expanding
8068 them as the arguments of @code{_foreachq}. With this approach, no
8069 @code{shift} calls are even needed! Even though there are seven macros
8070 of overhead per iteration instead of six in @file{foreachq3.m4}, the
8071 linear scaling is apparent at relatively small list sizes. However,
8072 this approach will need adjustment when a future version of M4 follows
8073 POSIX by no longer treating @samp{$10} as the tenth argument;
8074 the anticipation is that @samp{$@{10@}} can be used instead, although
8075 that alternative syntax is not yet supported.
8079 $ @kbd{m4 -I examples}
8080 include(`foreachq4.m4')
8082 undivert(`foreachq4.m4')dnl
8083 @result{}include(`forloop2.m4')dnl
8084 @result{}divert(`-1')
8085 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
8086 @result{}# quoted list, version based on forloop
8087 @result{}define(`foreachq',
8088 @result{}`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
8089 @result{}define(`_foreachq',
8090 @result{}`pushdef(`$1', forloop(`$1', `3', `$#',
8091 @result{} `$0_(`1', `2', indir(`$1'))')`popdef(
8092 @result{} `$1')')indir(`$1', $@@)')
8093 @result{}define(`_foreachq_',
8094 @result{}``define(`$$1', `$$3')$$2`''')
8095 @result{}divert`'dnl
8096 traceon(`shift')debugmode(`aq')
8098 foreachq(`x', ``1', `2', `3', `4'', `x
8106 For yet another approach, the improved version of @code{foreach},
8107 available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
8108 overquotes the arguments to @code{@w{_foreach}} to begin with, using
8109 @code{dquote_elt}. Then @code{@w{_foreach}} can just use
8110 @code{@w{_arg1}} to remove the extra layer of quoting that was added up
8115 $ @kbd{m4 -I examples}
8116 include(`foreach2.m4')
8118 undivert(`foreach2.m4')dnl
8119 @result{}include(`quote.m4')dnl
8120 @result{}divert(`-1')
8121 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
8122 @result{}# parenthesized list, improved version
8123 @result{}define(`foreach', `pushdef(`$1')_$0(`$1',
8124 @result{} (dquote(dquote_elt$2)), `$3')popdef(`$1')')
8125 @result{}define(`_arg1', `$1')
8126 @result{}define(`_foreach', `ifelse(`$2', `(`')', `',
8127 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
8128 @result{}divert`'dnl
8129 traceon(`shift')debugmode(`aq')
8131 foreach(`x', `(`1', `2', `3', `4')', `x
8133 @error{}m4trace: -4- shift(`1', `2', `3', `4')
8134 @error{}m4trace: -4- shift(`2', `3', `4')
8135 @error{}m4trace: -4- shift(`3', `4')
8137 @error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
8139 @error{}m4trace: -3- shift(``2'', ``3'', ``4'')
8141 @error{}m4trace: -3- shift(``3'', ``4'')
8143 @error{}m4trace: -3- shift(``4'')
8146 It is likewise possible to write a variant of @code{foreach} that
8147 performs in linear time on M4 1.4.x; the easiest method is probably
8148 writing a version of @code{foreach} that unboxes its list, then invokes
8149 @code{_foreachq} as previously defined in @file{foreachq4.m4}.
8151 In summary, recursion over list elements is trickier than it appeared at
8152 first glance, but provides a powerful idiom within @code{m4} processing.
8153 As a final demonstration, both list styles are now able to handle
8154 several scenarios that would wreak havoc on one or both of the original
8155 implementations. This points out one other difference between the
8156 list styles. @code{foreach} evaluates unquoted list elements only once,
8157 in preparation for calling @code{@w{_foreach}}, similary for
8158 @code{foreachq} as provided by @file{foreachq3.m4} or
8159 @file{foreachq4.m4}. But
8160 @code{foreachq}, as provided by @file{foreachq2.m4},
8161 evaluates unquoted list elements twice while visiting the first list
8162 element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}. When
8163 deciding which list style to use, one must take into account whether
8164 repeating the side effects of unquoted list elements will have any
8165 detrimental effects.
8169 $ @kbd{m4 -I examples}
8170 include(`foreach2.m4')
8172 include(`foreachq2.m4')
8175 foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
8177 dnl 1-element list of empty element
8178 foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
8180 dnl 2-element list of empty elements
8181 foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
8182 @result{}<><> / <><>
8183 dnl 1-element list of a comma
8184 foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
8186 dnl 2-element list of unbalanced parentheses
8187 foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
8188 @result{}<(><)> / <(><)>
8189 define(`ab', `oops')dnl using defn(`iterator')
8190 foreach(`x', `(`a', `b')', `defn(`x')') /dnl
8191 foreachq(`x', ``a', `b'', `defn(`x')')
8193 define(`active', `ACT, IVE')
8197 dnl list of unquoted macros; expansion occurs before recursion
8198 foreach(`x', `(active, active)', `<x>
8200 @error{}m4trace: -4- active -> `ACT, IVE'
8201 @error{}m4trace: -4- active -> `ACT, IVE'
8206 foreachq(`x', `active, active', `<x>
8208 @error{}m4trace: -3- active -> `ACT, IVE'
8209 @error{}m4trace: -3- active -> `ACT, IVE'
8211 @error{}m4trace: -3- active -> `ACT, IVE'
8212 @error{}m4trace: -3- active -> `ACT, IVE'
8216 dnl list of quoted macros; expansion occurs during recursion
8217 foreach(`x', `(`active', `active')', `<x>
8219 @error{}m4trace: -1- active -> `ACT, IVE'
8221 @error{}m4trace: -1- active -> `ACT, IVE'
8223 foreachq(`x', ``active', `active'', `<x>
8225 @error{}m4trace: -1- active -> `ACT, IVE'
8227 @error{}m4trace: -1- active -> `ACT, IVE'
8229 dnl list of double-quoted macro names; no expansion
8230 foreach(`x', `(``active'', ``active'')', `<x>
8234 foreachq(`x', ```active'', ``active''', `<x>
8241 @comment Not worth putting in the manual, but make sure that foreach
8242 @comment implementations behave, and that final implementation is
8245 @comment boxed recursion
8248 @comment options: -Dlimit=10 -Dverbose
8250 $ @kbd {m4 -I examples -Dlimit=10 -Dverbose}
8251 include(`loop.m4')dnl
8252 @result{} 1 2 3 4 5 6 7 8 9 10
8255 @comment unboxed recursion
8258 @comment options: -Dlimit=10 -Dverbose -Dalt
8260 $ @kbd {m4 -I examples -Dlimit=10 -Dverbose -Dalt}
8261 include(`loop.m4')dnl
8262 @result{} 1 2 3 4 5 6 7 8 9 10
8265 @comment foreach via forloop recursion
8268 @comment options: -Dlimit=10 -Dverbose -Dalt=4
8270 $ @kbd {m4 -I examples -Dlimit=10 -Dverbose -Dalt=4}
8271 include(`loop.m4')dnl
8272 @result{} 1 2 3 4 5 6 7 8 9 10
8276 @comment options: -Dlimit=2500 -Dalt=4
8278 $ @kbd {m4 -I examples -Dlimit=2500 -Dalt=4}
8279 include(`loop.m4')dnl
8283 @comment options: -Dlimit=10000 -Dalt=4
8285 $ @kbd {m4 -I examples -Dlimit=10000 -Dalt=4}
8286 define(`foo', `divert`'len(popdef(`_foreachq')_foreachq($@@))')dnl
8287 define(`debug', `pushdef(`_foreachq', defn(`foo'))')
8289 include(`loop.m4')dnl
8296 @section Solution for @code{copy}
8298 The macro @code{copy} presented above
8299 is unable to handle builtin tokens with M4 1.4.x, because it tries to
8300 pass the builtin token through the macro @code{curry}, where it is
8301 silently flattened to an empty string (@pxref{Composition}). Rather
8302 than using the problematic @code{curry} to work around the limitation
8303 that @code{stack_foreach} expects to invoke a macro that takes exactly
8304 one argument, we can write a new macro that lets us form the exact
8305 two-argument @code{pushdef} call sequence needed, so that we are no
8306 longer passing a builtin token through a text macro.
8308 @deffn Composite stack_foreach_sep (@var{macro}, @var{pre}, @var{post}, @
8310 @deffnx Composite stack_foreach_sep_lifo (@var{macro}, @var{pre}, @
8311 @var{post}, @var{sep})
8312 For each of the @code{pushdef} definitions associated with @var{macro},
8313 expand the sequence @samp{@var{pre}`'definition`'@var{post}}.
8314 Additionally, expand @var{sep} between definitions.
8315 @code{stack_foreach_sep} visits the oldest definition first, while
8316 @code{stack_foreach_sep_lifo} visits the current definition first. The
8317 expansion may dereference @var{macro}, but should not modify it. There
8318 are a few special macros, such as @code{defn}, which cannot be used as
8319 the @var{macro} parameter.
8322 Note that @code{stack_foreach(`@var{macro}', `@var{action}')} is
8323 equivalent to @code{stack_foreach_sep(`@var{macro}', `@var{action}(',
8324 `)')}. By supplying explicit parentheses, split among the @var{pre} and
8325 @var{post} arguments to @code{stack_foreach_sep}, it is now possible to
8326 construct macro calls with more than one argument, without passing
8327 builtin tokens through a macro call. It is likewise possible to
8328 directly reference the stack definitions without a macro call, by
8329 leaving @var{pre} and @var{post} empty. Thus, in addition to fixing
8330 @code{copy} on builtin tokens, it also executes with fewer macro
8333 The new macro also adds a separator that is only output after the first
8334 iteration of the helper @code{_stack_reverse_sep}, implemented by
8335 prepending the original @var{sep} to @var{pre} and omitting a @var{sep}
8336 argument in subsequent iterations. Note that the empty string that
8337 separates @var{sep} from @var{pre} is provided as part of the fourth
8338 argument when originally calling @code{_stack_reverse_sep}, and not by
8339 writing @code{$4`'$3} as the third argument in the recursive call; while
8340 the other approach would give the same output, it does so at the expense
8341 of increasing the argument size on each iteration of
8342 @code{_stack_reverse_sep}, which results in quadratic instead of linear
8343 execution time. The improved stack walking macros are available in
8344 @file{m4-@value{VERSION}/@/examples/@/stack_sep.m4}:
8348 $ @kbd{m4 -I examples}
8349 include(`stack_sep.m4')
8351 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
8353 `stack_foreach_sep(`$1', `pushdef(`$2',', `)')')')dnl
8354 pushdef(`a', `1')pushdef(`a', defn(`divnum'))
8364 pushdef(`c', `1')pushdef(`c', `2')
8366 stack_foreach_sep_lifo(`c', `', `', `, ')
8368 undivert(`stack_sep.m4')dnl
8369 @result{}divert(`-1')
8370 @result{}# stack_foreach_sep(macro, pre, post, sep)
8371 @result{}# Invoke PRE`'defn`'POST with a single argument of each definition
8372 @result{}# from the definition stack of MACRO, starting with the oldest, and
8373 @result{}# separated by SEP between definitions.
8374 @result{}define(`stack_foreach_sep',
8375 @result{}`_stack_reverse_sep(`$1', `tmp-$1')'dnl
8376 @result{}`_stack_reverse_sep(`tmp-$1', `$1', `$2`'defn(`$1')$3', `$4`'')')
8377 @result{}# stack_foreach_sep_lifo(macro, pre, post, sep)
8378 @result{}# Like stack_foreach_sep, but starting with the newest definition.
8379 @result{}define(`stack_foreach_sep_lifo',
8380 @result{}`_stack_reverse_sep(`$1', `tmp-$1', `$2`'defn(`$1')$3', `$4`'')'dnl
8381 @result{}`_stack_reverse_sep(`tmp-$1', `$1')')
8382 @result{}define(`_stack_reverse_sep',
8383 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0(
8384 @result{} `$1', `$2', `$4$3')')')
8385 @result{}divert`'dnl
8389 @comment Not worth putting in the manual, but make sure that
8390 @comment stack_foreach_sep has linear performance.
8394 $ @kbd {m4 -I examples}
8395 include(`forloop3.m4')include(`stack_sep.m4')dnl
8396 forloop(`i', `1', `10000', `pushdef(`s', i)')
8398 define(`colon', `:')define(`dash', `-')
8400 len(stack_foreach_sep(`s', `dash', `', `colon'))
8405 @node Improved m4wrap
8406 @section Solution for @code{m4wrap}
8408 The replacement @code{m4wrap} versions presented above, designed to
8409 guarantee FIFO or LIFO order regardless of the underlying M4
8410 implementation, share a bug when dealing with wrapped text that looks
8411 like parameter expansion. Note how the invocation of
8412 @code{m4wrap@var{n}} interprets these parameters, while using the
8413 builtin preserves them for their intended use.
8417 $ @kbd{m4 -I examples}
8418 include(`wraplifo.m4')
8420 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
8423 builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
8427 @result{}bar:-a-a,b-2-
8428 @result{}m4wrap0:---0-
8431 Additionally, the computation of @code{_m4wrap_level} and creation of
8432 multiple @code{m4wrap@var{n}} placeholders in the original examples is
8433 more expensive in time and memory than strictly necessary. Notice how
8434 the improved version grabs the wrapped text via @code{defn} to avoid
8435 parameter expansion, then undefines @code{_m4wrap_text}, before
8436 stripping a level of quotes with @code{_arg1} to expand the text. That
8437 way, each level of wrapping reuses the single placeholder, which starts
8438 each nesting level in an undefined state.
8440 Finally, it is worth emulating the GNU M4 extension of saving
8441 all arguments to @code{m4wrap}, separated by a space, rather than saving
8442 just the first argument. This is done with the @code{join} macro
8443 documented previously (@pxref{Shift}). The improved LIFO example is
8444 shipped as @file{m4-@value{VERSION}/@/examples/@/wraplifo2.m4}, and can
8445 easily be converted to a FIFO solution by swapping the adjacent
8446 invocations of @code{joinall} and @code{defn}.
8450 $ @kbd{m4 -I examples}
8451 include(`wraplifo2.m4')
8453 undivert(`wraplifo2.m4')dnl
8454 @result{}dnl Redefine m4wrap to have LIFO semantics, improved example.
8455 @result{}include(`join.m4')dnl
8456 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
8457 @result{}define(`_arg1', `$1')dnl
8458 @result{}define(`m4wrap',
8459 @result{}`ifdef(`_$0_text',
8460 @result{} `define(`_$0_text', joinall(` ', $@@)defn(`_$0_text'))',
8461 @result{} `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
8462 @result{}define(`_$0_text', joinall(` ', $@@))')')dnl
8463 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
8467 m4wrap(`nested', `', `$@@
8472 @result{}foo:-a-a,b-2-
8476 @node Improved cleardivert
8477 @section Solution for @code{cleardivert}
8479 The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
8480 called without arguments to clear all pending diversions. That is
8481 because using undivert with an empty string for an argument is different
8482 than using it with no arguments at all. Compare the earlier definition
8483 with one that takes the number of arguments into account:
8486 define(`cleardivert',
8487 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
8497 define(`cleardivert',
8498 `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
8499 `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
8510 @node Improved capitalize
8511 @section Solution for @code{capitalize}
8513 The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
8514 not allow clients to follow the quoting rule of thumb. Consider the
8515 three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
8516 difference between calling @code{capitalize} with the expansion of a
8517 macro, expanding the result of a case change, and changing the case of a
8518 double-quoted string:
8522 $ @kbd{m4 -I examples}
8523 include(`capitalize.m4')dnl
8524 define(`active', `act1, ive')dnl
8525 define(`Active', `Act2, Ive')dnl
8526 define(`ACTIVE', `ACT3, IVE')dnl
8537 downcase(``ACTIVE'')
8541 capitalize(`active')
8543 capitalize(``active'')
8544 @result{}_capitalize(`active')
8549 capitalize(`active')
8553 First, when @code{capitalize} is called with more than one argument, it
8554 was throwing away later arguments, whereas @code{upcase} and
8555 @code{downcase} used @samp{$*} to collect them all. The fix is simple:
8556 use @samp{$*} consistently.
8558 Next, with single-quoting, @code{capitalize} outputs a single character,
8559 a set of quotes, then the rest of the characters, making it impossible
8560 to invoke @code{Active} after the fact, and allowing the alternate macro
8561 @code{A} to interfere. Here, the solution is to use additional quoting
8562 in the helper macros, then pass the final over-quoted output string
8563 through @code{_arg1} to remove the extra quoting and finally invoke the
8564 concatenated portions as a single string.
8566 Finally, when passed a double-quoted string, the nested macro
8567 @code{_capitalize} is never invoked because it ended up nested inside
8568 quotes. This one is the toughest to fix. In short, we have no idea how
8569 many levels of quotes are in effect on the substring being altered by
8570 @code{patsubst}. If the replacement string cannot be expressed entirely
8571 in terms of literal text and backslash substitutions, then we need a
8572 mechanism to guarantee that the helper macros are invoked outside of
8573 quotes. In other words, this sounds like a job for @code{changequote}
8574 (@pxref{Changequote}). By changing the active quoting characters, we
8575 can guarantee that replacement text injected by @code{patsubst} always
8576 occurs in the middle of a string that has exactly one level of
8577 over-quoting using alternate quotes; so the replacement text closes the
8578 quoted string, invokes the helper macros, then reopens the quoted
8579 string. In turn, that means the replacement text has unbalanced quotes,
8580 necessitating another round of @code{changequote}.
8582 In the fixed version below, (also shipped as
8583 @file{m4-@value{VERSION}/@/examples/@/capitalize2.m4}), @code{capitalize}
8584 uses the alternate quotes of @samp{<<[} and @samp{]>>} (the longer
8585 strings are chosen so as to be less likely to appear in the text being
8586 converted). The helpers @code{_to_alt} and @code{_from_alt} merely
8587 reduce the number of characters required to perform a
8588 @code{changequote}, since the definition changes twice. The outermost
8589 pair means that @code{patsubst} and @code{_capitalize_alt} are invoked
8590 with alternate quoting; the innermost pair is used so that the third
8591 argument to @code{patsubst} can contain an unbalanced
8592 @samp{]>>}/@samp{<<[} pair. Note that @code{upcase} and @code{downcase}
8593 must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since
8594 they contain nested quotes but are invoked with the alternate quoting
8599 $ @kbd{m4 -I examples}
8600 include(`capitalize2.m4')dnl
8601 define(`active', `act1, ive')dnl
8602 define(`Active', `Act2, Ive')dnl
8603 define(`ACTIVE', `ACT3, IVE')dnl
8604 define(`A', `OOPS')dnl
8605 capitalize(active; `active'; ``active''; ```actIVE''')
8606 @result{}Act1,Ive; Act2, Ive; Active; `Active'
8607 undivert(`capitalize2.m4')dnl
8608 @result{}divert(`-1')
8609 @result{}# upcase(text)
8610 @result{}# downcase(text)
8611 @result{}# capitalize(text)
8612 @result{}# change case of text, improved version
8613 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
8614 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
8615 @result{}define(`_arg1', `$1')
8616 @result{}define(`_to_alt', `changequote(`<<[', `]>>')')
8617 @result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
8618 @result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
8619 @result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
8620 @result{}define(`_capitalize_alt',
8621 @result{} `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
8622 @result{} <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
8623 @result{}define(`capitalize',
8624 @result{} `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
8625 @result{} _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
8626 @result{}divert`'dnl
8629 @node Improved fatal_error
8630 @section Solution for @code{fatal_error}
8632 The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
8633 of GNU M4 earlier than 1.4.8, where invoking
8634 @code{@w{__file__}} (@pxref{Location}) inside @code{m4wrap} would result
8635 in an empty string, and @code{@w{__line__}} resulted in @samp{0} even
8636 though all files start at line 1. Furthermore, versions earlier than
8637 1.4.6 did not support the @code{@w{__program__}} macro. If you want
8638 @code{fatal_error} to work across the entire 1.4.x release series, a
8639 better implementation would be:
8643 define(`fatal_error',
8644 `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
8645 `:ifelse(__line__, `0', `',
8646 `__file__:__line__:')` fatal error: $*
8649 m4wrap(`divnum(`demo of internal message')
8650 fatal_error(`inside wrapped text')')
8653 @error{}m4:stdin:6: Warning: excess arguments to builtin `divnum' ignored
8655 @error{}m4:stdin:6: fatal error: inside wrapped text
8658 @c ========================================================== Appendices
8660 @node Copying This Package
8661 @appendix How to make copies of the overall M4 package
8662 @cindex License, code
8664 This appendix covers the license for copying the source code of the
8665 overall M4 package. This manual is under a different set of
8666 restrictions, covered later (@pxref{Copying This Manual}).
8669 * GNU General Public License:: License for copying the M4 package
8672 @node GNU General Public License
8673 @appendixsec License for copying the M4 package
8674 @cindex GPL, GNU General Public License
8675 @cindex GNU General Public License
8676 @cindex General Public License (GPL), GNU
8677 @include gpl-3.0.texi
8679 @node Copying This Manual
8680 @appendix How to make copies of this manual
8681 @cindex License, manual
8683 This appendix covers the license for copying this manual. Note that
8684 some of the longer examples in this manual are also distributed in the
8685 directory @file{m4-@value{VERSION}/@/examples/}, where a more
8686 permissive license is in effect when copying just the examples.
8689 * GNU Free Documentation License:: License for copying this manual
8692 @node GNU Free Documentation License
8693 @appendixsec License for copying this manual
8694 @cindex FDL, GNU Free Documentation License
8695 @cindex GNU Free Documentation License
8696 @cindex Free Documentation License (FDL), GNU
8697 @include fdl-1.3.texi
8700 @appendix Indices of concepts and macros
8703 * Macro index:: Index for all @code{m4} macros
8704 * Concept index:: Index for many concepts
8708 @appendixsec Index for all @code{m4} macros
8710 This index covers all @code{m4} builtins, as well as several useful
8711 composite macros. References are exclusively to the places where a
8712 macro is introduced the first time.
8717 @appendixsec Index for many concepts
8724 @c coding: iso-8859-1
8726 @c ispell-local-dictionary: "american"
8727 @c indent-tabs-mode: nil
8728 @c whitespace-check-buffer-indent: nil