1 \input texinfo @c -*- texinfo -*-
2 @comment ========================================================
3 @comment %**start of header
6 @settitle GNU M4 @value{VERSION} macro processor
7 @documentencoding UTF-8
8 @set txicodequoteundirected
9 @set txicodequotebacktick
10 @setchapternewpage odd
15 @c The testsuite expects literal tab output in some examples, but
16 @c literal tabs in texinfo lead to formatting issues.
22 @c -------------------
23 @c The ARG is an optional argument. To be used for macro arguments in
24 @c their documentation (@defmac).
26 @r{[}@var{\varname\}@r{]}
29 @c @dvar{ARG, DEFAULT}
30 @c -------------------
31 @c The ARG is an optional argument, defaulting to DEFAULT. To be used
32 @c for macro arguments in their documentation (@defmac).
33 @macro dvar{varname, default}
34 @r{[}@var{\varname\} = @samp{\default\}@r{]}
37 @comment %**end of header
38 @comment ========================================================
42 This manual (@value{UPDATED}) is for GNU M4 (version
43 @value{VERSION}), a package containing an implementation of the m4 macro
46 Copyright @copyright{} 1989--1994, 2004--2014, 2016--2017, 2020--2021
47 Free Software Foundation, Inc.
50 Permission is granted to copy, distribute and/or modify this document
51 under the terms of the GNU Free Documentation License,
52 Version 1.3 or any later version published by the Free Software
53 Foundation; with no Invariant Sections, no Front-Cover Texts, and no
54 Back-Cover Texts. A copy of the license is included in the section
55 entitled ``GNU Free Documentation License.''
59 @dircategory Text creation and manipulation
61 * M4: (m4). A powerful macro processor.
65 @title GNU M4, version @value{VERSION}
66 @subtitle A powerful macro processor
67 @subtitle Edition @value{EDITION}, @value{UPDATED}
68 @author by Ren@'e Seindal, Fran@,{c}ois Pinard,
69 @author Gary V. Vaughan, and Eric Blake
70 @author (@email{bug-m4@@gnu.org})
73 @vskip 0pt plus 1filll
85 GNU @code{m4} is an implementation of the traditional UNIX macro
86 processor. It is mostly SVR4 compatible, although it has some
87 extensions (for example, handling more than 9 positional parameters
88 to macros). @code{m4} also has builtin functions for including
89 files, running shell commands, doing arithmetic, etc. Autoconf needs
90 GNU @code{m4} for generating @file{configure} scripts, but not for
93 GNU @code{m4} was originally written by Ren@'e Seindal, with
94 subsequent changes by Fran@,{c}ois Pinard and other volunteers
95 on the Internet. All names and email addresses can be found in the
96 files @file{m4-@value{VERSION}/@/AUTHORS} and
97 @file{m4-@value{VERSION}/@/THANKS} from the GNU M4
100 This is release @value{VERSION}. It is now considered stable: future
101 releases in the 1.4.x series are only meant to fix bugs, increase speed,
102 or improve documentation. However@dots{}
104 An experimental feature, which would improve @code{m4} usefulness,
105 allows for changing the syntax for what is a @dfn{word} in @code{m4}.
109 ./configure --enable-changeword
112 if you want this feature compiled in. The current implementation
113 slows down @code{m4} considerably and is hardly acceptable. In the
114 future, @code{m4} 2.0 will come with a different set of new features
115 that provide similar capabilities, but without the inefficiencies, so
116 changeword will go away and @emph{you should not count on it}.
119 * Preliminaries:: Introduction and preliminaries
120 * Invoking m4:: Invoking @code{m4}
121 * Syntax:: Lexical and syntactic conventions
123 * Macros:: How to invoke macros
124 * Definitions:: How to define new macros
125 * Conditionals:: Conditionals, loops, and recursion
127 * Debugging:: How to debug macros and input
129 * Input Control:: Input control
130 * File Inclusion:: File inclusion
131 * Diversions:: Diverting and undiverting output
133 * Text handling:: Macros for text handling
134 * Arithmetic:: Macros for doing arithmetic
135 * Shell commands:: Macros for running shell commands
136 * Miscellaneous:: Miscellaneous builtin macros
137 * Frozen files:: Fast loading of frozen state
139 * Compatibility:: Compatibility with other versions of @code{m4}
140 * Answers:: Correct version of some examples
142 * Copying This Package:: How to make copies of the overall M4 package
143 * Copying This Manual:: How to make copies of this manual
144 * Indices:: Indices of concepts and macros
147 --- The Detailed Node Listing ---
149 Introduction and preliminaries
151 * Intro:: Introduction to @code{m4}
152 * History:: Historical references
153 * Bugs:: Problems and bugs
154 * Manual:: Using this manual
158 * Operation modes:: Command line options for operation modes
159 * Preprocessor features:: Command line options for preprocessor features
160 * Limits control:: Command line options for limits control
161 * Frozen state:: Command line options for frozen state
162 * Debugging options:: Command line options for debugging
163 * Command line files:: Specifying input files on the command line
165 Lexical and syntactic conventions
167 * Names:: Macro names
168 * Quoted strings:: Quoting input to @code{m4}
169 * Comments:: Comments in @code{m4} input
170 * Other tokens:: Other kinds of input tokens
171 * Input processing:: How @code{m4} copies input to output
175 * Invocation:: Macro invocation
176 * Inhibiting Invocation:: Preventing macro invocation
177 * Macro Arguments:: Macro arguments
178 * Quoting Arguments:: On Quoting Arguments to macros
179 * Macro expansion:: Expanding macros
181 How to define new macros
183 * Define:: Defining a new macro
184 * Arguments:: Arguments to macros
185 * Pseudo Arguments:: Special arguments to macros
186 * Undefine:: Deleting a macro
187 * Defn:: Renaming macros
188 * Pushdef:: Temporarily redefining macros
190 * Indir:: Indirect call of macros
191 * Builtin:: Indirect call of builtins
193 Conditionals, loops, and recursion
195 * Ifdef:: Testing if a macro is defined
196 * Ifelse:: If-else construct, or multibranch
197 * Shift:: Recursion in @code{m4}
198 * Forloop:: Iteration by counting
199 * Foreach:: Iteration by list contents
200 * Stacks:: Working with definition stacks
201 * Composition:: Building macros with macros
203 How to debug macros and input
205 * Dumpdef:: Displaying macro definitions
206 * Trace:: Tracing macro calls
207 * Debug Levels:: Controlling debugging output
208 * Debug Output:: Saving debugging output
212 * Dnl:: Deleting whitespace in input
213 * Changequote:: Changing the quote characters
214 * Changecom:: Changing the comment delimiters
215 * Changeword:: Changing the lexical structure of words
216 * M4wrap:: Saving text until end of input
220 * Include:: Including named files
221 * Search Path:: Searching for include files
223 Diverting and undiverting output
225 * Divert:: Diverting output
226 * Undivert:: Undiverting output
227 * Divnum:: Diversion numbers
228 * Cleardivert:: Discarding diverted text
230 Macros for text handling
232 * Len:: Calculating length of strings
233 * Index macro:: Searching for substrings
234 * Regexp:: Searching for regular expressions
235 * Substr:: Extracting substrings
236 * Translit:: Translating characters
237 * Patsubst:: Substituting text by regular expression
238 * Format:: Formatting strings (printf-like)
240 Macros for doing arithmetic
242 * Incr:: Decrement and increment operators
243 * Eval:: Evaluating integer expressions
245 Macros for running shell commands
247 * Platform macros:: Determining the platform
248 * Syscmd:: Executing simple commands
249 * Esyscmd:: Reading the output of commands
250 * Sysval:: Exit status
251 * Mkstemp:: Making temporary files
253 Miscellaneous builtin macros
255 * Errprint:: Printing error messages
256 * Location:: Printing current location
257 * M4exit:: Exiting from @code{m4}
259 Fast loading of frozen state
261 * Using frozen files:: Using frozen files
262 * Frozen file format:: Frozen file format
264 Compatibility with other versions of @code{m4}
266 * Extensions:: Extensions in GNU M4
267 * Incompatibilities:: Facilities in System V m4 not in GNU M4
268 * Other Incompatibilities:: Other incompatibilities
270 Correct version of some examples
272 * Improved exch:: Solution for @code{exch}
273 * Improved forloop:: Solution for @code{forloop}
274 * Improved foreach:: Solution for @code{foreach}
275 * Improved copy:: Solution for @code{copy}
276 * Improved m4wrap:: Solution for @code{m4wrap}
277 * Improved cleardivert:: Solution for @code{cleardivert}
278 * Improved capitalize:: Solution for @code{capitalize}
279 * Improved fatal_error:: Solution for @code{fatal_error}
281 How to make copies of the overall M4 package
283 * GNU General Public License:: License for copying the M4 package
285 How to make copies of this manual
287 * GNU Free Documentation License:: License for copying this manual
289 Indices of concepts and macros
291 * Macro index:: Index for all @code{m4} macros
292 * Concept index:: Index for many concepts
298 @chapter Introduction and preliminaries
300 This first chapter explains what GNU @code{m4} is, where @code{m4}
301 comes from, how to read and use this documentation, how to call the
302 @code{m4} program, and how to report bugs about it. It concludes by
303 giving tips for reading the remainder of the manual.
305 The following chapters then detail all the features of the @code{m4}
309 * Intro:: Introduction to @code{m4}
310 * History:: Historical references
311 * Bugs:: Problems and bugs
312 * Manual:: Using this manual
316 @section Introduction to @code{m4}
318 @cindex overview of @code{m4}
319 @code{m4} is a macro processor, in the sense that it copies its
320 input to the output, expanding macros as it goes. Macros are either
321 builtin or user-defined, and can take any number of arguments.
322 Besides just doing macro expansion, @code{m4} has builtin functions
323 for including named files, running shell commands, doing integer
324 arithmetic, manipulating text in various ways, performing recursion,
etc. @code{m4} can be used either as a front-end to a compiler,
326 or as a macro processor in its own right.
328 The @code{m4} macro processor is widely available on all UNIXes, and has
329 been standardized by POSIX.
330 Usually, only a small percentage of users are aware of its existence.
331 However, those who find it often become committed users. The
332 popularity of GNU Autoconf, which requires GNU
333 @code{m4} for @emph{generating} @file{configure} scripts, is an incentive
for many to install it, even though these people will never program in
@code{m4} themselves. GNU @code{m4} is mostly compatible with the
336 System V, Release 4 version, except for some minor differences.
337 @xref{Compatibility}, for more details.
339 Some people find @code{m4} to be fairly addictive. They first use
340 @code{m4} for simple problems, then take bigger and bigger challenges,
341 learning how to write complex sets of @code{m4} macros along the way.
Once really addicted, users write sophisticated @code{m4}
applications even to solve simple problems, devoting more time to
debugging their @code{m4} scripts than to doing real work. Beware that
345 @code{m4} may be dangerous for the health of compulsive programmers.
348 @section Historical references
350 @cindex history of @code{m4}
351 @cindex GNU M4, history of
352 Macro languages were invented early in the history of computing. In the
353 1950s Alan Perlis suggested that the macro language be independent of the
354 language being processed. Techniques such as conditional and recursive
355 macros, and using macros to define other macros, were described by Doug
356 McIlroy of Bell Labs in ``Macro Instruction Extensions of Compiler
357 Languages'', @emph{Communications of the ACM} 3, 4 (1960), 214--20,
358 @url{https://dl.acm.org/doi/10.1145/367177.367223}.
360 An important precursor of @code{m4} was GPM; see C. Strachey,
361 @c The title uses lower case and has no space between "macro" and "generator".
362 ``A general purpose macrogenerator'', @emph{Computer Journal} 8, 3
364 @url{https://academic.oup.com/comjnl/article/8/3/225/336044}. GPM is
365 also succinctly described in David Gries's book @emph{Compiler
366 Construction for Digital Computers}, Wiley (1971). Strachey was a
367 brilliant programmer: GPM fit into 250 machine instructions!
369 Inspired by GPM while visiting Strachey's Lab in 1968, McIlroy wrote a
model preprocessor that fit into a page of Snobol 3 code, and McIlroy
371 and Robert Morris developed a series of further models at Bell Labs.
372 Andrew D. Hall followed up with M6, a general purpose macro processor
373 used to port the Fortran source code of the Altran computer algebra
374 system; see Hall's ``The M6 Macro Processor'', Computing Science
375 Technical Report #2, Bell Labs (1972),
376 @url{http://cm.bell-labs.com/cm/cs/cstr/2.pdf}. M6's source code
377 consisted of about 600 Fortran statements. Its name was the first of
380 The Brian Kernighan and P.J. Plauger book @emph{Software Tools},
381 Addison-Wesley (1976), describes and implements a Unix
382 macro-processor language, which inspired Dennis Ritchie to write
383 @code{m3}, a macro processor for the AP-3 minicomputer.
385 Kernighan and Ritchie then joined forces to develop the original
386 @code{m4}, described in ``The M4 Macro Processor'', Bell Laboratories
387 (1977), @url{https://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf}.
388 It had only 21 builtin macros.
390 While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
391 the true intricacies of real life: macros can be recognized without
392 being pre-announced, skipping whitespace or end-of-lines is easier,
393 more constructs are builtin instead of derived, etc.
395 Originally, the Kernighan and Plauger macro-processor, and then
396 @code{m3}, formed the engine for the Rational FORTRAN preprocessor,
397 that is, the @code{Ratfor} equivalent of @code{cpp}. Later, @code{m4}
398 was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.
400 Ren@'e Seindal released his implementation of @code{m4}, GNU
402 in 1990, with the aim of removing the artificial limitations in many
403 of the traditional @code{m4} implementations, such as maximum line
404 length, macro size, or number of macros.
406 The late Professor A. Dain Samples described and implemented a further
407 evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
408 Language: 2nd edition'', Electronic Announcement on comp.compilers
411 Fran@,{c}ois Pinard took over maintenance of GNU @code{m4} in
412 1992, until 1994 when he released GNU @code{m4} 1.4, which was
413 the stable release for 10 years. It was at this time that GNU
414 Autoconf decided to require GNU @code{m4} as its underlying
415 engine, since all other implementations of @code{m4} had too many
418 More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2 which
419 addressed some long standing bugs in the venerable 1.4 release. Then in
420 2005, Gary V. Vaughan collected together the many patches to
421 GNU @code{m4} 1.4 that were floating around the net and
422 released 1.4.3 and 1.4.4. And in 2006, Eric Blake joined the team and
423 prepared patches for the release of 1.4.5, with subsequent releases
424 through intervening years, as recent as 1.4.18 in 2016.
426 Meanwhile, development has continued on new features for @code{m4}, such
427 as dynamic module loading and additional builtins. When complete,
428 GNU @code{m4} 2.0 will start a new series of releases.
431 @section Problems and bugs
433 @cindex reporting bugs
435 @cindex suggestions, reporting
436 If you have problems with GNU M4 or think you've found a bug,
437 please report it. Before reporting a bug, make sure you've actually
438 found a real bug. Carefully reread the documentation and see if it
439 really says you can do what you're trying to do. If it's not clear
440 whether you should be able to do something or not, report that too; it's
441 a bug in the documentation!
443 Before reporting a bug or trying to fix it yourself, try to isolate it
444 to the smallest possible input file that reproduces the problem. Then
445 send us the input file and the exact results @code{m4} gave you. Also
446 say what you expected to occur; this will help us decide whether the
447 problem was really in the documentation.
449 Once you've got a precise problem, send e-mail to
450 @email{bug-m4@@gnu.org}. Please include the version number of @code{m4}
451 you are using. You can get this information with the command
452 @kbd{m4 --version}. Also provide details about the platform you are
455 Non-bug suggestions are always welcome as well. If you have questions
456 about things that are unclear in the documentation or are just obscure
457 features, please report them too.
460 @section Using this manual
462 @cindex examples, understanding
463 This manual contains a number of examples of @code{m4} input and output,
464 and a simple notation is used to distinguish input, output and error
465 messages from @code{m4}. Examples are set out from the normal text, and
shown in a fixed-width font, like this:
470 This is an example of an example!
473 To distinguish input from output, all output from @code{m4} is prefixed
474 by the string @samp{@result{}}, and all error messages by the string
475 @samp{@error{}}. When showing how command line options affect matters,
the command line is shown with a prompt @samp{$ @kbd{like this}};
otherwise, you can assume that a simple @kbd{m4} invocation will work.
482 $ @kbd{command line to invoke m4}
483 Example of input line
484 @result{}Output line from m4
485 @error{}and an error message
488 The sequence @samp{^D} in an example indicates the end of the input
489 file. The sequence @samp{@key{NL}} refers to the newline character.
490 The majority of these examples are self-contained, and you can run them
491 with similar results by invoking @kbd{m4 -d}. In fact, the testsuite
492 that is bundled in the GNU M4 package consists of the examples
493 in this document! Some of the examples assume that your current
494 directory is located where you unpacked the installation, so if you plan
495 on following along, you may find it helpful to do this now:
499 $ @kbd{cd m4-@value{VERSION}}
502 As each of the predefined macros in @code{m4} is described, a prototype
503 call of the macro will be shown, giving descriptive names to the
506 @deffn Composite example (@var{string}, @dvar{count, 1}, @
507 @ovar{argument}@dots{})
508 This is a sample prototype. There is not really a macro named
509 @code{example}, but this documents that if there were, it would be a
510 Composite macro, rather than a Builtin. It requires at least one
511 argument, @var{string}. Remember that in @code{m4}, there must not be a
512 space between the macro name and the opening parenthesis, unless it was
513 intended to call the macro without any arguments. The brackets around
514 @var{count} and @var{argument} show that these arguments are optional.
If @var{count} is omitted, the macro behaves as if @var{count} were @samp{1},
516 whereas if @var{argument} is omitted, the macro behaves as if it were
517 the empty string. A blank argument is not the same as an omitted
518 argument. For example, @samp{example(`a')}, @samp{example(`a',`1')},
519 and @samp{example(`a',`1',)} would behave identically with @var{count}
520 set to @samp{1}; while @samp{example(`a',)} and @samp{example(`a',`')}
521 would explicitly pass the empty string for @var{count}. The ellipses
522 (@samp{@dots{}}) show that the macro processes additional arguments
523 after @var{argument}, rather than ignoring them.
527 All macro arguments in @code{m4} are strings, but some are given
528 special interpretation, e.g., as numbers, file names, regular
529 expressions, etc. The documentation for each macro will state how the
530 parameters are interpreted, and what happens if the argument cannot be
531 parsed according to the desired interpretation. Unless specified
532 otherwise, a parameter specified to be a number is parsed as a decimal,
533 even if the argument has leading zeros; and parsing the empty string as
a number results in 0 rather than an error, although a warning will be
issued.
537 This document consistently writes and uses @dfn{builtin}, without a
538 hyphen, as if it were an English word. This is how the @code{builtin}
539 primitive is spelled within @code{m4}.
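
As a brief sketch of the number-parsing rule described above, the
following session passes a value with a leading zero and then an empty
string to the builtin @code{incr} (@pxref{Incr}). The warning text
shown is what the 1.4.x series is believed to print when reading from
standard input; treat the exact wording as an assumption.

@example
incr(`010')
@result{}11
incr(`')
@error{}m4:stdin:2: empty string treated as 0 in builtin `incr'
@result{}1
@end example

The first call shows that @samp{010} is read as decimal ten rather than
octal eight.
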
542 @chapter Invoking @code{m4}
545 @cindex invoking @code{m4}
546 The format of the @code{m4} command is:
550 @code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
553 @cindex command line, options
554 @cindex options, command line
555 @cindex @env{POSIXLY_CORRECT}
556 All options begin with @samp{-}, or if long option names are used, with
@samp{--}. A long option name need not be written completely; any
558 unambiguous prefix is sufficient. POSIX requires @code{m4} to
559 recognize arguments intermixed with files, even when
560 @env{POSIXLY_CORRECT} is set in the environment. Most options take
561 effect at startup regardless of their position, but some are documented
562 below as taking effect after any files that occurred earlier in the
563 command line. The argument @option{--} is a marker to denote the end of
With short options, options that do not take arguments may be combined
into a single command line argument with subsequent options; options
with mandatory arguments may be provided either as a single command line
argument or as two arguments; and options with optional arguments must
be provided as a single argument. In other words,
571 @kbd{m4 -QPDfoo -d a -df} is equivalent to
572 @kbd{m4 -Q -P -D foo -d -df -- ./a}, although the latter form is
573 considered canonical.
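
As an illustration of combining short options, the two invocations
below should behave identically. The macro name @code{foo} and its
value @samp{bar} are placeholders chosen for this sketch.

@example
$ @kbd{m4 -PDfoo=bar}
foo m4_len(`abc') len(`abc')
@result{}bar 3 len(abc)
^D
$ @kbd{m4 -P -D foo=bar}
foo m4_len(`abc') len(`abc')
@result{}bar 3 len(abc)
^D
@end example

Because @option{-P} renames every builtin, @samp{len} is no longer a
macro and is copied through as plain text (with the quotes around its
argument stripped), while @samp{m4_len} is expanded.
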
575 With long options, options with mandatory arguments may be provided with
576 an equal sign (@samp{=}) in a single argument, or as two arguments, and
577 options with optional arguments must be provided as a single argument.
578 In other words, @kbd{m4 --def foo --debug a} is equivalent to
579 @kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
580 considered canonical (not to mention more robust, in case a future
581 version of @code{m4} introduces an option named @option{--default}).
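
The following sketch shows the same definition given with an abbreviated
and with a complete long option name. The macro @code{greeting} is a
placeholder, and the abbreviation relies on @option{--def} remaining an
unambiguous prefix in the installed version.

@example
$ @kbd{m4 --def=greeting=hello}
greeting
@result{}hello
^D
$ @kbd{m4 --define=greeting=hello}
greeting
@result{}hello
^D
@end example
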
583 @code{m4} understands the following options, grouped by functionality.
586 * Operation modes:: Command line options for operation modes
587 * Preprocessor features:: Command line options for preprocessor features
588 * Limits control:: Command line options for limits control
589 * Frozen state:: Command line options for frozen state
590 * Debugging options:: Command line options for debugging
591 * Command line files:: Specifying input files on the command line
594 @node Operation modes
595 @section Command line options for operation modes
597 Several options control the overall operation of @code{m4}:
601 Print a help summary on standard output, then immediately exit
602 @code{m4} without reading any input files or performing any other
606 Print the version number of the program on standard output, then
607 immediately exit @code{m4} without reading any input files or
608 performing any other actions.
611 @itemx --fatal-warnings
612 @cindex errors, fatal
614 Controls the effect of warnings. If unspecified, then execution
615 continues and exit status is unaffected when a warning is printed. If
616 specified exactly once, warnings become fatal; when one is issued,
617 execution continues, but the exit status will be non-zero. If specified
618 multiple times, then execution halts with non-zero status the first time
619 a warning is issued. The introduction of behavior levels is new to M4
620 1.4.9; for behavior consistent with earlier versions, you should specify
626 Makes this invocation of @code{m4} interactive. This means that all
627 output will be unbuffered, and interrupts will be ignored. The
628 spelling @option{-e} exists for compatibility with other @code{m4}
629 implementations, and issues a warning because it may be withdrawn in a
630 future version of GNU M4.
633 @itemx --prefix-builtins
634 Internally modify @emph{all} builtin macro names so they all start with
635 the prefix @samp{m4_}. For example, using this option, one should write
636 @samp{m4_define} instead of @samp{define}, and @samp{m4___file__}
637 instead of @samp{__file__}. This option has no effect if @option{-R}
643 Suppress warnings, such as missing or superfluous arguments in macro
644 calls, or treating the empty string as zero.
646 @item --warn-macro-sequence@r{[}=@var{regexp}@r{]}
647 Issue a warning if the regular expression @var{regexp} has a non-empty
648 match in any macro definition (either by @code{define} or
649 @code{pushdef}). Empty matches are ignored; therefore, supplying the
650 empty string as @var{regexp} disables any warning. If the optional
651 @var{regexp} is not supplied, then the default regular expression is
652 @samp{\$\(@{[^@}]*@}\|[0-9][0-9]+\)} (a literal @samp{$} followed by
653 multiple digits or by an open brace), since these sequences will
654 change semantics in the default operation of GNU M4 2.0 (due
655 to a change in how more than 9 arguments in a macro definition will be
656 handled, @pxref{Arguments}). Providing an alternate regular
657 expression can provide a useful reverse lookup feature of finding
658 where a macro is defined to have a given definition.
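
A short sketch of the default behavior follows. The macro name
@code{foo} is arbitrary, and the exact wording of the diagnostics is
given as the 1.4.x series is believed to print it.

@example
$ @kbd{m4 --warn-macro-sequence}
define(`foo', `$001 $@{1@} $1')
@error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$001'
@error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$@{1@}'
@result{}
foo(`bar')
@result{}bar $@{1@} bar
@end example
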
660 @item -W @var{regexp}
661 @itemx --word-regexp=@var{regexp}
662 Use @var{regexp} as an alternative syntax for macro names. This
663 experimental option will not be present in all GNU @code{m4}
664 implementations (@pxref{Changeword}).
667 @node Preprocessor features
668 @section Command line options for preprocessor features
670 @cindex macro definitions, on the command line
671 @cindex command line, macro definitions on the
672 @cindex preprocessor features
673 Several options allow @code{m4} to behave more like a preprocessor.
674 Macro definitions and deletions can be made on the command line, the
675 search path can be altered, and the output file can track where the
676 input came from. These features occur with the following options:
679 @item -D @var{name}@r{[}=@var{value}@r{]}
680 @itemx --define=@var{name}@r{[}=@var{value}@r{]}
681 This enters @var{name} into the symbol table. If @samp{=@var{value}} is
682 missing, the value is taken to be the empty string. The @var{value} can
be any string, and the macro can be defined to take arguments, just as
if it were defined from within the input. This option may be given more
685 than once; order with respect to file names is significant, and
686 redefining the same @var{name} loses the previous value.
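
For instance, a macro that takes an argument can be defined directly on
the command line. In this sketch the macro name @code{greet} is a
placeholder, and the single quotes assume a POSIX-style shell, where they
keep @samp{$1} from being expanded by the shell.

@example
$ @kbd{m4 -D 'greet=Hello, $1!'}
greet(`world')
@result{}Hello, world!
^D
@end example
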
688 @item -I @var{directory}
689 @itemx --include=@var{directory}
690 Make @code{m4} search @var{directory} for included files that are not
691 found in the current working directory. @xref{Search Path}, for more
692 details. This option may be given more than once.
696 @cindex synchronization lines
697 @cindex location, input
698 @cindex input location
699 Generate synchronization lines, for use by the C preprocessor or other
700 similar tools. Order is significant with respect to file names. This
701 option is useful, for example, when @code{m4} is used as a
702 front end to a compiler. Source file name and line number information
703 is conveyed by directives of the form @samp{#line @var{linenum}
704 "@var{file}"}, which are inserted as needed into the middle of the
705 output. Such directives mean that the following line originated or was
706 expanded from the contents of input file @var{file} at line
707 @var{linenum}. The @samp{"@var{file}"} part is often omitted when
708 the file name did not change from the previous directive.
710 Synchronization directives are always given on complete lines by
711 themselves. When a synchronization discrepancy occurs in the middle of
712 an output line, the associated synchronization directive is delayed
713 until the next newline that does not occur in the middle of a quoted
720 @result{}#line 2 "stdin"
722 changecom(`/*', `*/')
724 define(`comment', `/*1
751 @itemx --undefine=@var{name}
752 This deletes any predefined meaning @var{name} might have. Obviously,
753 only predefined macros can be deleted in this way. This option may be
754 given more than once; undefining a @var{name} that does not have a
755 definition is silently ignored. Order is significant with respect to
760 @section Command line options for limits control
762 There are some limits within @code{m4} that can be tuned. For
763 compatibility, @code{m4} also accepts some options that control limits
764 in other implementations, but which are automatically unbounded (limited
765 only by your hardware and operating system constraints) in GNU
771 Enable all the extensions in this implementation. In this release of
772 M4, this option is always on by default; it is currently only useful
773 when overriding a prior use of @option{--traditional}. However, having
774 GNU behavior as default makes it impossible to write a
775 strictly POSIX-compliant client that avoids all incompatible
776 GNU M4 extensions, since such a client would have to use the
777 non-POSIX command-line option to force full POSIX
778 behavior. Thus, a future version of M4 will be changed to implicitly
779 use the option @option{--traditional} if the environment variable
780 @env{POSIXLY_CORRECT} is set. Projects that intentionally use
781 GNU extensions should consider using @option{--gnu} to state
782 their intentions, so that the project will not mysteriously break if the
783 user upgrades to a newer M4 and has @env{POSIXLY_CORRECT} set in their
788 Suppress all the extensions made in this implementation, compared to the
789 System V version. @xref{Compatibility}, for a list of these.
792 @itemx --hashsize=@var{num}
793 Make the internal hash table for symbol lookup be @var{num} entries big.
794 For better performance, the number should be prime, but this is not
795 checked. The default is 65537 entries. It should not be necessary to
796 increase this value, unless you define an excessive number of macros.
799 @itemx --nesting-limit=@var{num}
800 @cindex nesting limit
801 @cindex limit, nesting
802 Artificially limit the nesting of macro calls to @var{num} levels,
803 stopping program execution if this limit is ever exceeded. When not
804 specified, nesting defaults to unlimited on platforms that can detect
805 stack overflow, and to 1024 levels otherwise. A value of zero means
806 unlimited; but then heavily nested code could potentially cause a stack
809 The precise effect of this option is more correctly associated
810 with textual nesting than dynamic recursion. It has been useful
811 when some complex @code{m4} input was generated by mechanical means, and
812 also in diagnosing recursive algorithms that do not scale well.
813 Most users never need to change this option from its default.
816 This option does @emph{not} have the ability to break endless
817 rescanning loops, since these do not necessarily consume much memory
818 or stack space. Through clever usage of rescanning loops, one can
819 request complex, time-consuming computations from @code{m4} with useful
results. Imposing limits in this area would reduce the power of @code{m4}.
821 There are many pathological cases: @w{@samp{define(`a', `a')a}} is
822 only the simplest example (but @pxref{Compatibility}). Expecting GNU
823 @code{m4} to detect these would be a little like expecting a compiler
system to detect and diagnose endless loops: it is quite a @emph{hard}
825 problem in general, if not undecidable!
830 These options are present for compatibility with System V @code{m4}, but
831 do nothing in this implementation. They may disappear in future
832 releases, and issue a warning to that effect.
835 @itemx --diversions=@var{num}
836 These options are present only for compatibility with previous
versions of GNU @code{m4}, where they controlled the number of
diversions that could be used at the same time. They do nothing,
839 because there is no fixed limit anymore. They may disappear in future
840 releases, and issue a warning to that effect.
844 @section Command line options for frozen state
846 GNU @code{m4} comes with a feature of freezing internal state
847 (@pxref{Frozen files}). This can be used to speed up @code{m4}
848 execution when reusing a common initialization script.
852 @itemx --freeze-state=@var{file}
853 Once execution is finished, write out the frozen state on the specified
854 @var{file}. It is conventional, but not required, for @var{file} to end
858 @itemx --reload-state=@var{file}
859 Before execution starts, recover the internal state from the specified
860 frozen @var{file}. The options @option{-D}, @option{-U}, and
861 @option{-t} take effect after state is reloaded, but before the input
865 @node Debugging options
866 @section Command line options for debugging
868 Finally, there are several options for aiding in debugging @code{m4}
872 @item -d@r{[}@var{flags}@r{]}
873 @itemx --debug@r{[}=@var{flags}@r{]}
874 Set the debug-level according to the flags @var{flags}. The debug-level
875 controls the format and amount of information presented by the debugging
876 functions. @xref{Debug Levels}, for more details on the format and
877 meaning of @var{flags}. If omitted, @var{flags} defaults to @samp{aeq}.
879 @item --debugfile@r{[}=@var{file}@r{]}
881 @itemx --error-output=@var{file}
882 Redirect @code{dumpdef} output, debug messages, and trace output to the
883 named @var{file}. Warnings, error messages, and @code{errprint} output
884 are still printed to standard error. If these options are not used, or
885 if @var{file} is unspecified (only possible for @option{--debugfile}),
886 debug output goes to standard error; if @var{file} is the empty string,
887 debug output is discarded. @xref{Debug Output}, for more details. The
888 option @option{--debugfile} may be given more than once, and order is
889 significant with respect to file names. The spellings @option{-o} and
890 @option{--error-output} are misleading and inconsistent with other
891 GNU tools; for now they are silently accepted as synonyms of
892 @option{--debugfile} and only recognized once, but in a future version
893 of M4, using them will cause a warning to be issued.
896 @comment not worth including in the manual, but provides a good test
899 @comment options: -Dbar=hello -tbar --debugfile= foo --debugfile -
901 $ @kbd{m4 -d -Iexamples -Dbar=hello -tbar --debugfile= foo --debugfile -
907 @error{}m4trace: -1- bar -> `hello'
913 @itemx --arglength=@var{num}
914 Restrict the size of the output generated by macro tracing to @var{num}
915 characters per trace line. If unspecified or zero, output is
916 unlimited. @xref{Debug Levels}, for more details.
919 @itemx --trace=@var{name}
920 This enables tracing for the macro @var{name}, at any point where it is
921 defined. @var{name} need not be defined when this option is given.
922 This option may be given more than once, and order is significant with
923 respect to file names. @xref{Trace}, for more details.
926 @node Command line files
927 @section Specifying input files on the command line
929 @cindex command line, file names on the
930 @cindex file names, on the command line
931 The remaining arguments on the command line are taken to be input file
932 names. If no names are present, standard input is read. A file
933 name of @file{-} is taken to mean standard input. It is
934 conventional, but not required, for input files to end in @samp{.m4}.
936 The input files are read in the sequence given. Standard input can be
937 read more than once, so the file name @file{-} may appear multiple times
938 on the command line; this makes a difference when input is from a
939 terminal or other special file type. It is an error if an input file
940 ends in the middle of argument collection, a comment, or a quoted
943 The options @option{--define} (@option{-D}), @option{--undefine}
944 (@option{-U}), @option{--synclines} (@option{-s}), and @option{--trace}
945 (@option{-t}) only take effect after processing input from any file
946 names that occur earlier on the command line. For example, assume the
947 file @file{foo} contains:
955 The text @samp{bar} can then be redefined over multiple uses of
958 @comment options: -Dbar=hello foo -Dbar=world foo
960 $ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
965 If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
966 exit status of @code{m4} will be 0 for success, 1 for general failure
967 (such as problems with reading an input file), and 63 for version
968 mismatch (@pxref{Using frozen files}).
970 If you need to read a file whose name starts with a @file{-}, you can
971 specify it as @samp{./-file}, or use @option{--} to mark the end of
975 @comment Test that 'm4 file/' detects that file is not a directory; we
976 @comment can assume that the current directory contains a Makefile.
977 @comment mingw fails with EINVAL rather than ENOTDIR.
980 @comment xerr: ignore
981 @comment options: Makefile/
983 @error{}m4: cannot open `Makefile/': Not a directory
986 @comment Test that closed stderr does not cause a crash. Not all
987 @comment systems have the same message for EBADF.
989 @comment xerr: ignore
992 `errprint(` skipping: syscmd does not have unix semantics
994 syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0',
995 `errprint(` skipping: system does not allow closing stdout
997 changequote(`[', `]')dnl
998 syscmd([echo | ']__program__[' >&-])dnl
999 @error{}m4: write error: Bad file descriptor
1006 `errprint(` skipping: syscmd does not have unix semantics
1008 syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0',
1009 `errprint(` skipping: system does not allow closing stdout
1011 changequote(`[', `]')dnl
1012 syscmd([echo 'esyscmd(echo hi >&2 && echo err"print(bye
1013 )d"nl)dnl' > tmp.m4 \
1014 && ']__program__[' tmp.m4 <&- >&- \
1015 && rm tmp.m4])sysval
1021 @comment Test that we obey POSIX semantics with -D interspersed with
1022 @comment files, even with POSIXLY_CORRECT (BSD getopt gets it wrong).
1027 `errprint(` skipping: syscmd does not have unix semantics
1029 changequote(`[', `]')dnl
1030 syscmd([POSIXLY_CORRECT=1 ']__program__[' -Dbar=hello foo -Dbar=world foo])dnl
1039 @chapter Lexical and syntactic conventions
1041 @cindex input tokens
1043 As @code{m4} reads its input, it separates it into @dfn{tokens}. A
1044 token is either a name, a quoted string, or any single character that
1045 is not part of either a name or a string.  Input to @code{m4} can also
1046 contain comments. GNU @code{m4} does not yet understand
1047 multibyte locales; all operations are byte-oriented rather than
1048 character-oriented (although if your locale uses a single byte
1049 encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
1050 However, @code{m4} is eight-bit clean, so you can
1051 use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
1052 comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
1053 exception of the @sc{nul} character (the zero byte @samp{'\0'}).
1056 * Names:: Macro names
1057 * Quoted strings:: Quoting input to @code{m4}
1058 * Comments:: Comments in @code{m4} input
1059 * Other tokens:: Other kinds of input tokens
1060 * Input processing:: How @code{m4} copies input to output
1064 @section Macro names
1068 A name is any sequence of letters, digits, and the character @samp{_}
1069 (underscore), where the first character is not a digit. @code{m4} will
1070 use the longest such sequence found in the input. If a name has a
1071 macro definition, it will be subject to macro expansion
1072 (@pxref{Macros}). Names are case-sensitive.
1074 Examples of legal names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
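
As a small sketch of these rules (the macro @samp{foo} and its
definition are chosen purely for illustration), note that only the
exact name @samp{foo} is expanded.  Longer names that merely contain
it, and names that differ in case, are treated as distinct names and
copied through unchanged:

@example
define(`foo', `FOO')
@result{}
foo foo1 _foo Foo
@result{}FOO foo1 _foo Foo
@end example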
1076 @node Quoted strings
1077 @section Quoting input to @code{m4}
1079 @cindex quoted string
1080 @cindex string, quoted
1081 A quoted string is a sequence of characters surrounded by quote
1082 strings, defaulting to
1083 @samp{`} and @samp{'}, where the nested begin and end quotes within the
1084 string are balanced. The value of a string token is the text, with one
1085 level of quotes stripped off. Thus
1094 is the empty string, and double-quoting turns into single-quoting.
1102 The quote characters can be changed at any time, using the builtin macro
1103 @code{changequote}. @xref{Changequote}, for more information.
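
The following sketch assumes the default quote characters and a macro
@samp{foo} that is defined only for this illustration.  It suggests how
a quoted name is protected from expansion, and how exactly one level of
quotes is removed from each quoted string:

@example
define(`foo', `bar')
@result{}
foo `foo' ``foo''
@result{}bar foo `foo'
@end example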
1106 @section Comments in @code{m4} input
1109 Comments in @code{m4} are normally delimited by the characters @samp{#}
1110 and newline. All characters between the comment delimiters are ignored,
1111 but the entire comment (including the delimiters) is passed through to
1112 the output---comments are @emph{not} discarded by @code{m4}.
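
As a minimal sketch, assuming the default comment delimiters and an
illustrative macro @samp{foo}, the name is expanded before the
@samp{#} but left alone inside the comment, and the comment itself
still reaches the output:

@example
define(`foo', `bar')
@result{}
foo # foo is not expanded here
@result{}bar # foo is not expanded here
@end example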
1114 Comments cannot be nested, so the first newline after a @samp{#} ends
1115 the comment. The commenting effect of the begin-comment string
1116 can be inhibited by quoting it.
1120 `quoted text' # `commented text'
1121 @result{}quoted text # `commented text'
1122 `quoting inhibits' `#' `comments'
1123 @result{}quoting inhibits # comments
1126 The comment delimiters can be changed to any string at any time, using
1127 the builtin macro @code{changecom}. @xref{Changecom}, for more
1131 @comment Detect regression in 1.4.10b in regards to reparsing comments.
1132 @comment Not worth including in the manual.
1134 define(`e', `$@@')define(`q', ``$@@'')define(`foo', `bar')
1140 @result{}',`#two bar
1142 changecom(`<', `>')define(`n', `$#')
1152 @section Other kinds of input tokens
1154 @cindex tokens, special
1155 Any character that is neither part of a name, nor of a quoted string,
1156 nor of a comment, is a token by itself.  When not in the context of macro
1157 expansion, all of these tokens are just copied to output. However,
1158 during macro expansion, whitespace characters (space, tab, newline,
1159 formfeed, carriage return, vertical tab), parentheses (@samp{(} and
1160 @samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
1161 roles, explained later.
1163 @node Input processing
1164 @section How @code{m4} copies input to output
1166 As @code{m4} reads the input token by token, it copies each token
1167 directly to the output.
1169 The exception is when it finds a word with a macro definition. In that
1170 case @code{m4} will calculate the macro's expansion, possibly reading
1171 more input to get the arguments. It then inserts the expansion in front
1172 of the remaining input. In other words, the resulting text from a macro
1173 call will be read and parsed into tokens again.
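
A short sketch of this rescanning, using two macro names invented for
the illustration (@samp{name} and @samp{greeting}): the expansion of
@samp{greeting} contains the word @samp{name}, which is itself
recognized as a macro when the expansion is read again.

@example
define(`name', `World')
@result{}
define(`greeting', `Hello, name!')
@result{}
greeting
@result{}Hello, World!
@end example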
1175 @code{m4} expands a macro as soon as possible. If it finds a macro call
1176 when collecting the arguments to another, it will expand the second call
1177 first. This process continues until there are no more macro calls to
1178 expand and all the input has been consumed.
1180 For a running example, examine how @code{m4} handles this input:
1184 format(`Result is %d', eval(`2**15'))
1188 First, @code{m4} sees that the token @samp{format} is a macro name, so
1189 it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
1190 and @samp{@w{ }}, before encountering another potential macro. Sure
1191 enough, @samp{eval} is a macro name, so the nested argument collection
1192 picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
1193 with the lone argument of @samp{2**15}. The expansion of
1194 @samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
1195 tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
1196 combined with the next @samp{)}, the format macro now has all its
1197 arguments, as if the user had typed:
1201 format(`Result is %d', 32768)
1205 The format macro expands to @samp{Result is 32768}, and we have another
1206 round of scanning for the tokens @samp{Result}, @samp{@w{ }},
1207 @samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
1208 @samp{8}. None of these are macros, so the final output is
1212 @result{}Result is 32768
1215 As a more complicated example, we will contrast an actual code
1216 example from the Gnulib project@footnote{Derived from a patch in
1217 @uref{https://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
1218 and a followup patch in
1219 @uref{https://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
1220 showing both a buggy approach and the desired results. The user desires
1221 to output a shell assignment statement that takes its argument and turns
1222 it into a shell variable by converting it to uppercase and prepending a
1223 prefix. The original attempt looks like this:
1227 define([gl_STRING_MODULE_INDICATOR],
1230 GNULIB_]translit([$1],[a-z],[A-Z])[=1
1232 gl_STRING_MODULE_INDICATOR([strcase])
1234 @result{} GNULIB_strcase=1
1238 Oops -- the argument did not get capitalized. And although the manual
1239 is not able to easily show it, both lines that appear empty actually
1240 contain two trailing spaces. By stepping through the parse, it is easy
1241 to see what happened. First, @code{m4} sees the token
1242 @samp{changequote}, which it recognizes as a macro, followed by
1243 @samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
1244 argument list. The macro expands to the empty string, but changes the
1245 quoting characters to something more useful for generating shell code
1246 (unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
1247 but unbalanced @samp{[]} tend to be rare). Also in the first line,
1248 @code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
1249 macro that consumes the rest of the line, resulting in no output for
1252 The second line starts a macro definition. @code{m4} sees the token
1253 @samp{define}, which it recognizes as a macro, followed by a @samp{(},
1254 @samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}. Because an unquoted
1255 comma was encountered, the first argument is known to be the expansion
1256 of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
1257 Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
1258 whitespace is discarded as part of argument collection. Then comes a
1259 rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
1260 comment@key{NL}@ @ @ @ GNULIB_]}. This is followed by the token
1261 @samp{translit}, which @code{m4} recognizes as a macro name, so a nested
1262 macro expansion has started.
1264 The arguments to @code{translit} are found from the tokens @samp{(},
1265 @samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
1266 @samp{)}. All three string arguments are expanded (or in other words,
1267 the quotes are stripped), and since neither @samp{$} nor @samp{1} needs
1268 capitalization, the result of the macro is @samp{$1}. This expansion is
1269 rescanned, resulting in the two literal characters @samp{$} and
1272 Scanning of the outer macro resumes, and picks up with
1273 @samp{[=1@key{NL}@ @ ]}, and finally @samp{)}. The collected pieces of
1274 expanded text are concatenated, with the end result that the macro
1275 @samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
1276 @samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
1277 Once again, @samp{dnl} is recognized and avoids a newline in the output.
1279 The final line is then parsed, beginning with @samp{ } and @samp{ }
1280 that are output literally. Then @samp{gl_STRING_MODULE_INDICATOR} is
1281 recognized as a macro name, with an argument list of @samp{(},
1282 @samp{[strcase]}, and @samp{)}. Since the definition of the macro
1283 contains the sequence @samp{$1}, that sequence is replaced with the
1284 argument @samp{strcase} prior to starting the rescan. The rescan sees
1285 @samp{@key{NL}} and four spaces, which are output literally, then
1286 @samp{dnl}, which discards the text @samp{ comment@key{NL}}. Next
1287 come four more spaces, also output literally, and the token
1288 @samp{GNULIB_strcase}, which resulted from the earlier parameter
1289 substitution. Since that is not a macro name, it is output literally,
1290 followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
1291 two more spaces. Finally, the original @samp{@key{NL}} seen after the
1292 macro invocation is scanned and output literally.
1294 Now for a corrected approach. This rearranges the use of newlines and
1295 whitespace so that less whitespace is output (which, although harmless
1296 to shell scripts, can be visually unappealing), and fixes the quoting
1297 issues so that the capitalization occurs when the macro
1298 @samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is
1299 defined. It also adds another layer of quoting to the first argument of
1300 @code{translit}, to ensure that the output will be rescanned as a string
1301 rather than a potential uppercase macro name needing further expansion.
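
The effect of that extra layer of quoting can be seen in isolation.
The sketch below uses the default quote characters rather than
@samp{[} and @samp{]}, and defines a macro @samp{ABC} solely to make
the unwanted rescan visible:

@example
define(`ABC', `oops')
@result{}
translit(`abc', `a-z', `A-Z')
@result{}oops
translit(``abc'', `a-z', `A-Z')
@result{}ABC
@end example

In the first call the result @samp{ABC} is rescanned as a name and
expanded.  In the second, the result is still surrounded by quotes, so
the rescan merely strips them.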
1305 define([gl_STRING_MODULE_INDICATOR],
1307 GNULIB_[]translit([[$1]], [a-z], [A-Z])=1dnl
1309 gl_STRING_MODULE_INDICATOR([strcase])
1310 @result{} GNULIB_STRCASE=1
1313 The parsing of the first line is unchanged. The second line sees the
1314 name of the macro to define, then sees the discarded @samp{@key{NL}}
1315 and two spaces, as before. But this time, the next token is
1316 @samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([[$1]], [a-z],
1317 [A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
1318 @samp{)} to end the macro definition and @samp{dnl} to skip the
1319 newline. No early expansion of @code{translit} occurs, so the entire
1320 string becomes the definition of the macro.
1322 The final line is then parsed, beginning with two spaces that are
1323 output literally, and an invocation of
1324 @code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
1325 Again, the @samp{$1} in the macro definition is substituted prior to
1326 rescanning. Rescanning first encounters @samp{dnl}, and discards
1327 @samp{ comment@key{NL}}. Then two spaces are output literally. Next
1328 comes the token @samp{GNULIB_}, but that is not a macro, so it is
1329 output literally. The token @samp{[]} is an empty string, so it does
1330 not affect output. Then the token @samp{translit} is encountered.
1332 This time, the arguments to @code{translit} are parsed as @samp{(},
1333 @samp{[[strcase]]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
1334 @samp{[A-Z]}, and @samp{)}. The two spaces are discarded, and the
1335 @code{translit} call produces the desired result @samp{[STRCASE]}.  This is
1336 rescanned, but since it is a string, the quotes are stripped and the
1337 only output is a literal @samp{STRCASE}.
1338 Then the scanner sees @samp{=} and @samp{1}, which are output
1339 literally, followed by @samp{dnl} which discards the rest of the
1340 definition of @code{gl_STRING_MODULE_INDICATOR}. The newline at the
1341 end of output is the literal @samp{@key{NL}} that appeared after the
1342 invocation of the macro.
1344 The order in which @code{m4} expands the macros can be further explored
1345 using the trace facilities of GNU @code{m4} (@pxref{Trace}).
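
As a sketch of such an exploration, the running example from earlier
in this section can be traced as follows.  This assumes that @code{m4}
is started with @option{-d}, so that the default debug flags apply.
The exact form of the trace lines may vary between versions, but the
nested @code{eval} call should be reported before the enclosing
@code{format} call:

@example
$ @kbd{m4 -d}
traceon(`format', `eval')
@result{}
format(`Result is %d', eval(`2**15'))
@error{}m4trace: -2- eval(`2**15') -> `32768'
@error{}m4trace: -1- format(`Result is %d', `32768') -> `Result is 32768'
@result{}Result is 32768
@end example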
1348 @chapter How to invoke macros
1350 This chapter covers macro invocation, macro arguments and how macro
1351 expansion is treated.
1354 * Invocation:: Macro invocation
1355 * Inhibiting Invocation:: Preventing macro invocation
1356 * Macro Arguments:: Macro arguments
1357 * Quoting Arguments:: On Quoting Arguments to macros
1358 * Macro expansion:: Expanding macros
1362 @section Macro invocation
1364 @cindex macro invocation
1365 @cindex invoking macros
1366 A macro invocation has one of the forms
1374 which is a macro invocation without any arguments, or
1378 name(arg1, arg2, @dots{}, arg@var{n})
1382 which is a macro invocation with @var{n} arguments. Macros can have any
1383 number of arguments. All arguments are strings, but different macros
1384 might interpret the arguments in different ways.
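
For instance, in the following sketch the same string is handed to two
builtins: @code{len} treats it as plain text and counts its
characters, while @code{eval} treats it as an arithmetic expression.

@example
len(`12 + 30')
@result{}7
eval(`12 + 30')
@result{}42
@end example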
1386 The opening parenthesis @emph{must} follow the @var{name} directly, with
1387 no spaces in between. If it does not, the macro is called with no
1390 For a macro call to have no arguments, the parentheses @emph{must} be
1391 left out. The macro call
1399 is a macro call with one argument, which is the empty string, not a call
1402 @node Inhibiting Invocation
1403 @section Preventing macro invocation
1405 An innovation of the @code{m4} language, compared to some of its
1406 predecessors (like Strachey's @code{GPM}, for example), is the ability
1407 to recognize macro calls without resorting to any special, prefixed
1408 invocation character. While generally useful, this feature might
1409 sometimes be the source of spurious, unwanted macro calls. So, GNU
1410 @code{m4} offers several mechanisms or techniques for inhibiting the
1411 recognition of names as macro calls.
1413 @cindex GNU extensions
1415 @cindex macro, blind
1416 First of all, many builtin macros cannot meaningfully be called without
1417 arguments. As a GNU extension, for any of these macros,
1418 whenever an opening parenthesis does not immediately follow their name,
1419 the builtin macro call is not triggered.  This handles the most common
1420 cases, such as @samp{include} or @samp{eval}.  Later in this document,
1421 the sentence ``This macro is recognized only with parameters'' refers to
1422 this specific provision of GNU M4, also known as a blind
1423 builtin macro. For the builtins defined by POSIX that bear
1424 this disclaimer, POSIX specifically states that invoking those
1425 builtins without arguments is unspecified, because many other
1426 implementations simply invoke the builtin as though it were given one
1427 empty argument instead.
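
A brief sketch of this behavior with GNU @code{m4}: the names
@samp{eval} and @samp{include} pass through untouched in running text,
and @code{eval} is only invoked when an opening parenthesis follows it
immediately.

@example
eval
@result{}eval
eval(`6 * 7')
@result{}42
Please include a stamped envelope.
@result{}Please include a stamped envelope.
@end example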
1437 There is also a command line option (@option{--prefix-builtins}, or
1438 @option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
1439 builtin macros with a prefix of @samp{m4_} at startup. The option has
1440 no effect whatsoever on user defined macros. For example, with this option,
1441 one has to write @code{m4_dnl} and even @code{m4_m4exit}. It also has
1442 no effect on whether a macro requires parameters.
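
The following sketch assumes that @code{m4} was started with
@option{-P}; the macro @samp{foo} is only an illustration.  The
unprefixed name @samp{define} is then ordinary text, so only its
quotes are stripped, while @code{m4_define} behaves as the builtin:

@example
$ @kbd{m4 -P}
define(`foo', `bar')
@result{}define(foo, bar)
m4_define(`foo', `bar')
@result{}
foo
@result{}bar
@end example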
1444 @comment options: -P
1457 Another alternative is to redefine problematic macros to a name less
1458 likely to cause conflicts, using @ref{Definitions}.
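
One possible way to do this is sketched below, using @code{defn} and
@code{undefine}, both described in later sections.  The new name
@samp{m4_divert} is an arbitrary choice for the illustration.  Once
@code{divert} has been renamed, the word can appear in running text
without being expanded, and the builtin remains reachable under its
new name:

@example
define(`m4_divert', defn(`divert'))
@result{}
undefine(`divert')
@result{}
Rivers divert around rocks.
@result{}Rivers divert around rocks.
m4_divert(`-1')discarded text m4_divert`'kept text
@result{}kept text
@end example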
1460 If your version of GNU @code{m4} has the @code{changeword} feature
1461 compiled in, it offers far more flexibility in specifying the
1462 syntax of macro names, whether builtin or user-defined.  @xref{Changeword},
1463 for more information on this experimental feature.
1465 Of course, the simplest way to prevent a name from being interpreted
1466 as a call to an existing macro is to quote it. The remainder of
1467 this section studies a little more deeply how quoting affects macro
1468 invocation, and how quoting can be used to inhibit macro invocation.
1470 Even if quoting is usually done over the whole macro name, it can also
1471 be done over only a few characters of this name (provided, of course,
1472 that the unquoted portions are not also a macro). It is also possible
1473 to quote the empty string, but this works only @emph{inside} the name.
1488 all yield the string @samp{divert}. While in both:
1498 the @code{divert} builtin macro will be called, which expands to the
1502 The output of macro evaluations is always rescanned. In the following
1503 example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as
1505 has been given @w{@samp{substr(ab`'cde, `1', `3')}} as input:
1508 define(`cde', `CDE')
1510 define(`x', `substr(ab')
1512 define(`y', `cde, `1', `3')')
1519 @comment Similar, but with argument references, to ensure good test
1522 define(`x1', `len(`$1'')
1524 define(`y1', ``$1')')
1526 x1(`01234567890123456789')y1(`98765432109876543210')
1531 Unquoted strings on either side of a quoted string are subject to
1532 being recognized as macro names. In the following example, quoting the
1533 empty string allows for the second @code{macro} to be recognized as such:
1536 define(`macro', `m')
1544 Quoting may prevent recognizing as a macro name the concatenation of a
1545 macro expansion with the surrounding characters. In this example:
1548 define(`macro', `di$1')
1557 the input will produce the string @samp{divert}. When the quotes were
1558 removed, the @code{divert} builtin was called instead.
1560 @node Macro Arguments
1561 @section Macro arguments
1563 @cindex macros, arguments to
1564 @cindex arguments to macros
1565 When a name is seen, and it has a macro definition, it will be expanded
1568 If the name is followed by an opening parenthesis, the arguments will be
1569 collected before the macro is called. If too few arguments are
1570 supplied, the missing arguments are taken to be the empty string.
1571 However, some builtins are documented to behave differently for a
1572 missing optional argument than for an explicit empty string. If there
1573 are too many arguments, the excess arguments are ignored. Unquoted
1574 leading whitespace is stripped off all arguments, but whitespace
1575 generated by a macro expansion or occurring after a macro that expanded
1576 to an empty string remains intact. Whitespace includes space, tab,
1577 newline, carriage return, vertical tab, and formfeed.
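
The handling of missing and excess arguments can be sketched with a
user-defined macro; @samp{pair} is a name made up for this
illustration.  The missing second argument is taken to be empty, and
the extra third argument is ignored:

@example
define(`pair', `<$1|$2>')
@result{}
pair(`a')
@result{}<a|>
pair(`a', `b', `c')
@result{}<a|b>
@end example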
1580 define(`macro', `$1')
1582 macro( unquoted leading space lost)
1583 @result{}unquoted leading space lost
1584 macro(` quoted leading space kept')
1585 @result{} quoted leading space kept
1587 divert `unquoted space kept after expansion')
1588 @result{} unquoted space kept after expansion
1590 ')`whitespace from expansion kept')
1592 @result{}whitespace from expansion kept
1593 macro(`unquoted trailing whitespace kept'
1595 @result{}unquoted trailing whitespace kept
1599 @cindex warnings, suppressing
1600 @cindex suppressing warnings
1601 Normally @code{m4} will issue warnings if a builtin macro is called
1602 with an inappropriate number of arguments, but these can be suppressed with
1603 the @option{--quiet} command line option (or @option{--silent}, or
1604 @option{-Q}, @pxref{Operation modes, , Invoking m4}). For user
1605 defined macros, there is no check of the number of arguments given.
1610 @error{}m4:stdin:1: Warning: too few arguments to builtin `index'
1614 index(`abc', `b', `ignored')
1615 @error{}m4:stdin:3: Warning: excess arguments to builtin `index' ignored
1619 @comment options: -Q
1626 index(`abc', `b', `ignored')
1630 Macros are expanded normally during argument collection, and whatever
1631 commas, quotes and parentheses that might show up in the resulting
1632 expanded text will serve to define the arguments as well. Thus, if
1633 @var{foo} expands to @samp{, b, c}, the macro call
1641 is a macro call with four arguments, which are @samp{a }, @samp{b},
1642 @samp{c} and @samp{d}. To understand why the first argument contains
1643 whitespace, remember that unquoted leading whitespace is never part
1644 of an argument, but trailing whitespace always is.
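
This can be checked with a small sketch.  The helper
@samp{show_args} is invented for the illustration; it prints the
argument count followed by the first four arguments in brackets, so
that the trailing space in the first argument becomes visible:

@example
define(`foo', `, b, c')
@result{}
define(`show_args', `$#: [$1] [$2] [$3] [$4]')
@result{}
show_args(a foo, d)
@result{}4: [a ] [b] [c] [d]
@end example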
1646 It is possible for a macro's definition to change during argument
1647 collection, in which case the expansion uses the definition that was in
1648 effect at the time the opening @samp{(} was seen.
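
A minimal sketch of this situation, with an illustrative macro
@samp{f} that is redefined inside its own argument list: the first
call still uses the old definition, and the new one takes effect only
for later calls.

@example
define(`f', `1')
@result{}
f(define(`f', `2'))
@result{}1
f
@result{}2
@end example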
1659 It is an error if the end of file occurs while collecting arguments.
1664 @result{}hello world
1667 @error{}m4:stdin:2: ERROR: end of file in argument list
1670 @node Quoting Arguments
1671 @section On Quoting Arguments to macros
1673 @cindex quoted macro arguments
1674 @cindex macros, quoted arguments to
1675 @cindex arguments, quoted macro
1676 Each argument has unquoted leading whitespace removed. Within each
1677 argument, all unquoted parentheses must match. For example, if
1678 @var{foo} is a macro,
1686 is a macro call, with one argument, whose value is @samp{() (() (}.
1687 Commas separate arguments, except when they occur inside quotes,
1688 comments, or unquoted parentheses. @xref{Pseudo Arguments}, for
1691 It is common practice to quote all arguments to macros, unless you are
1692 sure you want the arguments expanded. Thus, in the above
1693 example with the parentheses, the `right' way to do it is like this:
1700 @cindex quoting rule of thumb
1701 @cindex rule of thumb, quoting
1702 It is, however, in certain cases necessary (because nested expansion
1703 must occur to create the arguments for the outer macro) or convenient
1704 (because it uses fewer characters) to leave out quotes for some
1705 arguments, and there is nothing wrong in doing it. It just makes life a
1706 bit harder, if you are not careful to follow a consistent quoting style.
1707 For consistency, this manual follows the rule of thumb that each layer
1708 of parentheses introduces another layer of single quoting, except when
1709 showing the consequences of quoting rules. This is done even when the
1710 quoted string cannot be a macro, such as with integers when you have not
1711 changed the syntax via @code{changeword} (@pxref{Changeword}).
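
The way commas are treated inside quotes and unquoted parentheses can
be sketched with a macro that prints only its first argument;
@samp{first} is a name chosen for this illustration.

@example
define(`first', `[$1]')
@result{}
first(a, b)
@result{}[a]
first((a, b))
@result{}[(a, b)]
first(`a, b')
@result{}[a, b]
@end example

Only in the first call does the comma separate two arguments.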
1713 The quoting rule of thumb of one level of quoting per layer of parentheses has a
1714 nice property: when a macro name appears inside parentheses, you can
1715 determine when it will be expanded. If it is not quoted, it will be
1716 expanded prior to the outer macro, so that its expansion becomes the
1717 argument. If it is single-quoted, it will be expanded after the outer
1718 macro. And if it is double-quoted, it will be used as literal text
1719 instead of a macro name.
1722 define(`active', `ACT, IVE')
1724 define(`show', `$1 $1')
1729 @result{}ACT, IVE ACT, IVE
1731 @result{}active active
1734 @node Macro expansion
1735 @section Macro expansion
1737 @cindex macros, expansion of
1738 @cindex expansion of macros
1739 When the arguments, if any, to a macro call have been collected, the
1740 macro is expanded, and the expansion text is pushed back onto the input
1741 (unquoted), and reread. The expansion text from one macro call might
1742 therefore result in more macros being called, if the calls are included,
1743 completely or partially, in the first macro call's expansion.
1745 Taking a very simple example, if @var{foo} expands to @samp{bar}, and
1746 @var{bar} expands to @samp{Hello}, the input
1748 @comment options: -Dbar=Hello -Dfoo=bar
1750 $ @kbd{m4 -Dbar=Hello -Dfoo=bar}
1756 will expand first to @samp{bar}, and when this is reread and
1757 expanded, into @samp{Hello}.
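
A call can also be only partially contained in an expansion.  In the
following sketch, the macros @samp{double} and @samp{open} are invented
for the illustration.  The expansion of @samp{open} supplies just the
macro name and the opening parenthesis, and the rest of the call is
collected from the input that follows:

@example
define(`double', `$1$1')
@result{}
define(`open', `double(')
@result{}
open`ab')
@result{}abab
@end example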
1760 @comment not worth documenting, but test that the command line can
1761 @comment define macros that take parameters
1763 @comment options: -Dfoo -Decho=$@
1765 $ @kbd{m4 -Dfoo -Decho='$@'}
1768 foo(`silently ignored')
1776 @chapter How to define new macros
1778 @cindex macros, how to define new
1779 @cindex defining new macros
1780 Macros can be defined, redefined and deleted in several different ways.
1781 Also, it is possible to redefine a macro without losing a previous
1782 value, and bring back the original value at a later time.
1785 * Define:: Defining a new macro
1786 * Arguments:: Arguments to macros
1787 * Pseudo Arguments:: Special arguments to macros
1788 * Undefine:: Deleting a macro
1789 * Defn:: Renaming macros
1790 * Pushdef:: Temporarily redefining macros
1792 * Indir:: Indirect call of macros
1793 * Builtin:: Indirect call of builtins
1797 @section Defining a macro
1799 The normal way to define or redefine macros is to use the builtin
1802 @deffn Builtin define (@var{name}, @ovar{expansion})
1803 Defines @var{name} to expand to @var{expansion}. If
1804 @var{expansion} is not given, it is taken to be empty.
1806 The expansion of @code{define} is void.
1807 The macro @code{define} is recognized only with parameters.
1810 The following example defines the macro @var{foo} to expand to the text
1811 @samp{Hello world.}.
1814 define(`foo', `Hello world.')
1817 @result{}Hello world.
1820 The empty line in the output is there because the newline is not
1821 a part of the macro definition, and it is consequently copied to
1822 the output. This can be avoided by use of the macro @code{dnl}.
1823 @xref{Dnl}, for details.
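
For instance, a minimal sketch of the same definition with @code{dnl}
appended to the defining line, so that the trailing newline is discarded
rather than copied to the output:

@example
define(`foo', `Hello world.')dnl
foo
@result{}Hello world.
@end example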
1825 The first argument to @code{define} should be quoted; otherwise, if the
1826 macro is already defined, you will be defining a different macro. This
1827 example shows the problems with underquoting, since we did not want to
1828 redefine @code{one}:
1839 @cindex GNU extensions
1840 GNU @code{m4} normally replaces only the @emph{topmost}
1841 definition of a macro if it has several definitions from @code{pushdef}
1842 (@pxref{Pushdef}). Some other implementations of @code{m4} replace all
1843 definitions of a macro with @code{define}. @xref{Incompatibilities},
1846 As a GNU extension, the first argument to @code{define} does
1847 not have to be a simple word.
1848 It can be any text string, even the empty string. A macro with a
1849 non-standard name cannot be invoked in the normal way, as the name is
1850 not recognized. It can only be referenced by the builtins @code{indir}
1851 (@pxref{Indir}) and @code{defn} (@pxref{Defn}).
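
As a small sketch of this behavior, assume a macro whose name contains a
space (the name @samp{my macro} is purely illustrative). Written directly
in the input, the name is seen as two ordinary words and is copied
through, whereas @code{indir} and @code{defn} can still reach the
definition:

@example
define(`my macro', `called as `$0'')
@result{}
my macro
@result{}my macro
indir(`my macro')
@result{}called as my macro
defn(`my macro')
@result{}called as `$0'
@end example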
1854 Arrays and associative arrays can be simulated by using non-standard
1857 @deffn Composite array (@var{index})
1858 @deffnx Composite array_set (@var{index}, @ovar{value})
1859 Provide access to entries within an array. @code{array} reads the entry
1860 at location @var{index}, and @code{array_set} assigns @var{value} to
1861 location @var{index}.
1865 define(`array', `defn(format(``array[%d]'', `$1'))')
1867 define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
1869 array_set(`4', `array element no. 4')
1871 array_set(`17', `array element no. 17')
1874 @result{}array element no. 4
1875 array(eval(`10 + 7'))
1876 @result{}array element no. 17
1879 Change the @samp{%d} to @samp{%s} and it is an associative array.
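
A sketch of that variant follows. The names @code{assoc} and
@code{assoc_set} are hypothetical, chosen only to keep them apart from
the @code{array} macros above, and the keys are assumed to be plain
words without quote characters:

@example
define(`assoc', `defn(format(``assoc[%s]'', `$1'))')
@result{}
define(`assoc_set', `define(format(``assoc[%s]'', `$1'), `$2')')
@result{}
assoc_set(`color', `blue')
@result{}
assoc(`color')
@result{}blue
@end example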
1882 @section Arguments to macros
1884 @cindex macros, arguments to
1885 @cindex arguments to macros
1886 Macros can have arguments. The @var{n}th argument is denoted by
1887 @code{$n} in the expansion text, and is replaced by the @var{n}th actual
1888 argument, when the macro is expanded. Replacement of arguments happens
1889 before rescanning, regardless of how many nesting levels of quoting
1890 appear in the expansion. Here is an example of a macro with
1893 @deffn Composite exch (@var{arg1}, @var{arg2})
1894 Expands to @var{arg2} followed by @var{arg1}, effectively exchanging
1899 define(`exch', `$2, $1')
1901 exch(`arg1', `arg2')
1905 This can be used, for example, if you like the arguments to
1906 @code{define} to be reversed.
1909 define(`exch', `$2, $1')
1911 define(exch(``expansion text'', ``macro''))
1914 @result{}expansion text
1917 @xref{Quoting Arguments}, for an explanation of the double quotes.
1918 (You should try to improve this example so that clients of @code{exch}
1919 do not have to double quote; or @pxref{Improved exch, , Answers}).
1921 As a special case, the zeroth argument, @code{$0}, is always the name
1922 of the macro being expanded.
1925 define(`test', ``Macro name: $0'')
1928 @result{}Macro name: test
1931 If you want quoted text to appear as part of the expansion text,
1932 remember that quotes can be nested in quoted strings. Thus, in
1935 define(`foo', `This is macro `foo'.')
1938 @result{}This is macro foo.
1942 The @samp{foo} in the expansion text is @emph{not} expanded, since it is
1943 a quoted string, and not a name.
1945 @cindex GNU extensions
1946 @cindex nine arguments, more than
1947 @cindex more than nine arguments
1948 @cindex arguments, more than nine
1949 @cindex positional parameters, more than nine
1950 GNU @code{m4} allows the number following the @samp{$} to
1951 consist of one or more digits, allowing macros to have any number of
1952 arguments. The extension of accepting multiple digits is incompatible
1953 with POSIX, and differs from traditional implementations
1954 of @code{m4}, which only recognize one digit. Therefore, future
1955 versions of GNU M4 will phase out this feature. To portably
1956 access beyond the ninth argument, you can use the @code{argn} macro
1957 documented later (@pxref{Shift}).
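
To illustrate the idea without relying on multi-digit references, here
is a minimal sketch that reaches the tenth argument by recursion. The
helper name @code{nth_arg} is hypothetical and is not the @code{argn}
macro referred to above; it uses only @code{$1}, @code{$2} and
@code{$@@}, and assumes its first argument is a positive integer:

@example
define(`nth_arg', `ifelse(`$1', `1', `$2',
  `nth_arg(decr(`$1'), shift(shift($@@)))')')
@result{}
nth_arg(`10', `a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
@result{}j
@end example

Each recursion step drops the counter and the first list element with
two calls to @code{shift}, then passes a decremented counter along with
the remaining arguments.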
1959 POSIX also states that @samp{$} followed immediately by
1960 @samp{@{} in a macro definition is implementation-defined. This version
1961 of M4 passes the literal characters @samp{$@{} through unchanged, but M4
1962 2.0 will implement an optional feature similar to @command{sh}, where
1963 @samp{$@{11@}} expands to the eleventh argument, to replace the current
1964 recognition of @samp{$11}. Meanwhile, if you want to guarantee that you
1965 will get a literal @samp{$@{} in output when expanding a macro, even
1966 when you upgrade to M4 2.0, you can use nested quoting to your
1970 define(`foo', `single quoted $`'@{1@} output')
1972 define(`bar', ``double quoted $'`@{2@} output'')
1975 @result{}single quoted $@{1@} output
1977 @result{}double quoted $@{2@} output
1980 To help you detect places in your M4 input files that might behave
1981 differently under M4 2.0, you can use the
1982 @option{--warn-macro-sequence} command-line option (@pxref{Operation
1983 modes, , Invoking m4}) with the default regular expression. This will
1984 add a warning any time a macro definition includes @samp{$} followed by
1985 multiple digits, or by @samp{@{}. The warning is not enabled by
1986 default, because it triggers a number of warnings in Autoconf 2.61 (and
1987 Autoconf uses @option{-E} to treat warnings as errors), and because it
1988 will still be possible to restore older behavior in M4 2.0.
1990 @comment options: --warn-macro-sequence
1992 $ @kbd{m4 --warn-macro-sequence}
1993 define(`foo', `$001 $@{1@} $1')
1994 @error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$001'
1995 @error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$@{1@}'
1998 @result{}bar $@{1@} bar
2001 @node Pseudo Arguments
2002 @section Special arguments to macros
2004 @cindex special arguments to macros
2005 @cindex macros, special arguments to
2006 @cindex arguments to macros, special
2007 There is a special notation for the number of actual arguments supplied,
2008 and for all the actual arguments.
2010 The number of actual arguments in a macro call is denoted by @code{$#}
2011 in the expansion text.
2013 @deffn Composite nargs (@dots{})
2014 Expands to a count of the number of arguments supplied.
2018 define(`nargs', `$#')
2024 nargs(`arg1', `arg2', `arg3')
2026 nargs(`commas can be quoted, like this')
2028 nargs(arg1#inside comments, commas do not separate arguments
2031 nargs((unquoted parentheses, like this, group arguments))
2035 Remember that @samp{#} defaults to the comment character; if you forget
2036 quotes to inhibit the comment behavior, your macro definition may not
2037 end where you expected.
2040 dnl Attempt to define a macro to just `$#'
2041 define(underquoted, $#)
2049 The notation @code{$*} can be used in the expansion text to denote all
2050 the actual arguments, unquoted, with commas in between. For example
2053 define(`echo', `$*')
2055 echo(arg1, arg2, arg3 , arg4)
2056 @result{}arg1,arg2,arg3 ,arg4
2059 Often each argument should be quoted, and the notation @code{$@@} handles
2060 that. It is just like @code{$*}, except that it quotes each argument.
2061 A simple example of that is:
2064 define(`echo', `$@@')
2066 echo(arg1, arg2, arg3 , arg4)
2067 @result{}arg1,arg2,arg3 ,arg4
2070 Where did the quotes go? Of course, they were eaten when the expanded
2071 text was reread by @code{m4}. To show the difference, try
2074 define(`echo1', `$*')
2076 define(`echo2', `$@@')
2078 define(`foo', `This is macro `foo'.')
2081 @result{}This is macro This is macro foo..
2083 @result{}This is macro foo.
2085 @result{}This is macro foo.
2091 @xref{Trace}, if you do not understand this. As another example of the
2092 difference, remember that comments encountered in arguments are passed
2093 untouched to the macro, and that quoting disables comments.
2096 define(`echo1', `$*')
2098 define(`echo2', `$@@')
2100 define(`foo', `bar')
2113 @comment Not worth putting in the manual, but this example is needed for
2114 @comment good test coverage of copying large strings across recursion
2118 define(`echo', `$@@')dnl
2119 echo(echo(`01234567890123456789', `01234567890123456789')
2120 echo(`98765432109876543210', `98765432109876543210'))
2121 @result{}01234567890123456789,01234567890123456789
2122 @result{}98765432109876543210,98765432109876543210
2123 len((echo(`01234567890123456789',
2124 `01234567890123456789')echo(`98765432109876543210',
2125 `98765432109876543210')))
2127 indir(`echo', indir(`echo', `01234567890123456789',
2128 `01234567890123456789')
2129 indir(`echo', `98765432109876543210', `98765432109876543210'))
2130 @result{}01234567890123456789,01234567890123456789
2131 @result{}98765432109876543210,98765432109876543210
2132 define(`argn', `$#')dnl
2133 define(`echo1', `-$@@-')define(`echo2', `,$@@,')dnl
2134 echo1(`1', `2', `3') argn(echo1(`1', `2', `3'))
2136 echo2(`1', `2', `3') argn(echo2(`1', `2', `3'))
2141 A @samp{$} sign in the expansion text, that is not followed by anything
2142 @code{m4} understands, is simply copied to the macro expansion, as any
2146 define(`foo', `$$$ hello $$$')
2149 @result{}$$$ hello $$$
2153 @cindex literal output
2154 @cindex output, literal
2155 If you want a macro to expand to something like @samp{$12}, the
2156 judicious use of nested quoting can put a safe character between the
2157 @code{$} and the next character, relying on the rescanning to remove the
2158 nested quote. This will prevent @code{m4} from interpreting the
2159 @code{$} sign as a reference to an argument.
2162 define(`foo', `no nested quote: $1')
2165 @result{}no nested quote: arg
2166 define(`foo', `nested quote around $: `$'1')
2169 @result{}nested quote around $: $1
2170 define(`foo', `nested empty quote after $: $`'1')
2173 @result{}nested empty quote after $: $1
2174 define(`foo', `nested quote around next character: $`1'')
2177 @result{}nested quote around next character: $1
2178 define(`foo', `nested quote around both: `$1'')
2181 @result{}nested quote around both: arg
2185 @section Deleting a macro
2187 @cindex macros, how to delete
2188 @cindex deleting macros
2189 @cindex undefining macros
2190 A macro definition can be removed with @code{undefine}:
2192 @deffn Builtin undefine (@var{name}@dots{})
2193 For each argument, remove the macro @var{name}. The macro names must
2194 be quoted, since they would otherwise be expanded.
2196 The expansion of @code{undefine} is void.
2197 The macro @code{undefine} is recognized only with parameters.
2202 @result{}foo bar blah
2203 define(`foo', `some')define(`bar', `other')define(`blah', `text')
2206 @result{}some other text
2210 @result{}foo other text
2211 undefine(`bar', `blah')
2214 @result{}foo bar blah
2217 Undefining a macro inside that macro's expansion is safe; the macro
2218 still expands to the definition that was in effect at the @samp{(}.
2221 define(`f', ``$0':$1')
2223 f(f(f(undefine(`f')`hello world')))
2224 @result{}f:f:f:hello world
2229 It is not an error for @var{name} to have no macro definition. In that
2230 case, @code{undefine} does nothing.
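
A short sketch of this case, assuming that no macro named
@code{no_such_macro} has been defined earlier in the input:

@example
undefine(`no_such_macro')
@result{}
ifdef(`no_such_macro', `defined', `not defined')
@result{}not defined
@end example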
2233 @section Renaming macros
2235 @cindex macros, how to rename
2236 @cindex renaming macros
2237 @cindex macros, displaying definitions
2238 @cindex definitions, displaying macro
2239 It is possible to rename an already defined macro. To do this, you need
2240 the builtin @code{defn}:
2242 @deffn Builtin defn (@var{name}@dots{})
2243 Expands to the @emph{quoted definition} of each @var{name}. If an
2244 argument is not a defined macro, the expansion for that argument is
2247 If @var{name} is a user-defined macro, the quoted definition is simply
2248 the quoted expansion text. If, instead, there is only one @var{name}
2249 and it is a builtin, the
2250 expansion is a special token, which points to the builtin's internal
2251 definition. This token is only meaningful as the second argument to
2252 @code{define} (and @code{pushdef}), and is silently converted to an
2253 empty string in most other contexts. Combining a builtin with anything
2254 else is not supported; a warning is issued and the builtin is omitted
2255 from the final expansion.
2257 The macro @code{defn} is recognized only with parameters.
2260 Its normal use is best understood through an example, which shows how to
2261 rename @code{undefine} to @code{zap}:
2264 define(`zap', defn(`undefine'))
2269 @result{}undefine(zap)
2272 In this way, @code{defn} can be used to copy macro definitions, and also
2273 definitions of builtin macros. Even if the original macro is removed,
2274 the other name can still be used to access the definition.
2276 The fact that macro definitions can be transferred also explains why you
2277 should use @code{$0}, rather than retyping a macro's name in its
2281 define(`foo', `This is `$0'')
2283 define(`bar', defn(`foo'))
2286 @result{}This is bar
2289 Macros used as string variables should be referred to through @code{defn},
2290 to avoid unwanted expansion of the text:
2293 define(`string', `The macro dnl is very useful
2297 @result{}The macro@w{ }
2299 @result{}The macro dnl is very useful
2304 However, it is important to remember that @code{m4} rescanning is purely
2305 textual. If an unbalanced end-quote string occurs in a macro
2306 definition, the rescan will see that embedded quote as the termination
2307 of the quoted string, and the remainder of the macro's definition will
2308 be rescanned unquoted. Thus it is a good idea to avoid unbalanced
2309 end-quotes in macro definitions or arguments to macros.
2316 define(`echo', `$@@')
2326 On the other hand, it is possible to exploit the fact that @code{defn}
2327 can concatenate multiple macros prior to the rescanning phase, in order
2328 to join the definitions of macros that, in isolation, have unbalanced
2329 quotes. This is particularly useful when one has used several macros to
2330 accumulate text that M4 should rescan as a whole. In the example below,
2331 note how the use of @code{defn} on @code{l} in isolation opens a string,
2332 which is not closed until the next line; but used on @code{l} and
2333 @code{r} together results in nested quoting.
2336 define(`l', `<[>')define(`r', `<]>')
2338 changequote(`[', `]')
2342 @result{}<[>]defn([r])
2348 @cindex builtins, special tokens
2349 @cindex tokens, builtin macro
2350 Using @code{defn} to generate special tokens for builtin macros outside
2351 of expected contexts can sometimes trigger warnings. But most of the
2352 time, such tokens are silently converted to the empty string.
2358 define(defn(`divnum'), `cannot redefine a builtin token')
2359 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2367 Also note that @code{defn} with multiple arguments can only join text
2368 macros, not builtins, although a future version of GNU M4 may
2369 lift this restriction.
2373 define(`a', `A')define(`AA', `b')
2375 traceon(`defn', `define')
2377 defn(`a', `divnum', `a')
2378 @error{}m4:stdin:3: Warning: cannot concatenate builtin `divnum'
2379 @error{}m4trace: -1- defn(`a', `divnum', `a') -> ``A'`A''
2381 define(`mydivnum', defn(`divnum', `divnum'))mydivnum
2382 @error{}m4:stdin:4: Warning: cannot concatenate builtin `divnum'
2383 @error{}m4:stdin:4: Warning: cannot concatenate builtin `divnum'
2384 @error{}m4trace: -2- defn(`divnum', `divnum')
2385 @error{}m4trace: -1- define(`mydivnum', `')
2387 traceoff(`defn', `define')
2392 @section Temporarily redefining macros
2394 @cindex macros, temporary redefinition of
2395 @cindex temporary redefinition of macros
2396 @cindex redefinition of macros, temporary
2397 @cindex definition stack
2398 @cindex pushdef stack
2399 @cindex stack, macro definition
2400 It is possible to redefine a macro temporarily, reverting to the
2401 previous definition at a later time. This is done with the builtins
2402 @code{pushdef} and @code{popdef}:
2404 @deffn Builtin pushdef (@var{name}, @ovar{expansion})
2405 @deffnx Builtin popdef (@var{name}@dots{})
2406 Analogous to @code{define} and @code{undefine}.
2408 These macros work in a stack-like fashion. A macro is temporarily
2409 redefined with @code{pushdef}, which replaces an existing definition of
2410 @var{name}, while saving the previous definition, before the new one is
2411 installed. If there is no previous definition, @code{pushdef} behaves
2412 exactly like @code{define}.
2414 If a macro has several definitions (of which only one is accessible),
2415 the topmost definition can be removed with @code{popdef}. If there is
2416 no previous definition, @code{popdef} behaves like @code{undefine}.
2418 The expansion of both @code{pushdef} and @code{popdef} is void.
2419 The macros @code{pushdef} and @code{popdef} are recognized only with
2424 define(`foo', `Expansion one.')
2427 @result{}Expansion one.
2428 pushdef(`foo', `Expansion two.')
2431 @result{}Expansion two.
2432 pushdef(`foo', `Expansion three.')
2434 pushdef(`foo', `Expansion four.')
2439 @result{}Expansion three.
2440 popdef(`foo', `foo')
2443 @result{}Expansion one.
2450 If a macro with several definitions is redefined with @code{define}, the
2451 topmost definition is @emph{replaced} with the new definition. If it is
2452 removed with @code{undefine}, @emph{all} the definitions are removed,
2453 and not only the topmost one. However, POSIX allows other
2454 implementations that treat @code{define} as replacing an entire stack
2455 of definitions with a single new definition, so to be portable to other
2456 implementations, it may be worth explicitly using @code{popdef} and
2457 @code{pushdef} rather than relying on the GNU behavior of
2461 define(`foo', `Expansion one.')
2464 @result{}Expansion one.
2465 pushdef(`foo', `Expansion two.')
2468 @result{}Expansion two.
2469 define(`foo', `Second expansion two.')
2472 @result{}Second expansion two.
2479 @cindex local variables
2480 @cindex variables, local
2481 Local variables within macros are made with @code{pushdef} and
2482 @code{popdef}. At the start of the macro a new definition is pushed,
2483 within the macro it is manipulated and at the end it is popped,
2484 revealing the former definition.
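
The pattern can be sketched as follows. The macro names @code{with_tmp}
and @code{tmp} are hypothetical; @code{with_tmp} pushes a private
definition of @code{tmp}, uses it, and pops it again, so that a
definition of @code{tmp} made elsewhere is left intact:

@example
define(`with_tmp', `pushdef(`tmp', `$1')tmp`'popdef(`tmp')')
@result{}
define(`tmp', `outer')
@result{}
with_tmp(`inner')
@result{}inner
tmp
@result{}outer
@end example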
2486 It is possible to temporarily redefine a builtin with @code{pushdef}
2490 @section Indirect call of macros
2492 @cindex indirect call of macros
2493 @cindex call of macros, indirect
2494 @cindex macros, indirect call of
2495 @cindex GNU extensions
2496 Any macro can be called indirectly with @code{indir}:
2498 @deffn Builtin indir (@var{name}, @ovar{args@dots{}})
2499 Results in a call to the macro @var{name}, which is passed the
2500 rest of the arguments @var{args}. If @var{name} is not defined, an
2501 error message is printed, and the expansion is void.
2503 The macro @code{indir} is recognized only with parameters.
2506 This can be used to call macros with computed or ``invalid''
2507 names (@code{define} allows such names to be defined):
2510 define(`$$internal$macro', `Internal macro (name `$0')')
2513 @result{}$$internal$macro
2514 indir(`$$internal$macro')
2515 @result{}Internal macro (name $$internal$macro)
2518 The point here is that larger macro packages can define private macros
2519 that will not be called by accident. They can @emph{only} be
2520 called through the builtin @code{indir}.
2522 One other point to observe is that argument collection occurs before
2523 @code{indir} invokes @var{name}, so if argument collection changes the
2524 value of @var{name}, that will be reflected in the final expansion.
2525 This differs from the behavior when invoking macros directly,
2526 where the definition that was in effect before argument collection is
2535 indir(`f', define(`f', `3'))
2537 indir(`f', undefine(`f'))
2538 @error{}m4:stdin:4: undefined macro `f'
2542 When handed the result of @code{defn} (@pxref{Defn}) as one of its
2543 arguments, @code{indir} defers to the invoked @var{name} for whether a
2544 token representing a builtin is recognized or flattened to the empty
2549 indir(defn(`defn'), `divnum')
2550 @error{}m4:stdin:1: Warning: indir: invalid macro name ignored
2552 indir(`define', defn(`defn'), `divnum')
2553 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2555 indir(`define', `foo', defn(`divnum'))
2559 indir(`divert', defn(`foo'))
2560 @error{}m4:stdin:5: empty string treated as 0 in builtin `divert'
2565 @section Indirect call of builtins
2567 @cindex indirect call of builtins
2568 @cindex call of builtins, indirect
2569 @cindex builtins, indirect call of
2570 @cindex GNU extensions
2571 Builtin macros can be called indirectly with @code{builtin}:
2573 @deffn Builtin builtin (@var{name}, @ovar{args@dots{}})
2574 Results in a call to the builtin @var{name}, which is passed the
2575 rest of the arguments @var{args}. If @var{name} does not name a
2576 builtin, an error message is printed, and the expansion is void.
2578 The macro @code{builtin} is recognized only with parameters.
2581 This can be used even if @var{name} has been given another definition
2582 that has covered the original, or been undefined so that no macro
2583 maps to the builtin.
2586 pushdef(`define', `hidden')
2588 undefine(`undefine')
2590 define(`foo', `bar')
2594 builtin(`define', `foo', defn(`divnum'))
2598 builtin(`define', `foo', `BAR')
2603 @result{}undefine(foo)
2606 builtin(`undefine', `foo')
2612 The @var{name} argument only matches the original name of the builtin,
2613 even when the @option{--prefix-builtins} option (or @option{-P},
2614 @pxref{Operation modes, , Invoking m4}) is in effect. This is different
2615 from @code{indir}, which only tracks current macro names.
2617 @comment options: -P
2620 m4_builtin(`divnum')
2622 m4_builtin(`m4_divnum')
2623 @error{}m4:stdin:2: undefined builtin `m4_divnum'
2626 @error{}m4:stdin:3: undefined macro `divnum'
2628 m4_indir(`m4_divnum')
2632 Note that @code{indir} and @code{builtin} can be used to invoke builtins
2633 without arguments, even when they normally require parameters to be
2634 recognized; but it will provoke a warning, and result in a void expansion.
2640 @error{}m4:stdin:2: undefined builtin `'
2643 @error{}m4:stdin:3: Warning: too few arguments to builtin `builtin'
2646 @error{}m4:stdin:4: undefined builtin `'
2648 builtin(`builtin', ``'
2650 @error{}m4:stdin:5: undefined builtin ``'
2654 @error{}m4:stdin:7: Warning: too few arguments to builtin `index'
2659 @comment This example is not worth putting in the manual, but it is
2660 @comment needed for full coverage. Autoconf's m4_include relies heavily
2661 @comment on this feature.
2664 builtin(`include', `foo')dnl
2668 @comment And this example triggers a regression present in 1.4.10b.
2671 define(`s', `builtin(`shift', $@@)')dnl
2672 define(`loop', `ifelse(`$2', `', `-', `$1$2: $0(`$1', s(s($@@)))')')dnl
2679 loop(`1', `2', `3', `4')
2680 @result{}12: 13: 14: -
2681 loop(`1', `2', `3', `4', `5')
2682 @result{}12: 13: 14: 15: -
2687 @chapter Conditionals, loops, and recursion
2689 Macros, expanding to plain text, perhaps with arguments, are not quite
2690 enough. We would like to have macros expand to different things, based
2691 on decisions taken at run-time. For that, we need some kind of conditionals.
2692 Also, we would like to have some kind of loop construct, so we could do
2693 something a number of times, or while some condition is true.
2696 * Ifdef:: Testing if a macro is defined
2697 * Ifelse:: If-else construct, or multibranch
2698 * Shift:: Recursion in @code{m4}
2699 * Forloop:: Iteration by counting
2700 * Foreach:: Iteration by list contents
2701 * Stacks:: Working with definition stacks
2702 * Composition:: Building macros with macros
2706 @section Testing if a macro is defined
2708 @cindex conditionals
2709 There are two different builtin conditionals in @code{m4}. The first is
2712 @deffn Builtin ifdef (@var{name}, @var{string-1}, @ovar{string-2})
2713 If @var{name} is defined as a macro, @code{ifdef} expands to
2714 @var{string-1}, otherwise to @var{string-2}. If @var{string-2} is
2715 omitted, it is taken to be the empty string (according to the normal
2718 The macro @code{ifdef} is recognized only with parameters.
2722 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
2723 @result{}foo is not defined
2726 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
2727 @result{}foo is defined
2728 ifdef(`no_such_macro', `yes', `no', `extra argument')
2729 @error{}m4:stdin:4: Warning: excess arguments to builtin `ifdef' ignored
2734 @section If-else construct, or multibranch
2736 @cindex comparing strings
2737 @cindex discarding input
2738 @cindex input, discarding
2739 The other conditional, @code{ifelse}, is much more powerful. It can be
2740 used as a way to introduce a long comment, as an if-else construct, or
2741 as a multibranch, depending on the number of arguments supplied:
2743 @deffn Builtin ifelse (@var{comment})
2744 @deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
2746 @deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
2747 @var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal})
2748 Used with only one argument, the @code{ifelse} simply discards it and
2751 If called with three or four arguments, @code{ifelse} expands into
2752 @var{equal}, if @var{string-1} and @var{string-2} are equal (character
2753 for character), otherwise it expands to @var{not-equal}. A final fifth
2754 argument is ignored, after triggering a warning.
2756 If called with six or more arguments, and @var{string-1} and
2757 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
2758 otherwise the first three arguments are discarded and the processing
2761 The macro @code{ifelse} is recognized only with parameters.
2764 Using only one argument is a common @code{m4} idiom for introducing a
2765 block comment, as an alternative to repeatedly using @code{dnl}. This
2766 special usage is recognized by GNU @code{m4}, so that in this
2767 case, the warning about missing arguments is never triggered.
2770 ifelse(`some comments')
2772 ifelse(`foo', `bar')
2773 @error{}m4:stdin:2: Warning: too few arguments to builtin `ifelse'
2777 Using three or four arguments provides decision points.
2780 ifelse(`foo', `bar', `true')
2782 ifelse(`foo', `foo', `true')
2784 define(`foo', `bar')
2786 ifelse(foo, `bar', `true', `false')
2788 ifelse(foo, `foo', `true', `false')
2792 @cindex macro, blind
2794 Notice how the first argument was used unquoted; it is common to compare
2795 the expansion of a macro with a string. With this macro, you can now
2796 reproduce the behavior of blind builtins, where the macro is recognized
2797 only with arguments.
2800 define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
2805 @result{}arguments:1
2807 @result{}arguments:3
2810 For an example of a way to make defining blind macros easier, see
2813 @cindex multibranches
2814 @cindex switch statement
2815 @cindex case statement
2816 The macro @code{ifelse} can take more than four arguments. If given more
2817 than four arguments, @code{ifelse} works like a @code{case} or @code{switch}
2818 statement in traditional programming languages. If @var{string-1} and
2819 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1}, otherwise
2820 the procedure is repeated with the first three arguments discarded. This
2821 calls for an example:
2824 ifelse(`foo', `bar', `third', `gnu', `gnats')
2825 @error{}m4:stdin:1: Warning: excess arguments to builtin `ifelse' ignored
2827 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
2829 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
2831 ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
2832 @error{}m4:stdin:4: Warning: excess arguments to builtin `ifelse' ignored
2837 @comment Stress tests, not worth documenting.
2839 @comment Ensure that references compared to strings work regardless of
2840 @comment similar prefixes.
2842 define(`e', `$@@')define(`long', `01234567890123456789')
2844 ifelse(long, `01234567890123456789', `yes', `no')
2846 ifelse(`01234567890123456789', long, `yes', `no')
2848 ifelse(long, `01234567890123456789-', `yes', `no')
2850 ifelse(`01234567890123456789-', long, `yes', `no')
2852 ifelse(e(long), `01234567890123456789', `yes', `no')
2854 ifelse(`01234567890123456789', e(long), `yes', `no')
2856 ifelse(e(long), `01234567890123456789-', `yes', `no')
2858 ifelse(`01234567890123456789-', e(long), `yes', `no')
2860 ifelse(-e(long), `-01234567890123456789', `yes', `no')
2862 ifelse(-`01234567890123456789', -e(long), `yes', `no')
2864 ifelse(-e(long), `-01234567890123456789-', `yes', `no')
2866 ifelse(`-01234567890123456789-', -e(long), `yes', `no')
2868 ifelse(-e(long)-, `-01234567890123456789-', `yes', `no')
2870 ifelse(-`01234567890123456789-', -e(long)-, `yes', `no')
2872 ifelse(-e(long)-, `-01234567890123456789', `yes', `no')
2874 ifelse(`-01234567890123456789', -e(long)-, `yes', `no')
2876 ifelse(`-'e(long), `-01234567890123456789', `yes', `no')
2878 ifelse(-`01234567890123456789', `-'e(long), `yes', `no')
2880 ifelse(`-'e(long), `-01234567890123456789-', `yes', `no')
2882 ifelse(`-01234567890123456789-', `-'e(long), `yes', `no')
2884 ifelse(`-'e(long)`-', `-01234567890123456789-', `yes', `no')
2886 ifelse(-`01234567890123456789-', `-'e(long)`-', `yes', `no')
2888 ifelse(`-'e(long)`-', `-01234567890123456789', `yes', `no')
2890 ifelse(`-01234567890123456789', `-'e(long)`-', `yes', `no')
2895 Naturally, the normal case will be slightly more advanced than these
2896 examples. A common use of @code{ifelse} is in macros implementing loops
2900 @section Recursion in @code{m4}
2902 @cindex recursive macros
2903 @cindex macros, recursive
2904 There is no direct support for loops in @code{m4}, but macros can be
2905 recursive. There is no limit on the number of recursion levels, other
2906 than those enforced by your hardware and operating system.
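
As a minimal sketch of a recursive macro (the name @code{countdown} is
hypothetical, and its argument is assumed to be a non-negative integer),
the following definition calls itself through @code{$0} with a
decremented argument until the argument reaches zero:

@example
define(`countdown',
  `ifelse(`$1', `0', `liftoff', `$1 $0(decr(`$1'))')')
@result{}
countdown(`3')
@result{}3 2 1 liftoff
@end example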
2909 Loops can be programmed using recursion and the conditionals described
2912 There is a builtin macro, @code{shift}, which can, among other things,
2913 be used for iterating through the actual arguments to a macro:
2915 @deffn Builtin shift (@var{arg1}, @dots{})
2916 Takes any number of arguments, and expands to all its arguments except
2917 @var{arg1}, separated by commas, with each argument quoted.
2919 The macro @code{shift} is recognized only with parameters.
2927 shift(`foo', `bar', `baz')
2931 An example of the use of @code{shift} is this macro:
2933 @cindex reversing arguments
2934 @cindex arguments, reversing
2935 @deffn Composite reverse (@dots{})
2936 Takes any number of arguments, and reverses their order.
2939 It is implemented as:
2942 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
2943 `reverse(shift($@@)), `$1'')')
2949 reverse(`foo', `bar', `gnats', `and gnus')
2950 @result{}and gnus, gnats, bar, foo
2953 While not a very interesting macro, it does show how simple loops can be
2954 made with @code{shift}, @code{ifelse} and recursion. It also shows
2955 that @code{shift} is usually used with @samp{$@@}. Another example of
2956 this is an implementation of a short-circuiting conditional operator.
2958 @cindex short-circuiting conditional
2959 @cindex conditional, short-circuiting
2960 @deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @
2961 @ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal})
2962 Similar to @code{ifelse}, where an equal comparison between the first
2963 two strings results in the third, otherwise the first three arguments
2964 are discarded and the process repeats. The difference is that each
2965 @var{test-<n>} is expanded only when it is encountered. This means that
2966 every third argument to @code{cond} is normally given one more level of
2967 quoting than the corresponding argument to @code{ifelse}.
2970 Here is the implementation of @code{cond}, along with a demonstration of
2971 how it can short-circuit the side effects in @code{side}. Notice how
2972 all the unquoted side effects happen regardless of how many comparisons
2973 are made with @code{ifelse}, compared with only the relevant effects
2978 `ifelse(`$#', `1', `$1',
2979 `ifelse($1, `$2', `$3',
2980 `$0(shift(shift(shift($@@))))')')')dnl
2981 define(`side', `define(`counter', incr(counter))$1')dnl
2983 `define(`counter', `0')dnl
2984 ifelse(side(`$1'), `yes', `one comparison: ',
2985 side(`$1'), `no', `two comparisons: ',
2986 side(`$1'), `maybe', `three comparisons: ',
2987 `side(`default answer: ')')counter')dnl
2989 `define(`counter', `0')dnl
2990 cond(`side(`$1')', `yes', `one comparison: ',
2991 `side(`$1')', `no', `two comparisons: ',
2992 `side(`$1')', `maybe', `three comparisons: ',
2993 `side(`default answer: ')')counter')dnl
2995 @result{}one comparison: 3
2997 @result{}two comparisons: 3
2999 @result{}three comparisons: 3
3000 example1(`feeling rather indecisive today')
3001 @result{}default answer: 4
3003 @result{}one comparison: 1
3005 @result{}two comparisons: 2
3007 @result{}three comparisons: 3
3008 example2(`feeling rather indecisive today')
3009 @result{}default answer: 4
3012 @cindex joining arguments
3013 @cindex arguments, joining
3014 @cindex concatenating arguments
3015 Another common task that requires iteration is joining a list of
3016 arguments into a single string.
3018 @deffn Composite join (@ovar{separator}, @ovar{args@dots{}})
3019 @deffnx Composite joinall (@ovar{separator}, @ovar{args@dots{}})
3020 Generate a single-quoted string, consisting of each @var{arg} separated
3021 by @var{separator}. While @code{joinall} always outputs a
3022 @var{separator} between arguments, @code{join} avoids the
3023 @var{separator} for an empty @var{arg}.
Here are some examples of their usage, based on the implementation
3027 @file{m4-@value{VERSION}/@/examples/@/join.m4} distributed in this
3032 $ @kbd{m4 -I examples}
3035 join,join(`-'),join(`-', `'),join(`-', `', `')
3037 joinall,joinall(`-'),joinall(`-', `'),joinall(`-', `', `')
3041 join(`-', `1', `2', `3')
3043 join(`', `1', `2', `3')
3045 join(`-', `', `1', `', `', `2', `')
3047 joinall(`-', `', `1', `', `', `2', `')
3049 join(`,', `1', `2', `3')
3051 define(`nargs', `$#')dnl
3052 nargs(join(`,', `1', `2', `3'))
3056 Examining the implementation shows some interesting points about several
3057 m4 programming idioms.
3061 $ @kbd{m4 -I examples}
3062 undivert(`join.m4')dnl
3063 @result{}divert(`-1')
3064 @result{}# join(sep, args) - join each non-empty ARG into a single
3065 @result{}# string, with each element separated by SEP
3066 @result{}define(`join',
3067 @result{}`ifelse(`$#', `2', ``$2'',
3068 @result{} `ifelse(`$2', `', `', ``$2'_')$0(`$1', shift(shift($@@)))')')
3069 @result{}define(`_join',
3070 @result{}`ifelse(`$#$2', `2', `',
3071 @result{} `ifelse(`$2', `', `', ``$1$2'')$0(`$1', shift(shift($@@)))')')
3072 @result{}# joinall(sep, args) - join each ARG, including empty ones,
3073 @result{}# into a single string, with each element separated by SEP
3074 @result{}define(`joinall', ``$2'_$0(`$1', shift($@@))')
3075 @result{}define(`_joinall',
3076 @result{}`ifelse(`$#', `2', `', ``$1$3'$0(`$1', shift(shift($@@)))')')
3077 @result{}divert`'dnl
3080 First, notice that this implementation creates helper macros
3081 @code{_join} and @code{_joinall}. This division of labor makes it
3082 easier to output the correct number of @var{separator} instances:
3083 @code{join} and @code{joinall} are responsible for the first argument,
3084 without a separator, while @code{_join} and @code{_joinall} are
3085 responsible for all remaining arguments, always outputting a separator
3086 when outputting an argument.
Next, observe how @code{join} decides whether to iterate to itself, because
the first @var{arg} was empty, or to output the argument and swap over to
3090 @code{_join}. If the argument is non-empty, then the nested
3091 @code{ifelse} results in an unquoted @samp{_}, which is concatenated
3092 with the @samp{$0} to form the next macro name to invoke. The
3093 @code{joinall} implementation is simpler since it does not have to
3094 suppress empty @var{arg}; it always executes once then defers to
3097 Another important idiom is the idea that @var{separator} is reused for
3098 each iteration. Each iteration has one less argument, but rather than
3099 discarding @samp{$1} by iterating with @code{$0(shift($@@))}, the macro
3100 discards @samp{$2} by using @code{$0(`$1', shift(shift($@@)))}.
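
As a minimal sketch of a single step of that idiom, the two hypothetical
macros below (@code{drop_first} and @code{drop_second} are names
invented for this illustration, not part of the distributed examples)
show what each form of the recursive argument list expands to:

@example
define(`drop_first', `shift($@@)')dnl
define(`drop_second', ``$1',shift(shift($@@))')dnl
drop_first(`-', `a', `b', `c')
@result{}a,b,c
drop_second(`-', `a', `b', `c')
@result{}-,b,c
@end example

@noindent
Only the second form keeps the separator in the first position while
consuming the element that was just processed.
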
3102 Next, notice that it is possible to compare more than one condition in a
3103 single @code{ifelse} test. The test of @samp{$#$2} against @samp{2}
3104 allows @code{_join} to iterate for two separate reasons---either there
3105 are still more than two arguments, or there are exactly two arguments
3106 but the last argument is not empty.
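
A minimal sketch of the same concatenation trick in isolation, using a
hypothetical macro @code{last_empty} (a name chosen for this
illustration only): the single comparison succeeds only when there are
exactly two arguments @emph{and} the second one is empty.

@example
define(`last_empty', `ifelse(`$#$2', `2', `yes', `no')')dnl
last_empty(`-', `')
@result{}yes
last_empty(`-', `x')
@result{}no
last_empty(`-', `', `x')
@result{}no
@end example
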
3108 Finally, notice that these macros require exactly two arguments to
3109 terminate recursion, but that they still correctly result in empty
3110 output when given no @var{args} (i.e., zero or one macro argument). On
3111 the first pass when there are too few arguments, the @code{shift}
3112 results in no output, but leaves an empty string to serve as the
3113 required second argument for the second pass. Put another way,
3114 @samp{`$1', shift($@@)} is not the same as @samp{$@@}, since only the
3115 former guarantees at least two arguments.
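
The difference can be observed directly by counting arguments. In this
sketch, @code{count}, @code{as_is} and @code{padded} are hypothetical
helpers introduced only for the demonstration:

@example
define(`count', `$#')dnl
define(`as_is', `count($@@)')dnl
define(`padded', `count(`$1', shift($@@))')dnl
as_is()-as_is(`a')-as_is(`a', `b')
@result{}1-1-2
padded()-padded(`a')-padded(`a', `b')
@result{}2-2-2
@end example
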
3117 @cindex quote manipulation
3118 @cindex manipulating quotes
3119 Sometimes, a recursive algorithm requires adding quotes to each element,
3120 or treating multiple arguments as a single element:
3122 @deffn Composite quote (@dots{})
3123 @deffnx Composite dquote (@dots{})
3124 @deffnx Composite dquote_elt (@dots{})
3125 Takes any number of arguments, and adds quoting. With @code{quote},
3126 only one level of quoting is added, effectively removing whitespace
3127 after commas and turning multiple arguments into a single string. With
3128 @code{dquote}, two levels of quoting are added, one around each element,
3129 and one around the list. And with @code{dquote_elt}, two levels of
3130 quoting are added around each element.
3133 An actual implementation of these three macros is distributed as
3134 @file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package. First,
3135 let's examine their usage:
3139 $ @kbd{m4 -I examples}
3142 -quote-dquote-dquote_elt-
3144 -quote()-dquote()-dquote_elt()-
3146 -quote(`1')-dquote(`1')-dquote_elt(`1')-
3147 @result{}-1-`1'-`1'-
3148 -quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
3149 @result{}-1,2-`1',`2'-`1',`2'-
3150 define(`n', `$#')dnl
3151 -n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
3153 dquote(dquote_elt(`1', `2'))
3154 @result{}``1'',``2''
3155 dquote_elt(dquote(`1', `2'))
3159 The last two lines show that when given two arguments, @code{dquote}
3160 results in one string, while @code{dquote_elt} results in two. Now,
3161 examine the implementation. Note that @code{quote} and
3162 @code{dquote_elt} make decisions based on their number of arguments, so
3163 that when called without arguments, they result in nothing instead of a
3164 quoted empty string; this is so that it is possible to distinguish
3165 between no arguments and an empty first argument. @code{dquote}, on the
3166 other hand, results in a string no matter what, since it is still
3167 possible to tell whether it was invoked without arguments based on the
3172 $ @kbd{m4 -I examples}
3173 undivert(`quote.m4')dnl
3174 @result{}divert(`-1')
3175 @result{}# quote(args) - convert args to single-quoted string
3176 @result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
3177 @result{}# dquote(args) - convert args to quoted list of quoted strings
3178 @result{}define(`dquote', ``$@@'')
3179 @result{}# dquote_elt(args) - convert args to list of double-quoted strings
3180 @result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
3181 @result{} ```$1'',$0(shift($@@))')')
3182 @result{}divert`'dnl
3185 It is worth pointing out that @samp{quote(@var{args})} is more efficient
3186 than @samp{joinall(`,', @var{args})} for producing the same output.
3188 @cindex nine arguments, more than
3189 @cindex more than nine arguments
3190 @cindex arguments, more than nine
3191 One more useful macro based on @code{shift} allows portably selecting
3192 an arbitrary argument (usually greater than the ninth argument), without
3193 relying on the GNU extension of multi-digit arguments
3194 (@pxref{Arguments}).
3196 @deffn Composite argn (@var{n}, @dots{})
3197 Expands to argument @var{n} out of the remaining arguments. @var{n}
3198 must be a positive number. Usually invoked as
3199 @samp{argn(`@var{n}',$@@)}.
3202 It is implemented as:
3205 define(`argn', `ifelse(`$1', 1, ``$2'',
3206 `argn(decr(`$1'), shift(shift($@@)))')')
3210 define(`foo', `argn(`11', $@@)')
3212 foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
3217 @section Iteration by counting
3220 @cindex loops, counting
3221 @cindex counting loops
3222 Here is an example of a loop macro that implements a simple for loop.
3224 @deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each integer value from @var{start} to @var{end},
inclusive. For each assignment to @var{iterator}, appends @var{text} to
3228 the expansion of the @code{forloop}. @var{text} may refer to
3229 @var{iterator}. Any definition of @var{iterator} prior to this
3230 invocation is restored.
3233 It can, for example, be used for simple counting:
3237 $ @kbd{m4 -I examples}
3238 include(`forloop.m4')
3240 forloop(`i', `1', `8', `i ')
3241 @result{}1 2 3 4 5 6 7 8@w{ }
3244 For-loops can be nested, like:
3248 $ @kbd{m4 -I examples}
3249 include(`forloop.m4')
3251 forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
3253 @result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
3254 @result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
3255 @result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
3256 @result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
3260 The implementation of the @code{forloop} macro is fairly
3261 straightforward. The @code{forloop} macro itself is simply a wrapper,
3262 which saves the previous definition of the first argument, calls the
3263 internal macro @code{@w{_forloop}}, and re-establishes the saved
3264 definition of the first argument.
3266 The macro @code{@w{_forloop}} expands the fourth argument once, and
3267 tests to see if the iterator has reached the final value. If it has
3268 not finished, it increments the iterator (using the predefined macro
3269 @code{incr}, @pxref{Incr}), and recurses.
3271 Here is an actual implementation of @code{forloop}, distributed as
3272 @file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
3276 $ @kbd{m4 -I examples}
3277 undivert(`forloop.m4')dnl
3278 @result{}divert(`-1')
3279 @result{}# forloop(var, from, to, stmt) - simple version
3280 @result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
3281 @result{}define(`_forloop',
3282 @result{} `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
3283 @result{}divert`'dnl
3286 Notice the careful use of quotes. Certain macro arguments are left
3287 unquoted, each for its own reason. Try to find out @emph{why} these
3288 arguments are left unquoted, and see what happens if they are quoted.
3289 (As presented, these two macros are useful but not very robust for
3290 general use. They lack even basic error handling for cases like
@var{start} greater than @var{end}, @var{end} not numeric, or
3292 @var{iterator} not being a macro name. See if you can improve these
3293 macros; or @pxref{Improved forloop, , Answers}).
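
As one possible starting point, and not necessarily the approach taken
in the answer referenced above, a wrapper can refuse to iterate when
@var{start} exceeds @var{end}. The name @code{forloop_checked} is
hypothetical, and the sketch assumes both bounds are valid @code{eval}
expressions; it does not address the other weaknesses.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
define(`forloop_checked',
  `ifelse(eval(`($2) <= ($3)'), `1', `forloop($@@)')')
@result{}
forloop_checked(`i', `1', `3', `i ')
@result{}1 2 3@w{ }
forloop_checked(`i', `3', `1', `i ')
@result{}
@end example
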
3296 @section Iteration by list contents
3298 @cindex for each loops
3299 @cindex loops, list iteration
3300 @cindex iterating over lists
3301 Here is an example of a loop macro that implements list iteration.
3303 @deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
3304 @deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each value from @var{paren-list} or
@var{quote-list}. In @code{foreach}, @var{paren-list} is a
comma-separated list of elements contained in parentheses. In
@code{foreachq}, @var{quote-list} is a comma-separated list of elements
contained in a quoted string. For each assignment to @var{iterator},
appends @var{text} to the overall expansion. @var{text} may refer to
3312 @var{iterator}. Any definition of @var{iterator} prior to this
3313 invocation is restored.
3316 As an example, this displays each word in a list inside of a sentence,
3317 using an implementation of @code{foreach} distributed as
3318 @file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
3319 in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
3323 $ @kbd{m4 -I examples}
3324 include(`foreach.m4')
3326 foreach(`x', (foo, bar, foobar), `Word was: x
3328 @result{}Word was: foo
3329 @result{}Word was: bar
3330 @result{}Word was: foobar
3331 include(`foreachq.m4')
3333 foreachq(`x', `foo, bar, foobar', `Word was: x
3335 @result{}Word was: foo
3336 @result{}Word was: bar
3337 @result{}Word was: foobar
3340 It is possible to be more complex; each element of the @var{paren-list}
3341 or @var{quote-list} can itself be a list, to pass as further arguments
3342 to a helper macro. This example generates a shell case statement:
3346 $ @kbd{m4 -I examples}
3347 include(`foreach.m4')
3349 define(`_case', ` $1)
3352 define(`_cat', `$1$2')dnl
3355 foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
3356 `_cat(`_case', x)')dnl
3358 @result{} vara=" a";;
3360 @result{} varb=" b";;
3362 @result{} varc=" c";;
3367 The implementation of the @code{foreach} macro is a bit more involved;
3368 it is a wrapper around two helper macros. First, @code{@w{_arg1}} is
3369 needed to grab the first element of a list. Second,
3370 @code{@w{_foreach}} implements the recursion, successively walking
3371 through the original list. Here is a simple implementation of
3376 $ @kbd{m4 -I examples}
3377 undivert(`foreach.m4')dnl
3378 @result{}divert(`-1')
3379 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
3380 @result{}# parenthesized list, simple version
3381 @result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
3382 @result{}define(`_arg1', `$1')
3383 @result{}define(`_foreach', `ifelse(`$2', `()', `',
3384 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
3385 @result{}divert`'dnl
3388 Unfortunately, that implementation is not robust to macro names as list
elements. Each iteration of @code{@w{_foreach}} strips another
3390 layer of quotes, leading to erratic results if list elements are not
3391 already fully expanded. The first cut at implementing @code{foreachq}
3392 takes this into account. Also, when using quoted elements in a
3393 @var{paren-list}, the overall list must be quoted. A @var{quote-list}
3394 has the nice property of requiring fewer characters to create a list
3395 containing the same quoted elements. To see the difference between the
3396 two macros, we attempt to pass double-quoted macro names in a list,
3397 expecting the macro name on output after one layer of quotes is removed
3398 during list iteration and the final layer removed during the final
3403 $ @kbd{m4 -I examples}
3404 define(`a', `1')define(`b', `2')define(`c', `3')
3406 include(`foreach.m4')
3408 include(`foreachq.m4')
3410 foreach(`x', `(``a'', ``(b'', ``c)'')', `x
3417 foreachq(`x', ```a'', ``(b'', ``c)''', `x
3424 Obviously, @code{foreachq} did a better job; here is its implementation:
3428 $ @kbd{m4 -I examples}
3429 undivert(`foreachq.m4')dnl
3430 @result{}include(`quote.m4')dnl
3431 @result{}divert(`-1')
3432 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
3433 @result{}# quoted list, simple version
3434 @result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
3435 @result{}define(`_arg1', `$1')
3436 @result{}define(`_foreachq', `ifelse(quote($2), `', `',
3437 @result{} `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
3438 @result{}divert`'dnl
3441 Notice that @code{@w{_foreachq}} had to use the helper macro
3442 @code{quote} defined earlier (@pxref{Shift}), to ensure that the
3443 embedded @code{ifelse} call does not go haywire if a list element
3444 contains a comma. Unfortunately, this implementation of @code{foreachq}
3445 has its own severe flaw. Whereas the @code{foreach} implementation was
3446 linear, this macro is quadratic in the number of list elements, and is
3447 much more likely to trip up the limit set by the command line option
3448 @option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
3449 Invoking m4}). Additionally, this implementation does not expand
3450 @samp{defn(`@var{iterator}')} very well, when compared with
3455 $ @kbd{m4 -I examples}
3456 include(`foreach.m4')include(`foreachq.m4')
3458 foreach(`name', `(`a', `b')', ` defn(`name')')
3460 foreachq(`name', ``a', `b'', ` defn(`name')')
3461 @result{} _arg1(`a', `b') _arg1(shift(`a', `b'))
3464 It is possible to have robust iteration with linear behavior and sane
3465 @var{iterator} contents for either list style. See if you can learn
3466 from the best elements of both of these implementations to create robust
3467 macros (or @pxref{Improved foreach, , Answers}).
3470 @section Working with definition stacks
3472 @cindex definition stack
3473 @cindex pushdef stack
3474 @cindex stack, macro definition
3475 Thanks to @code{pushdef}, manipulation of a stack is an intrinsic
3476 operation in @code{m4}. Normally, only the topmost definition in a
3477 stack is important, but sometimes, it is desirable to manipulate the
3478 entire definition stack.
3480 @deffn Composite stack_foreach (@var{macro}, @var{action})
3481 @deffnx Composite stack_foreach_lifo (@var{macro}, @var{action})
3482 For each of the @code{pushdef} definitions associated with @var{macro},
3483 invoke the macro @var{action} with a single argument of that definition.
3484 @code{stack_foreach} visits the oldest definition first, while
3485 @code{stack_foreach_lifo} visits the current definition first.
3486 @var{action} should not modify or dereference @var{macro}. There are a
3487 few special macros, such as @code{defn}, which cannot be used as the
3488 @var{macro} parameter.
3491 A sample implementation of these macros is distributed in the file
3492 @file{m4-@value{VERSION}/@/examples/@/stack.m4}.
3496 $ @kbd{m4 -I examples}
3499 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3501 define(`show', ``$1'
3504 stack_foreach(`a', `show')dnl
3508 stack_foreach_lifo(`a', `show')dnl
3514 Now for the implementation. Note the definition of a helper macro,
3515 @code{_stack_reverse}, which destructively swaps the contents of one
3516 stack of definitions into the reverse order in the temporary macro
3517 @samp{tmp-$1}. By calling the helper twice, the original order is
3518 restored back into the macro @samp{$1}; since the operation is
3519 destructive, this explains why @samp{$1} must not be modified or
3520 dereferenced during the traversal. The caller can then inject
3521 additional code to pass the definition currently being visited to
3522 @samp{$2}. The choice of helper names is intentional; since @samp{-} is
3523 not valid as part of a macro name, there is no risk of conflict with a
3524 valid macro name, and the code is guaranteed to use @code{defn} where
3525 necessary. Finally, note that any macro used in the traversal of a
3526 @code{pushdef} stack, such as @code{pushdef} or @code{defn}, cannot be
3527 handled by @code{stack_foreach}, since the macro would temporarily be
3528 undefined during the algorithm.
3532 $ @kbd{m4 -I examples}
3533 undivert(`stack.m4')dnl
3534 @result{}divert(`-1')
3535 @result{}# stack_foreach(macro, action)
3536 @result{}# Invoke ACTION with a single argument of each definition
3537 @result{}# from the definition stack of MACRO, starting with the oldest.
3538 @result{}define(`stack_foreach',
3539 @result{}`_stack_reverse(`$1', `tmp-$1')'dnl
3540 @result{}`_stack_reverse(`tmp-$1', `$1', `$2(defn(`$1'))')')
3541 @result{}# stack_foreach_lifo(macro, action)
3542 @result{}# Invoke ACTION with a single argument of each definition
3543 @result{}# from the definition stack of MACRO, starting with the newest.
3544 @result{}define(`stack_foreach_lifo',
3545 @result{}`_stack_reverse(`$1', `tmp-$1', `$2(defn(`$1'))')'dnl
3546 @result{}`_stack_reverse(`tmp-$1', `$1')')
3547 @result{}define(`_stack_reverse',
3548 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0($@@)')')
3549 @result{}divert`'dnl
3553 @section Building macros with macros
3555 @cindex macro composition
3556 @cindex composing macros
3557 Since m4 is a macro language, it is possible to write macros that
3558 can build other macros. First on the list is a way to automate the
3559 creation of blind macros.
3561 @cindex macro, blind
3563 @deffn Composite define_blind (@var{name}, @ovar{value})
3564 Defines @var{name} as a blind macro, such that @var{name} will expand to
3565 @var{value} only when given explicit arguments. @var{value} should not
3566 be the result of @code{defn} (@pxref{Defn}). This macro is only
3567 recognized with parameters, and results in an empty string.
3570 Defining a macro to define another macro can be a bit tricky. We want
3571 to use a literal @samp{$#} in the argument to the nested @code{define}.
3572 However, if @samp{$} and @samp{#} are adjacent in the definition of
3573 @code{define_blind}, then it would be expanded as the number of
3574 arguments to @code{define_blind} rather than the intended number of
3575 arguments to @var{name}. The solution is to pass the difficult
3576 characters through extra arguments to a helper macro
3577 @code{_define_blind}. When composing macros, it is a common idiom to
3578 need a helper macro to concatenate text that forms parameters in the
3579 composed macro, rather than interpreting the text as a parameter of the
3582 As for the limitation against using @code{defn}, there are two reasons.
3583 If a macro was previously defined with @code{define_blind}, then it can
3584 safely be renamed to a new blind macro using plain @code{define}; using
3585 @code{define_blind} to rename it just adds another layer of
3586 @code{ifelse}, occupying memory and slowing down execution. And if a
3587 macro is a builtin, then it would result in an attempt to define a macro
3588 consisting of both text and a builtin token; this is not supported, and
3589 the builtin token is flattened to an empty string.
3591 With that explanation, here's the definition, and some sample usage.
3592 Notice that @code{define_blind} is itself a blind macro.
3596 define(`define_blind', `ifelse(`$#', `0', ``$0'',
3597 `_$0(`$1', `$2', `$'`#', `$'`0')')')
3599 define(`_define_blind', `define(`$1',
3600 `ifelse(`$3', `0', ``$4'', `$2')')')
3603 @result{}define_blind
3604 define_blind(`foo', `arguments were $*')
3609 @result{}arguments were bar
3610 define(`blah', defn(`foo'))
3615 @result{}arguments were a,b
3617 @result{}ifelse(`$#', `0', ``$0'', `arguments were $*')
3620 @cindex currying arguments
3621 @cindex argument currying
3622 Another interesting composition tactic is argument @dfn{currying}, or
3623 factoring a macro that takes multiple arguments for use in a context
3624 that provides exactly one argument.
3626 @deffn Composite curry (@var{macro}, @dots{})
Expands to a macro call that takes exactly one argument, then appends
3628 that argument to the original arguments and invokes @var{macro} with the
3629 resulting list of arguments.
3632 A demonstration of currying makes the intent of this macro a little more
3633 obvious. The macro @code{stack_foreach} mentioned earlier is an example
3634 of a context that provides exactly one argument to a macro name. But
3635 coupled with currying, we can invoke @code{reverse} with two arguments
3636 for each definition of a macro stack. This example uses the file
3637 @file{m4-@value{VERSION}/@/examples/@/curry.m4} included in the
3642 $ @kbd{m4 -I examples}
3643 include(`curry.m4')include(`stack.m4')
3645 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
3646 `reverse(shift($@@)), `$1'')')
3648 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3650 stack_foreach(`a', `:curry(`reverse', `4')')
3651 @result{}:1, 4:2, 4:3, 4
3652 curry(`curry', `reverse', `1')(`2')(`3')
3656 Now for the implementation. Notice how @code{curry} leaves off with a
3657 macro name but no open parenthesis, while still in the middle of
3658 collecting arguments for @samp{$1}. The macro @code{_curry} is the
3659 helper macro that takes one argument, then adds it to the list and
3660 finally supplies the closing parenthesis. The use of a comma inside the
3661 @code{shift} call allows currying to also work for a macro that takes
3662 one argument, although it often makes more sense to invoke that macro
3663 directly rather than going through @code{curry}.
3667 $ @kbd{m4 -I examples}
3668 undivert(`curry.m4')dnl
3669 @result{}divert(`-1')
3670 @result{}# curry(macro, args)
3671 @result{}# Expand to a macro call that takes one argument, then invoke
3672 @result{}# macro(args, extra).
3673 @result{}define(`curry', `$1(shift($@@,)_$0')
3674 @result{}define(`_curry', ``$1')')
3675 @result{}divert`'dnl
3678 Unfortunately, with M4 1.4.x, @code{curry} is unable to handle builtin
3679 tokens, which are silently flattened to the empty string when passed
3680 through another text macro. This limitation will be lifted in a future
3683 @cindex renaming macros
3684 @cindex copying macros
3685 @cindex macros, copying
3686 Putting the last few concepts together, it is possible to copy or rename
3687 an entire stack of macro definitions.
3689 @deffn Composite copy (@var{source}, @var{dest})
3690 @deffnx Composite rename (@var{source}, @var{dest})
3691 Ensure that @var{dest} is undefined, then define it to the same stack of
3692 definitions currently in @var{source}. @code{copy} leaves @var{source}
3693 unchanged, while @code{rename} undefines @var{source}. There are only a
3694 few macros, such as @code{copy} or @code{defn}, which cannot be copied
3698 The implementation is relatively straightforward (although since it uses
3699 @code{curry}, it is unable to copy builtin macros, such as the second
3700 definition of @code{a} as a synonym for @code{divnum}. See if you can
3701 design a version that works around this limitation, or @pxref{Improved
3706 $ @kbd{m4 -I examples}
3707 include(`curry.m4')include(`stack.m4')
3709 define(`rename', `copy($@@)undefine(`$1')')dnl
3710 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
3712 `stack_foreach(`$1', `curry(`pushdef', `$2')')')')dnl
3713 pushdef(`a', `1')pushdef(`a', defn(`divnum'))pushdef(`a', `2')
3728 @chapter How to debug macros and input
3730 @cindex debugging macros
3731 @cindex macros, debugging
Macros written for @code{m4} often do not work as intended on
the first try (as is the case with most programming languages).
3734 Fortunately, there is support for macro debugging in @code{m4}.
3737 * Dumpdef:: Displaying macro definitions
3738 * Trace:: Tracing macro calls
3739 * Debug Levels:: Controlling debugging output
3740 * Debug Output:: Saving debugging output
3744 @section Displaying macro definitions
3746 @cindex displaying macro definitions
3747 @cindex macros, displaying definitions
3748 @cindex definitions, displaying macro
3749 @cindex standard error, output to
3750 If you want to see what a name expands into, you can use the builtin
3753 @deffn Builtin dumpdef (@ovar{names@dots{}})
Accepts any number of arguments. If called without any arguments,
it displays the definitions of all known names; otherwise it displays
3756 the definitions of the @var{names} given. The output is printed to the
3757 current debug file (usually standard error), and is sorted by name. If
3758 an unknown name is encountered, a warning is printed.
3760 The expansion of @code{dumpdef} is void.
3765 define(`foo', `Hello world.')
3768 @error{}foo:@tabchar{}`Hello world.'
3771 @error{}define:@tabchar{}<define>
The last example shows how builtin macro definitions are displayed.
3776 The definition that is dumped corresponds to what would occur if the
3777 macro were to be called at that point, even if other definitions are
3778 still live due to redefining a macro during argument collection.
3782 pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
3784 f(popdef(`f')dumpdef(`f'))
3785 @error{}f:@tabchar{}``$0'1'
3787 f(popdef(`f')dumpdef(`f'))
3788 @error{}m4:stdin:3: undefined macro `f'
3792 @xref{Debug Levels}, for information on controlling the details of the
3796 @section Tracing macro calls
3798 @cindex tracing macro expansion
3799 @cindex macro expansion, tracing
3800 @cindex expansion, tracing macro
3801 @cindex standard error, output to
3802 It is possible to trace macro calls and expansions through the builtins
3803 @code{traceon} and @code{traceoff}:
3805 @deffn Builtin traceon (@ovar{names@dots{}})
3806 @deffnx Builtin traceoff (@ovar{names@dots{}})
3807 When called without any arguments, @code{traceon} and @code{traceoff}
3808 will turn tracing on and off, respectively, for all currently defined
3811 When called with arguments, only the macros listed in @var{names} are
3812 affected, whether or not they are currently defined.
3814 The expansion of @code{traceon} and @code{traceoff} is void.
3817 Whenever a traced macro is called and the arguments have been collected,
3818 the call is displayed. If the expansion of the macro call is not void,
3819 the expansion can be displayed after the call. The output is printed
3820 to the current debug file (defaulting to standard error, @pxref{Debug
3825 define(`foo', `Hello World.')
3827 define(`echo', `$@@')
3829 traceon(`foo', `echo')
3832 @error{}m4trace: -1- foo -> `Hello World.'
3833 @result{}Hello World.
3834 echo(`gnus', `and gnats')
3835 @error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
3836 @result{}gnus,and gnats
3839 The number between dashes is the depth of the expansion. It is one most
3840 of the time, signifying an expansion at the outermost level, but it
3841 increases when macro arguments contain unquoted macro calls. The
3842 maximum number that will appear between dashes is controlled by the
3843 option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
3844 , Invoking m4}). Additionally, the option @option{--trace} (or
3845 @option{-t}) can be used to invoke @code{traceon(@var{name})} before
3848 @comment The explicit -dp neutralizes the testsuite default of -d.
3849 @comment options: -dp -L3 -tifelse
3852 $ @kbd{m4 -L 3 -t ifelse}
3854 @error{}m4trace: -1- ifelse
3856 ifelse(ifelse(ifelse(`three levels')))
3857 @error{}m4trace: -3- ifelse
3858 @error{}m4trace: -2- ifelse
3859 @error{}m4trace: -1- ifelse
3861 ifelse(ifelse(ifelse(ifelse(`four levels'))))
3862 @error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
3865 Tracing by name is an attribute that is preserved whether the macro is
3866 defined or not. This allows the selection of macros to trace before
3867 those macros are defined.
3879 define(`foo', `bar')
3882 @error{}m4trace: -1- foo -> `bar'
3886 ifdef(`foo', `yes', `no')
3889 @error{}m4:stdin:9: undefined macro `foo'
3891 define(`foo', `blah')
3894 @error{}m4trace: -1- foo -> `blah'
3902 Tracing even works on builtins. However, @code{defn} (@pxref{Defn})
3903 does not transfer tracing status.
3910 @error{}m4trace: -1- traceon(`traceoff')
3912 traceoff(`traceoff')
3913 @error{}m4trace: -1- traceoff(`traceoff')
3917 traceon(`eval', `m4_divnum')
3919 define(`m4_eval', defn(`eval'))
3921 define(`m4_divnum', defn(`divnum'))
3924 @error{}m4trace: -1- eval(`0') -> `0'
3927 @error{}m4trace: -2- m4_divnum -> `0'
3931 @xref{Debug Levels}, for information on controlling the details of the
3932 display. The format of the trace output is not specified by
3933 POSIX, and varies between implementations of @code{m4}.
3936 @comment not worth including in the manual, but this tests a trace code
3937 @comment path that was temporarily broken
3938 @comment options: -de --trace ifelse
3940 $ @kbd{m4 -de --trace ifelse}
3941 define(`e', `ifelse(`$1', `$2', `ifelse(`$1', `$2', `e(shift($@@))')')')
3944 @error{}m4trace: -1- ifelse -> ifelse(`1', `1', `e(shift(`1',`1'))')
3945 @error{}m4trace: -1- ifelse -> e(shift(`1',`1'))
3946 @error{}m4trace: -1- ifelse
3952 @section Controlling debugging output
3954 @cindex controlling debugging output
3955 @cindex debugging output, controlling
The @option{-d} option to @code{m4} (or @option{--debug},
@pxref{Debugging options, , Invoking m4}) controls the amount of detail
presented in three categories of output. Trace output is requested by
@code{traceon} (@pxref{Trace}), and each line is prefixed by
@samp{m4trace:} in relation to a macro invocation. Debug output tracks
useful events not associated with a macro invocation, and each line is
prefixed by @samp{m4debug:}. Finally, @code{dumpdef} (@pxref{Dumpdef})
output is affected, with no prefix added to the output lines.
The @var{flags} following the option can be one or more of the
following:

@table @code
@item a
In trace output, show the actual arguments that were collected before
invoking the macro. This applies to all macro calls if the @samp{t}
flag is used, otherwise only the macros covered by calls of
@code{traceon}. Arguments are subject to length truncation specified by
the command line option @option{--arglength} (or @option{-l}).

@item c
In trace output, show several trace lines for each macro call. A line
is shown when the macro is seen, but before the arguments are collected;
a second line when the arguments have been collected, and a third line
after the call has completed.

@item e
In trace output, show the expansion of each macro call, if it is not
void. This applies to all macro calls if the @samp{t} flag is used,
otherwise only the macros covered by calls of @code{traceon}. The
expansion is subject to length truncation specified by the command line
option @option{--arglength} (or @option{-l}).

@item f
In debug and trace output, include the name of the current input file in
the output line.

@item i
In debug output, print a message each time the current input file is
changed.

@item l
In debug and trace output, include the current input line number in the
output line.

@item p
In debug output, print a message when a named file is found through the
path search mechanism (@pxref{Search Path}), giving the actual file name
used.

@item q
In trace and dumpdef output, quote actual arguments and macro expansions
in the display with the current quotes. This is useful in connection
with the @samp{a} and @samp{e} flags above.

@item t
In trace output, trace all macro calls made in this invocation of
@code{m4}, regardless of the settings of @code{traceon}.

@item x
In trace output, add a unique `macro call id' to each line of the trace
output. This is useful in connection with the @samp{c} flag above.

@item V
A shorthand for all of the above flags.
@end table
If no flags are specified with the @option{-d} option, the default is
@samp{aeq}. The examples throughout this manual assume the default
flags.
4028 @cindex GNU extensions
4029 There is a builtin macro @code{debugmode}, which allows on-the-fly control of
4030 the debugging output format:
4032 @deffn Builtin debugmode (@ovar{flags})
4033 The argument @var{flags} should be a subset of the letters listed above.
4034 As special cases, if the argument starts with a @samp{+}, the flags are
4035 added to the current debug flags, and if it starts with a @samp{-}, they
4036 are removed. If no argument is present, all debugging flags are cleared
4037 (as if no @option{-d} was given), and with an empty argument the flags
4038 are reset to the default of @samp{aeq}.
4040 The expansion of @code{debugmode} is void.
4043 @comment The explicit -dp neutralizes the testsuite default of -d.
4044 @comment options: -dp
4047 define(`foo', `FOO')
4054 @error{}m4trace: -1- foo -> `FOO'
4059 @error{}m4trace: -1- foo
4064 @error{}m4trace:8: -1- foo
4068 The following example demonstrates the behavior of length truncation,
4069 when specified on the command line. Note that each argument and the
4070 final result are individually truncated. Also, the special tokens for
4071 builtin functions are not truncated.
4073 @comment options: -l6
4076 define(`echo', `$@@')debugmode(`+t')
4078 echo(`1', `long string')
4079 @error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
4080 @result{}1,long string
4081 indir(`echo', defn(`changequote'))
4082 @error{}m4trace: -2- defn(`change...')
4083 @error{}m4trace: -1- indir(`echo', <changequote>) -> ``''
This example shows the effects of the debug flags that are not related
to macro tracing.
4091 @comment options: -dip
4093 $ @kbd{m4 -dip -I examples}
4094 @error{}m4debug: input read from stdin
4096 @error{}m4debug: path search for `foo' found `examples/foo'
4097 @error{}m4debug: input read from examples/foo
4099 @error{}m4debug: input reverted to stdin, line 1
4101 @error{}m4debug: input exhausted
4105 @section Saving debugging output
4107 @cindex saving debugging output
4108 @cindex debugging output, saving
4109 @cindex output, saving debugging
4110 @cindex GNU extensions
4111 Debug and tracing output can be redirected to files using either the
4112 @option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
4113 Invoking m4}), or with the builtin macro @code{debugfile}:
4115 @deffn Builtin debugfile (@ovar{file})
4116 Sends all further debug and trace output to @var{file}, opened in append
4117 mode. If @var{file} is the empty string, debug and trace output are
4118 discarded. If @code{debugfile} is called without any arguments, debug
4119 and trace output are sent to standard error. This does not affect
4120 warnings, error messages, or @code{errprint} output, which are
4121 always sent to standard error. If @var{file} cannot be opened, the
4122 current debug file is unchanged, and an error is issued.
4124 The expansion of @code{debugfile} is void.
4132 @error{}m4:stdin:2: Warning: excess arguments to builtin `divnum' ignored
4133 @error{}m4trace: -1- divnum(`extra') -> `0'
4138 @error{}m4:stdin:4: Warning: excess arguments to builtin `divnum' ignored
4143 @error{}m4trace: -1- divnum -> `0'
4148 @chapter Input control
This chapter describes various builtin macros for controlling the input
to @code{m4}.
4154 * Dnl:: Deleting whitespace in input
4155 * Changequote:: Changing the quote characters
4156 * Changecom:: Changing the comment delimiters
4157 * Changeword:: Changing the lexical structure of words
4158 * M4wrap:: Saving text until end of input
4162 @section Deleting whitespace in input
4164 @cindex deleting whitespace in input
4165 @cindex discarding input
4166 @cindex input, discarding
4167 The builtin @code{dnl} stands for ``Discard to Next Line'':
4170 All characters, up to and including the next newline, are discarded
4171 without performing any macro expansion. A warning is issued if the end
4172 of the file is encountered without a newline.
4174 The expansion of @code{dnl} is void.
4177 It is often used in connection with @code{define}, to remove the
4178 newline that follows the call to @code{define}. Thus
4181 define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
4186 The input up to and including the next newline is discarded, as opposed
4187 to the way comments are treated (@pxref{Comments}).
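
As a minimal sketch of the difference (the sample text is invented for
illustration, and the default comment delimiters are assumed),
@code{dnl} swallows the rest of its line together with the newline,
whereas a comment is copied to the output, newline included:

@example
Line one dnl this text and the newline vanish
Line two # this comment and its newline survive
@result{}Line one Line two # this comment and its newline survive
Line three
@result{}Line three
@end example
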
Usually, @code{dnl} is immediately followed by an end of line or some
other whitespace. GNU @code{m4} will produce a warning diagnostic if
@code{dnl} is followed by an open parenthesis. In this case, @code{dnl}
will collect and process all arguments, looking for a matching close
parenthesis. All predictable side effects resulting from this
collection will take place. @code{dnl} will return no output. The
input following the matching close parenthesis up to and including the
next newline, on whichever line contains it, will still be discarded.
4199 dnl(`args are ignored, but side effects occur',
4200 define(`foo', `like this')) while this text is ignored: undefine(`foo')
4201 @error{}m4:stdin:1: Warning: excess arguments to builtin `dnl' ignored
4202 See how `foo' was defined, foo?
4203 @result{}See how foo was defined, like this?
If the end of file is encountered without a newline character, a
warning is issued and @code{dnl} stops consuming input.
4210 m4wrap(`m4wrap(`2 hi
4216 @error{}m4:stdin:1: Warning: end of file treated as newline
4221 @section Changing the quote characters
4223 @cindex changing quote delimiters
4224 @cindex quote delimiters, changing
4225 @cindex delimiters, changing
The default quote delimiters can be changed with the builtin
@code{changequote}:
4229 @deffn Builtin changequote (@dvar{start, `}, @dvar{end, '})
4230 This sets @var{start} as the new begin-quote delimiter and @var{end} as
4231 the new end-quote delimiter. If both arguments are missing, the default
4232 quotes (@code{`} and @code{'}) are used. If @var{start} is void, then
4233 quoting is disabled. Otherwise, if @var{end} is missing or void, the
4234 default end-quote delimiter (@code{'}) is used. The quote delimiters
4235 can be of any length.
4237 The expansion of @code{changequote} is void.
4241 changequote(`[', `]')
4243 define([foo], [Macro [foo].])
4249 The quotation strings can safely contain non-@sc{ascii} characters.
4256 changequote(`«', `»')
4262 If no single character is appropriate, @var{start} and @var{end} can be
4263 of any length. Other implementations cap the delimiter length to five
4264 characters, but GNU has no inherent limit.
4267 changequote(`[[[', `]]]')
4269 define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
4272 @result{}Macro [[foo]].
4275 Calling @code{changequote} with @var{start} as the empty string will
4276 effectively disable the quoting mechanism, leaving no way to quote text.
4277 However, using an empty string is not portable, as some other
4278 implementations of @code{m4} revert to the default quoting, while others
4279 preserve the prior non-empty delimiter. If @var{start} is not empty,
4280 then an empty @var{end} will use the default end-quote delimiter of
4281 @samp{'}, as otherwise, it would be impossible to end a quoted string.
4282 Again, this is not portable, as some other @code{m4} implementations
4283 reuse @var{start} as the end-quote delimiter, while others preserve the
4284 previous non-empty value. Omitting both arguments restores the default
4285 begin-quote and end-quote delimiters; fortunately this behavior is
4286 portable to all implementations of @code{m4}.
4289 define(`foo', `Macro `FOO'.')
4294 @result{}Macro `FOO'.
4296 @result{}`Macro `FOO'.'
There is no way in @code{m4} to quote a string containing an unmatched
begin-quote, except using @code{changequote} to change the current
quotes.
4307 If the quotes should be changed from, say, @samp{[} to @samp{[[},
4308 temporary quote characters have to be defined. To achieve this, two
4309 calls of @code{changequote} must be made, one for the temporary quotes
4310 and one for the new quotes.
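
A minimal sketch of that two-step change follows. It assumes that the
default quotes can serve as the temporary pair, and the macro name
@code{bar} is made up for illustration. The bare @code{changequote}
restores the default quotes, and the empty quoted string merely
separates the two calls:

@example
changequote(`[', `]')
@result{}
changequote`'changequote(`[[', `]]')
@result{}
define([[bar]], [[Text in [[double brackets]] is quoted, [single] ones are literal.]])
@result{}
bar
@result{}Text in double brackets is quoted, [single] ones are literal.
@end example
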
Macros are recognized in preference to the begin-quote string, so if a
prefix of @var{start} can be recognized as part of a potential macro
name, the quoting mechanism is effectively disabled. Unless you use
@code{changeword} (@pxref{Changeword}), this means that @var{start}
should not begin with a letter, digit, or @samp{_} (underscore).
However, even though quoted strings are not recognized, the quote
characters can still be discerned in macro expansion and in trace
output.
4322 define(`echo', `$@@')
4326 changequote(`q', `Q')
4334 changequote(`-', `EOF')
4340 changequote(`1', `2')
4348 Quotes are recognized in preference to argument collection. In
4349 particular, if @var{start} is a single @samp{(}, then argument
4350 collection is effectively disabled. For portability with other
4351 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
4352 @samp{)} as the first character in @var{start}.
4355 define(`echo', `$#:$@@:')
4359 changequote(`(',`)')
4365 changequote(`((', `))')
4373 changequote(`,', `)')
4379 However, if you are not worried about portability, using @samp{(} and
4380 @samp{)} as quoting characters has an interesting property---you can use
4381 it to compute a quoted string containing the expansion of any quoted
4382 text, as long as the expansion results in both balanced quotes and
4383 balanced parentheses. The trick is realizing @code{expand} uses
4384 @samp{$1} unquoted, to trigger its expansion using the normal quoting
4385 characters, but uses extra parentheses to group unquoted commas that
4386 occur in the expansion without consuming whitespace following those
4387 commas. Then @code{_expand} uses @code{changequote} to convert the
4388 extra parentheses back into quoting characters. Note that it takes two
4389 more @code{changequote} invocations to restore the original quotes.
4390 Contrast the behavior on whitespace when using @samp{$*}, via
4391 @code{quote}, to attempt the same task.
4394 changequote(`[', `]')dnl
4395 define([a], [1, (b)])dnl
4397 define([quote], [[$*]])dnl
4398 define([expand], [_$0(($1))])dnl
4400 [changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl
4401 expand([a, a, [a, a], [[a, a]]])
4402 @result{}1, (2), 1, (2), a, a, [a, a]
4403 quote(a, a, [a, a], [[a, a]])
4404 @result{}1,(2),1,(2),a, a,[a, a]
4407 If @var{end} is a prefix of @var{start}, the end-quote will be
4408 recognized in preference to a nested begin-quote. In particular,
4409 changing the quotes to have the same string for @var{start} and
4410 @var{end} disables nesting of quotes. When quote nesting is disabled,
4411 it is impossible to double-quote strings across macro expansions, so
4412 using the same string is not done very often.
4417 changequote(`""', `"')
4429 changequote(`"', `"')
4436 @comment And another stress test, not worth documenting in the manual.
4438 define(`aaaaaaaaaaaaaaaaaaaa', `A')define(`q', `"$@@"')
4440 changequote(`"', `"')
4442 q(q("aaaaaaaaaaaaaaaaaaaa", "a"))
4447 It is an error if the end of file occurs within a quoted string.
4452 @result{}hello world
4455 @error{}m4:stdin:2: ERROR: end of file in string
4460 ifelse(`dangling quote
4462 @error{}m4:stdin:1: ERROR: end of file in string
4466 @section Changing the comment delimiters
4468 @cindex changing comment delimiters
4469 @cindex comment delimiters, changing
4470 @cindex delimiters, changing
4471 The default comment delimiters can be changed with the builtin
4472 macro @code{changecom}:
4474 @deffn Builtin changecom (@ovar{start}, @dvar{end, @key{NL}})
4475 This sets @var{start} as the new begin-comment delimiter and @var{end}
4476 as the new end-comment delimiter. If both arguments are missing, or
4477 @var{start} is void, then comments are disabled. Otherwise, if
4478 @var{end} is missing or void, the default end-comment delimiter of
4479 newline is used. The comment delimiters can be of any length.
4481 The expansion of @code{changecom} is void.
4485 define(`comment', `COMMENT')
4488 @result{}# A normal comment
4489 changecom(`/*', `*/')
4491 # Not a comment anymore
4492 @result{}# Not a COMMENT anymore
4493 But: /* this is a comment now */ while this is not a comment
4494 @result{}But: /* this is a comment now */ while this is not a COMMENT
4497 @cindex comments, copied to output
4498 Note how comments are copied to the output, much as if they were quoted
4499 strings. If you want the text inside a comment expanded, quote the
4500 begin-comment delimiter.
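
For instance, in this small sketch (which assumes the default comment
delimiters, and defines the macro @code{comment} only for the
illustration), the quoted @samp{#} is treated as ordinary text, so the
remainder of the line is scanned for macros:

@example
define(`comment', `COMMENT')
@result{}
# This comment is copied verbatim
@result{}# This comment is copied verbatim
`#' This is not a comment anymore
@result{}# This is not a COMMENT anymore
@end example
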
Calling @code{changecom} without any arguments, or with @var{start} as
the empty string, will effectively disable the commenting mechanism. To
restore the original comment start of @samp{#}, you must explicitly ask
for it. If @var{start} is not empty, then an empty @var{end} will use
the default end-comment delimiter of newline, as otherwise, it would be
impossible to end a comment. However, this is not portable, as some
other @code{m4} implementations preserve the previous non-empty
delimiters instead.
4512 define(`comment', `COMMENT')
4516 # Not a comment anymore
4517 @result{}# Not a COMMENT anymore
4521 @result{}# comment again
4524 The comment strings can safely contain non-@sc{ascii} characters.
4531 changecom(`«', `»')
4537 If no single character is appropriate, @var{start} and @var{end} can be
4538 of any length. Other implementations cap the delimiter length to five
4539 characters, but GNU has no inherent limit.
Comments are recognized in preference to macros. However, this is not
compatible with other implementations, where macros and even quoting
take precedence over comments, so it may change in a future release.
For portability, this means that @var{start} should not begin with a
letter, digit, or @samp{_} (underscore), and that neither the
start-quote nor the start-comment string should be a prefix of the
other.
4566 Comments are recognized in preference to argument collection. In
4567 particular, if @var{start} is a single @samp{(}, then argument
4568 collection is effectively disabled. For portability with other
4569 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
4570 @samp{)} as the first character in @var{start}.
4573 define(`echo', `$#:$*:$@@:')
4583 changecom(`((', `))')
4592 @result{}1:HI,hi)bye:HI,hi)bye:
4596 @result{}3:HI,,HI,HI:HI,,`'hi,HI:
4597 echo(hi,`,`'hi',hi`'changecom(`,,', `hi'))
4598 @result{}3:HI,,`'hi,HI:HI,,`'hi,HI:
4601 It is an error if the end of file occurs within a comment.
4605 changecom(`/*', `*/')
4609 @error{}m4:stdin:2: ERROR: end of file in comment
4613 @section Changing the lexical structure of words
4615 @cindex lexical structure of words
4616 @cindex words, lexical structure of
4617 @cindex syntax, changing
4618 @cindex changing syntax
4619 @cindex regular expressions
The macro @code{changeword} and all associated functionality is
experimental. It is only available if the @option{--enable-changeword}
option was given to @command{configure}, at GNU @code{m4}
installation time. The functionality will go away in the future, to be
replaced by other new features that are more efficient at providing the
same capabilities. @emph{Do not rely on it}. Please report any
comments about it the same way you would report bugs.
4631 A file being processed by @code{m4} is split into quoted strings, words
4632 (potential macro names) and simple tokens (any other single character).
4633 Initially a word is defined by the following regular expression:
4637 [_a-zA-Z][_a-zA-Z0-9]*
4640 Using @code{changeword}, you can change this regular expression:
4642 @deffn {Optional builtin} changeword (@var{regex})
4643 Changes the regular expression for recognizing macro names to be
4644 @var{regex}. If @var{regex} is empty, use
4645 @samp{[_a-zA-Z][_a-zA-Z0-9]*}. @var{regex} must obey the constraint
4646 that every prefix of the desired final pattern is also accepted by the
4647 regular expression. If @var{regex} contains grouping parentheses, the
4648 macro invoked is the portion that matched the first group, rather than
4649 the entire matching string.
4651 The expansion of @code{changeword} is void.
4652 The macro @code{changeword} is recognized only with parameters.
4655 Relaxing the lexical rules of @code{m4} might be useful (for example) if
4656 you wanted to apply translations to a file of numbers:
4659 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4661 changeword(`[_a-zA-Z0-9]+')
Tightening the lexical rules is less useful, because it will generally
make some of the builtins unavailable. You could use it to prevent
accidental calls to builtins, for example:
4672 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4674 define(`_indir', defn(`indir'))
4676 changeword(`_[_a-zA-Z0-9]*')
4679 @result{}esyscmd(foo)
4680 _indir(`esyscmd', `echo hi')
Because @code{m4} constructs its words a character at a time, there
is a restriction on the regular expressions that may be passed to
@code{changeword}: if your regular expression accepts @samp{foo}, it
must also accept @samp{f} and @samp{fo}.
4691 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4697 dnl This example wants to recognize changeword, dnl, and `foo\n'.
4698 dnl First, we check that our regexp will match.
4699 regexp(`changeword', `[cd][a-z]*\|foo[
4703 ', `[cd][a-z]*\|foo[
4706 regexp(`f', `[cd][a-z]*\|foo[
4711 changeword(`[cd][a-z]*\|foo[
4714 dnl Even though `foo\n' matches, we forgot to allow `f'.
4717 changeword(`[cd][a-z]*\|fo*[
4720 dnl Now we can call `foo\n'.
4726 @comment One more test of including newline in a macro name; but this
4727 @comment does not need to be displayed in the manual. This ensures
4728 @comment that line numbering is correct when dnl cuts across include
4729 @comment file boundaries, and when __file__ or __line__ is the last
4730 @comment token in an include file.
4733 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4738 include(`foo') ignored
4740 changeword(`\([_a-zA-Z][_a-zA-Z0-9]*\|bar
4745 include(`foo') ignored
4752 ', defn(`__file__'))
4755 @result{}examples/foo
4757 ', defn(`__line__'))
4766 @code{changeword} has another function. If the regular expression
4767 supplied contains any grouped subexpressions, then text outside
4768 the first of these is discarded before symbol lookup. So:
4771 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4774 `errprint(` skipping: syscmd does not have unix semantics
4776 changecom(`/*', `*/')dnl
4777 define(`foo', `bar')dnl
4778 changeword(`#\([_a-zA-Z0-9]*\)')
4780 #esyscmd(`echo foo \#foo')
4785 @code{m4} now requires a @samp{#} mark at the beginning of every
4786 macro invocation, so one can use @code{m4} to preprocess plain
4787 text without losing various words like @samp{divert}.
4789 In @code{m4}, macro substitution is based on text, while in @TeX{}, it
4790 is based on tokens. @code{changeword} can throw this difference into
4791 relief. For example, here is the same idea represented in @TeX{} and
4792 @code{m4}. First, the @TeX{} version:
4796 \def\a@{\message@{Hello@}@}
4805 Then, the @code{m4} version:
4808 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4810 define(`a', `errprint(`Hello')')dnl
4811 changeword(`@@\([_a-zA-Z0-9]*\)')
4814 @result{}errprint(Hello)
4817 In the @TeX{} example, the first line defines a macro @code{a} to
4818 print the message @samp{Hello}. The second line defines @key{@@} to
4819 be usable instead of @key{\} as an escape character. The third line
4820 defines @key{\} to be a normal printing character, not an escape.
4821 The fourth line invokes the macro @code{a}. So, when @TeX{} is run
4822 on this file, it displays the message @samp{Hello}.
4824 When the @code{m4} example is passed through @code{m4}, it outputs
4825 @samp{errprint(Hello)}. The reason for this is that @TeX{} does
4826 lexical analysis of macro definition when the macro is @emph{defined}.
4827 @code{m4} just stores the text, postponing the lexical analysis until
4828 the macro is @emph{used}.
You should note that using @code{changeword} will slow @code{m4} down
by a factor of about seven, once it is changed to something other
than the default regular expression. You can invoke @code{changeword}
with the empty string to restore the default word definition, and regain
the parsing speed.
4837 @section Saving text until end of input
4839 @cindex saving input
4840 @cindex input, saving
4841 @cindex deferring expansion
4842 @cindex expansion, deferring
It is possible to `save' some text until the end of the normal input has
been seen. The saved text is then read again by @code{m4} once the
normal input has been exhausted. This feature is normally used to
initiate cleanup actions before normal exit, e.g., deleting temporary
files.
4849 To save input text, use the builtin @code{m4wrap}:
4851 @deffn Builtin m4wrap (@var{string}, @dots{})
4852 Stores @var{string} in a safe place, to be reread when end of input is
4853 reached. As a GNU extension, additional arguments are
4854 concatenated with a space to the @var{string}.
4856 The expansion of @code{m4wrap} is void.
4857 The macro @code{m4wrap} is recognized only with parameters.
4861 define(`cleanup', `This is the `cleanup' action.
4866 This is the first and last normal input line.
4867 @result{}This is the first and last normal input line.
4869 @result{}This is the cleanup action.
4872 The saved input is only reread when the end of normal input is seen, and
4873 not if @code{m4exit} is used to exit @code{m4}.
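
The following sketch, using invented text, illustrates the second case:
because the session ends with @code{m4exit} rather than by reaching the
end of input, the text saved by @code{m4wrap} never appears in the
output.

@example
m4wrap(`This text is never reread.
')
@result{}
Normal input.
@result{}Normal input.
m4exit
@end example
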
4875 @comment FIXME: this contradicts POSIX, which requires that "If the
4876 @comment m4wrap macro is used multiple times, the arguments specified
4877 @comment shall be processed in the order in which the m4wrap macros were
4878 @comment processed."
It is safe to call @code{m4wrap} from saved text, but then the order in
which the saved text is reread is undefined. If @code{m4wrap} is not used
recursively, the saved pieces of text are reread in the reverse of the
order in which they were saved (LIFO: last in, first out). However,
this behavior is likely to change in a future release, to match
POSIX, so you should not depend on this order.
It is possible to emulate POSIX behavior even
with older versions of GNU M4 by including the file
@file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4} from the
distribution.
4893 $ @kbd{m4 -I examples}
4894 undivert(`wrapfifo.m4')dnl
4895 @result{}dnl Redefine m4wrap to have FIFO semantics.
4896 @result{}define(`_m4wrap_level', `0')dnl
4897 @result{}define(`m4wrap',
4898 @result{}`ifdef(`m4wrap'_m4wrap_level,
4899 @result{} `define(`m4wrap'_m4wrap_level,
4900 @result{} defn(`m4wrap'_m4wrap_level)`$1')',
4901 @result{} `builtin(`m4wrap', `define(`_m4wrap_level',
4902 @result{} incr(_m4wrap_level))dnl
4903 @result{}m4wrap'_m4wrap_level)dnl
4904 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
4905 include(`wrapfifo.m4')
4907 m4wrap(`a`'m4wrap(`c
4908 ', `d')')m4wrap(`b')
4914 It is likewise possible to emulate LIFO behavior without resorting to
4915 the GNU M4 extension of @code{builtin}, by including the file
4916 @file{m4-@value{VERSION}/@/examples/@/wraplifo.m4} from the
4917 distribution. (Unfortunately, both examples shown here share some
4918 subtle bugs. See if you can find and correct them; or @pxref{Improved
4919 m4wrap, , Answers}).
4923 $ @kbd{m4 -I examples}
4924 undivert(`wraplifo.m4')dnl
4925 @result{}dnl Redefine m4wrap to have LIFO semantics.
4926 @result{}define(`_m4wrap_level', `0')dnl
4927 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
4928 @result{}define(`m4wrap',
4929 @result{}`ifdef(`m4wrap'_m4wrap_level,
4930 @result{} `define(`m4wrap'_m4wrap_level,
4931 @result{} `$1'defn(`m4wrap'_m4wrap_level))',
4932 @result{} `_m4wrap(`define(`_m4wrap_level', incr(_m4wrap_level))dnl
4933 @result{}m4wrap'_m4wrap_level)dnl
4934 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
4935 include(`wraplifo.m4')
4937 m4wrap(`a`'m4wrap(`c
4938 ', `d')')m4wrap(`b')
Here is an example of implementing a factorial function using
@code{m4wrap}:
4948 define(`f', `ifelse(`$1', `0', `Answer: 0!=1
4949 ', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
4950 ', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
4955 @result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
4958 Invocations of @code{m4wrap} at the same recursion level are
4959 concatenated and rescanned as usual:
4965 m4wrap(`a')m4wrap(`a')
4972 however, the transition between recursion levels behaves like an end of
4973 file condition between two input files.
4977 m4wrap(`m4wrap(`)')len(abc')
4980 @error{}m4:stdin:1: ERROR: end of file in argument list
4983 @node File Inclusion
4984 @chapter File inclusion
4986 @cindex file inclusion
4987 @cindex inclusion, of files
4988 @code{m4} allows you to include named files at any point in the input.
4991 * Include:: Including named files
4992 * Search Path:: Searching for include files
4996 @section Including named files
4998 There are two builtin macros in @code{m4} for including files:
5000 @deffn Builtin include (@var{file})
5001 @deffnx Builtin sinclude (@var{file})
5002 Both macros cause the file named @var{file} to be read by
5003 @code{m4}. When the end of the file is reached, input is resumed from
5004 the previous input file.
5006 The expansion of @code{include} and @code{sinclude} is therefore the
5007 contents of @var{file}.
5009 If @var{file} does not exist, is a directory, or cannot otherwise be
5010 read, the expansion is void,
5011 and @code{include} will fail with an error while @code{sinclude} is
5012 silent. The empty string counts as a file that does not exist.
The macros @code{include} and @code{sinclude} are recognized only with
parameters.
5021 @error{}m4:stdin:1: cannot open `none': No such file or directory
5024 @error{}m4:stdin:2: cannot open `': No such file or directory
The rest of this section assumes that @code{m4} is invoked with the
@option{-I} option (@pxref{Preprocessor features, , Invoking m4})
pointing to the @file{m4-@value{VERSION}/@/examples}
directory shipped as part of the GNU @code{m4} package. The
file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution
contains the lines:
5041 $ @kbd{cat examples/incl.m4}
5042 @result{}Include file start
5044 @result{}Include file end
5047 Normally file inclusion is used to insert the contents of a file
5048 into the input stream. The contents of the file will be read by
5049 @code{m4} and macro calls in the file will be expanded:
5053 $ @kbd{m4 -I examples}
5054 define(`foo', `FOO')
5057 @result{}Include file start
5059 @result{}Include file end
The fact that @code{include} and @code{sinclude} expand to the contents
of the file can be used to define macros that operate on entire files.
Here is an example, which defines @samp{bar} to expand to the contents
of @file{incl.m4}:
5070 $ @kbd{m4 -I examples}
5071 define(`bar', include(`incl.m4'))
5073 This is `bar': >>bar<<
5074 @result{}This is bar: >>Include file start
5076 @result{}Include file end
5080 This use of @code{include} is not trivial, though, as files can contain
5081 quotes, commas, and parentheses, which can interfere with the way the
5082 @code{m4} parser works. GNU @code{m4} seamlessly concatenates
5083 the file contents with the next character, even if the included file
5084 ended in the middle of a comment, string, or macro call. These
5085 conditions are only treated as end of file errors if specified as input
5086 files on the command line.
5088 In GNU @code{m4}, an alternative method of reading files is
5089 using @code{undivert} (@pxref{Undivert}) on a named file.
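
A brief sketch of the difference, assuming the same @option{-I} setting
and the @file{incl.m4} file used above: unlike @code{include},
@code{undivert} copies the file without scanning it for macros, so
@samp{foo} is left unexpanded.

@comment examples
@example
$ @kbd{m4 -I examples}
define(`foo', `FOO')
@result{}
undivert(`incl.m4')
@result{}Include file start
@result{}foo
@result{}Include file end
@result{}
@end example
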
5092 @comment Test that include(`file/') detects that file is not a
5093 @comment directory; we can assume that the current directory contains a
5094 @comment Makefile. mingw fails with EINVAL rather than ENOTDIR.
5097 @comment xerr: ignore
5099 include(`Makefile/')
5100 @error{}m4:stdin:1: cannot open `Makefile/': Not a directory
5104 @comment POSIX allows, but doesn't require, failure on reading
5105 @comment directories. But since they aren't text files, it never makes
5106 @comment sense, so we globally forbid it even if fopen doesn't. mingw
5107 @comment fails with EACCES rather than EISDIR.
5110 @comment xerr: ignore
5113 @error{}m4:stdin:1: cannot open `.': Is a directory
5117 @comment Meanwhile, ignore errors with sinclude.
5120 sinclude(`Makefile/')
5128 @section Searching for include files
5130 @cindex search path for included files
5131 @cindex included files, search path for
5132 @cindex GNU extensions
GNU @code{m4} allows included files to be found in directories other
than the current working directory.
5136 @cindex @env{M4PATH}
5137 If the @option{--prepend-include} or @option{-B} command-line option was
5138 provided (@pxref{Preprocessor features, , Invoking m4}), those
5139 directories are searched first, in the reverse of the order in which those
5140 options were listed on the command line. Then @code{m4} looks in the current
5141 working directory. Next come the directories specified with the
5142 @option{--include} or @option{-I} option, in the order found on the
5143 command line. Finally, if the @env{M4PATH} environment variable is set,
5144 it is expected to contain a colon-separated list of directories, which
5145 will be searched in order.
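
As a hedged illustration of the order described above (the directory
and file names are hypothetical, chosen only for this sketch), consider
the following invocation:

@example
$ @kbd{M4PATH=/usr/share/m4lib:/opt/m4lib m4 -B b1 -B b2 -I i1 -I i2 input.m4}
@end example

@noindent
For an @samp{include(`defs.m4')} inside @file{input.m4}, the rules above
imply that the directories are tried in the order @file{b2}, @file{b1},
the current working directory, @file{i1}, @file{i2},
@file{/usr/share/m4lib}, and finally @file{/opt/m4lib}.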
5147 If the automatic search for include-files causes trouble, the @samp{p}
5148 debug flag (@pxref{Debug Levels}) can help isolate the problem.
5151 @chapter Diverting and undiverting output
5153 @cindex deferring output
5154 Diversions are a way of temporarily saving output. The output of
5155 @code{m4} can at any time be diverted to a temporary file, and be
5156 reinserted into the output stream, @dfn{undiverted}, again at a later time.
5159 @cindex @env{TMPDIR}
5160 Numbered diversions are counted from 0 upwards, diversion number 0
5161 being the normal output stream. GNU
5162 @code{m4} tries to keep diversions in memory. However, there is a
5163 limit to the overall memory usable by all diversions taken together
5164 (512K, currently). When this maximum is about to be exceeded,
5165 a temporary file is opened to receive the contents of the biggest
5166 diversion still in memory, freeing this memory for other diversions.
5167 When creating the temporary file, @code{m4} honors the value of the
5168 environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
5169 Thus, the amount of available disk space provides the only real limit on
5170 the number and aggregate size of diversions.
5173 @comment We need to test spilled diversions, but don't need to expose
5174 @comment this highly repetitive test in the manual.
5177 divert(`-1')define(`f', `.')
5178 define(`f', defn(`f')defn(`f'))
5179 define(`f', defn(`f')defn(`f'))
5180 define(`f', defn(`f')defn(`f'))
5181 define(`f', defn(`f')defn(`f'))
5182 define(`f', defn(`f')defn(`f'))
5183 define(`f', defn(`f')defn(`f'))
5184 define(`f', defn(`f')defn(`f'))
5185 define(`f', defn(`f')defn(`f'))
5186 define(`f', defn(`f')defn(`f'))
5187 define(`f', defn(`f')defn(`f'))
5188 define(`f', defn(`f')defn(`f'))
5189 define(`f', defn(`f')defn(`f'))
5190 define(`f', defn(`f')defn(`f'))
5191 define(`f', defn(`f')defn(`f'))
5192 define(`f', defn(`f')defn(`f'))
5193 define(`f', defn(`f')defn(`f'))
5194 define(`f', defn(`f')defn(`f'))
5195 define(`f', defn(`f')defn(`f'))
5196 define(`f', defn(`f')defn(`f'))
5197 define(`f', defn(`f')defn(`f'))
5205 divert(`-1')undivert
5211 @comment Another test of spilled diversions.
5214 divert(`-1')define(`f', `.')
5215 define(`f', defn(`f')defn(`f'))
5216 define(`f', defn(`f')defn(`f'))
5217 define(`f', defn(`f')defn(`f'))
5218 define(`f', defn(`f')defn(`f'))
5219 define(`f', defn(`f')defn(`f'))
5220 define(`f', defn(`f')defn(`f'))
5221 define(`f', defn(`f')defn(`f'))
5222 define(`f', defn(`f')defn(`f'))
5223 define(`f', defn(`f')defn(`f'))
5224 define(`f', defn(`f')defn(`f'))
5225 define(`f', defn(`f')defn(`f'))
5226 define(`f', defn(`f')defn(`f'))
5227 define(`f', defn(`f')defn(`f'))
5228 define(`f', defn(`f')defn(`f'))
5229 define(`f', defn(`f')defn(`f'))
5230 define(`f', defn(`f')defn(`f'))
5231 define(`f', defn(`f')defn(`f'))
5232 define(`f', defn(`f')defn(`f'))
5233 define(`f', defn(`f')defn(`f'))
5234 define(`f', defn(`f')defn(`f'))
5243 @comment Catch regression in 1.4.10 with spilled diversions.
5247 `errprint(` skipping: syscmd does not have unix semantics
5249 changequote(`[', `]')dnl
5250 syscmd([echo 'divert(1)hi
5251 format(%1000000d, 1)' | ']__program__[' | sed -n 1p])dnl
5257 @comment Avoid quadratic copying time when transferring diversions;
5258 @comment test both in-memory and spilled to file.
5262 $ @kbd{m4 -I examples}
5263 include(`forloop2.m4')dnl
5264 divert(`1')format(`%10000s', `')dnl
5265 forloop(`i', `1', `10000',
5266 `divert(incr(i))undivert(i)')dnl
5267 divert(`9001')format(`%1000000s', `')dnl
5268 forloop(`i', `9001', `10000',
5269 `divert(incr(i))undivert(i)')dnl
5270 divert(`-1')undivert
5274 Diversions make it possible to generate output in a different order than
5275 the input was read. In particular, diversions can be used to implement a
5276 topological sort of dependencies. For example, GNU Autoconf makes use of
5277 diversions under the hood to ensure that the expansion of a prerequisite
5278 macro appears in the output prior to the expansion of a dependent macro,
5279 regardless of which order the two macros were invoked in the user's input file.
5283 * Divert:: Diverting output
5284 * Undivert:: Undiverting output
5285 * Divnum:: Diversion numbers
5286 * Cleardivert:: Discarding diverted text
5290 @section Diverting output
5292 @cindex diverting output to files
5293 @cindex output, diverting to files
5294 @cindex files, diverting output to
5295 Output is diverted using @code{divert}:
5297 @deffn Builtin divert (@dvar{number, 0})
5298 The current diversion is changed to @var{number}. If @var{number} is left
5299 out or empty, it is assumed to be zero. If @var{number} cannot be
5300 parsed, the diversion is unchanged.
5302 The expansion of @code{divert} is void.
5305 When all the @code{m4} input has been processed, all existing
5306 diversions are automatically undiverted, in numerical order.
@example
divert(`1')
This text is diverted.
divert
@result{}
This text is not diverted.
@result{}This text is not diverted.
^D
@result{}
@result{}This text is diverted.
@end example
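
Because the automatic undiversion at end of input happens in numerical
order, text can be emitted in a different order than it was written.
The following is a minimal sketch (the words @samp{first} and
@samp{second} are placeholders chosen for illustration); each
@code{dnl} discards the newline that follows a @code{divert} call, so
no blank lines end up in the diversions.

@example
divert(`2')dnl
second
divert(`1')dnl
first
divert`'dnl
^D
@result{}first
@result{}second
@end example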
5320 Several calls of @code{divert} with the same argument do not overwrite
5321 the previous diverted text, but append to it. Diversions are printed
5322 after any wrapped text is expanded.
@example
define(`text', `TEXT')
@result{}
divert(`1')`diverted text.'
divert
@result{}
m4wrap(`Wrapped text precedes ')
@result{}
^D
@result{}Wrapped TEXT precedes diverted text.
@end example
5336 @cindex discarding input
5337 @cindex input, discarding
5338 If output is diverted to a negative diversion, it is simply discarded.
5339 This can be used to suppress unwanted output. A common example of
5340 unwanted output is the trailing newlines after macro definitions. Here
5341 is a common programming idiom in @code{m4} for avoiding them.
@example
divert(`-1')
define(`foo', `Macro `foo'.')
define(`bar', `Macro `bar'.')
divert
@result{}
@end example
5351 @cindex GNU extensions
5352 Traditional implementations only supported ten diversions. But as a
5353 GNU extension, diversion numbers can be as large as positive
5354 integers will allow, rather than treating a multi-digit diversion number
5355 as a request to discard text.
@example
divert(eval(`1<<28'))world
divert(`2')hello
^D
@result{}hello
@result{}world
@end example
5365 Note that @code{divert} is an English word, but also an active macro
5366 without arguments. When processing plain text, the word might appear in
5367 normal text and be unintentionally swallowed as a macro invocation. One
5368 way to avoid this is to use the @option{-P} option to rename all
5369 builtins (@pxref{Operation modes, , Invoking m4}). Another is to write
5370 a wrapper that requires a parameter to be recognized.
@example
We decided to divert the stream for irrigation.
@result{}We decided to  the stream for irrigation.
define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
@result{}
divert(`-1')
Ignored text.
divert(`0')
@result{}
We decided to divert the stream for irrigation.
@result{}We decided to divert the stream for irrigation.
@end example
5386 @section Undiverting output
5388 Diverted text can be undiverted explicitly using the builtin @code{undivert}:
5391 @deffn Builtin undivert (@ovar{diversions@dots{}})
5392 Undiverts the numeric @var{diversions} given by the arguments, in the
5393 order given. If no arguments are supplied, all diversions are
5394 undiverted, in numerical order.
5396 @cindex file inclusion
5397 @cindex inclusion, of files
5398 @cindex GNU extensions
5399 As a GNU extension, @var{diversions} may contain non-numeric
5400 strings, which are treated as the names of files to copy into the output
5401 without expansion. A warning is issued if a file could not be opened.
5403 The expansion of @code{undivert} is void.
@example
divert(`1')
This text is diverted.
divert
@result{}
This text is not diverted.
@result{}This text is not diverted.
undivert(`1')
@result{}
@result{}This text is diverted.
@result{}
@end example
5419 Notice the last two blank lines. One of them comes from the newline
5420 following @code{undivert}, the other from the newline that followed the
5421 @code{divert}! A diversion often starts with a blank line like this.
5423 When diverted text is undiverted, it is @emph{not} reread by @code{m4},
5424 but rather copied directly to the current output, and it is therefore
5425 not an error to undivert into a diversion. Undiverting the empty string
5426 is the same as specifying diversion 0; in either case nothing happens
5427 since the output has already been flushed.
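@example
divert(`1')diverted text
divert
@result{}
undivert()
@result{}
undivert(`0')
@result{}
undivert
@result{}diverted text
@result{}
divert(`1')more
divert(`2')undivert(`1')diverted text`'divert
@result{}
undivert(`1')
@result{}
undivert(`2')
@result{}more
@result{}diverted text
@end example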
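
To see that undiverted text is copied rather than rescanned, consider
this small sketch. The macro name @samp{x} and its definition are
assumptions made only for illustration. The quoted @samp{x} is stored
in diversion 1 as literal text, and it stays literal when it is brought
back.

@example
define(`x', `EXPANDED')
@result{}
divert(`1')`x'
divert
@result{}
undivert(`1')
@result{}x
@result{}
@end example

@noindent
Had the diversion been reread by @code{m4}, the output would have been
@samp{EXPANDED} instead.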
5450 When a diversion has been undiverted, the diverted text is discarded,
5451 and it is not possible to bring back diverted text more than once.
@example
divert(`1')
This text is diverted first.
divert(`0')undivert(`1')dnl
@result{}
@result{}This text is diverted first.
undivert(`1')
@result{}
divert(`1')
This text is also diverted but not appended.
divert(`0')undivert(`1')dnl
@result{}
@result{}This text is also diverted but not appended.
@end example
5468 Attempts to undivert the current diversion are silently ignored. Thus,
5469 when the current diversion is not 0, the current diversion does not get
5470 rearranged among the other diversions.
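@example
divert(`1')one
divert(`2')two
divert(`3')three
divert(`2')undivert`'dnl
divert`'undivert`'dnl
@result{}two
@result{}one
@result{}three
@end example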
5483 @cindex GNU extensions
5484 @cindex file inclusion
5485 @cindex inclusion, of files
5486 GNU @code{m4} allows named files to be undiverted. Given a
5487 non-numeric argument, the contents of the file named will be copied,
5488 uninterpreted, to the current output. This complements the builtin
5489 @code{include} (@pxref{Include}). To illustrate the difference, assume
5490 the file @file{foo} contains:
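@example
$ @kbd{cat foo}
bar
@end example

@noindent
then

@example
define(`bar', `BAR')
@result{}
undivert(`foo')
@result{}bar
@result{}
include(`foo')
@result{}BAR
@result{}
@end example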
5512 If the file is not found (or cannot be read), an error message is
5513 issued, and the expansion is void. It is possible to intermix files
5514 and diversion numbers.
@example
divert(`1')diversion one
divert(`2')undivert(`foo')dnl
divert(`3')diversion three
divert`'dnl
undivert(`1', `2', `foo', `3')dnl
@result{}diversion one
@result{}bar
@result{}bar
@result{}diversion three
@end example
5529 @section Diversion numbers
5531 @cindex diversion numbers
5532 The current diversion is tracked by the builtin @code{divnum}:
5534 @deffn Builtin divnum
5535 Expands to the number of the current diversion.
@example
Initial divnum
@result{}Initial 0
divert(`1')
Diversion one: divnum
divert(`2')
Diversion two: divnum
^D
@result{}
@result{}Diversion one: 1
@result{}
@result{}Diversion two: 2
@end example
5553 @section Discarding diverted text
5555 @cindex discarding diverted text
5556 @cindex diverted text, discarding
5557 Often it is not known, when output is diverted, whether the diverted
5558 text is actually needed. Since all non-empty diversions are brought back
5559 on the main output stream when the end of input is seen, a method of
5560 discarding a diversion is needed. If all diversions should be
5561 discarded, the easiest is to end the input to @code{m4} with
5562 @samp{divert(`-1')} followed by an explicit @samp{undivert}:
@example
divert(`1')
Diversion one: divnum
divert(`2')
Diversion two: divnum
divert(`-1')
undivert
^D
@end example
5575 No output is produced at all.
5577 Clearing selected diversions can be done with the following macro:
5579 @deffn Composite cleardivert (@ovar{diversions@dots{}})
5580 Discard the contents of each of the listed numeric @var{diversions}.
5584 define(`cleardivert',
5585 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
5589 It is called just like @code{undivert}, but the effect is to clear the
5590 diversions, given by the arguments. (This macro has a nasty bug! You
5591 should try to see if you can find it and correct it; or @pxref{Improved
5592 cleardivert, , Answers}).
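
As a hedged usage sketch, the following session repeats the definition
above and then discards one diversion while keeping another. The
diverted words @samp{one} and @samp{two} are placeholders, and the
macro is called with an explicit argument only.

@example
define(`cleardivert',
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')dnl
divert(`1')one
divert(`2')two
divert`'dnl
cleardivert(`1')dnl
^D
@result{}two
@end example

@noindent
Diversion 1 is undiverted into diversion @minus{}1, where it is
discarded, so only diversion 2 remains to be flushed at end of input.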
5595 @chapter Macros for text handling
5597 There are a number of builtins in @code{m4} for manipulating text in
5598 various ways, extracting substrings, searching, substituting, and so on.
5601 * Len:: Calculating length of strings
5602 * Index macro:: Searching for substrings
5603 * Regexp:: Searching for regular expressions
5604 * Substr:: Extracting substrings
5605 * Translit:: Translating characters
5606 * Patsubst:: Substituting text by regular expression
5607 * Format:: Formatting strings (printf-like)
5611 @section Calculating length of strings
5613 @cindex length of strings
5614 @cindex strings, length of
5615 The length of a string can be calculated by @code{len}:
5617 @deffn Builtin len (@var{string})
5618 Expands to the length of @var{string}, as a decimal number.
5620 The macro @code{len} is recognized only with parameters.
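
A brief sketch of its use (the sample string is arbitrary):

@example
len()
@result{}0
len(`abcdef')
@result{}6
@end example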
5631 @section Searching for substrings
5633 @cindex substrings, locating
5634 Searching for substrings is done with @code{index}:
5636 @deffn Builtin index (@var{string}, @var{substring})
5637 Expands to the index of the first occurrence of @var{substring} in
5638 @var{string}. The first character in @var{string} has index 0. If
5639 @var{substring} does not occur in @var{string}, @code{index} expands to @samp{-1}.
5642 The macro @code{index} is recognized only with parameters.
@example
index(`gnus, gnats, and armadillos', `nat')
@result{}7
index(`gnus, gnats, and armadillos', `dag')
@result{}-1
@end example
5652 Omitting @var{substring} evokes a warning, but still produces output;
5653 contrast this with an empty @var{substring}.
@example
index(`abc')
@error{}m4:stdin:1: Warning: too few arguments to builtin `index'
@result{}0
index(`abc', `')
@result{}0
index(`abc', `b')
@result{}1
@end example
5666 @comment Expose a bug in the strstr() algorithm present in glibc
5667 @comment 2.9 through 2.12 and in gnulib up to Sep 2010.
5670 index(`;:11-:12-:12-:12-:12-:12-:12-:12-:12.:12.:12.:12.:12.:12.:12.:12.:12-',
5671 `:12-:12-:12-:12-:12-:12-:12-:12-')
5675 @comment Expose a bug in the gnulib replacement strstr() algorithm
5676 @comment present from Jun 2010 to Feb 2011, including m4 1.4.15.
5679 index(`..wi.d.', `.d.')
5685 @section Searching for regular expressions
5687 @cindex basic regular expressions
5688 @cindex regular expressions
5689 @cindex expressions, regular
5690 @cindex GNU extensions
5691 Searching for regular expressions is done with the builtin @code{regexp}:
5694 @deffn Builtin regexp (@var{string}, @var{regexp}, @ovar{replacement})
5695 Searches for @var{regexp} in @var{string}. The syntax for regular
5696 expressions is the same as in GNU Emacs, which is similar to
5697 BRE, Basic Regular Expressions in POSIX.
@ifnothtml
@xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs
Manual}.
@end ifnothtml
@ifhtml
See
@uref{https://www.gnu.org/@/software/@/emacs/@/manual/@/emacs.html#Regexps,
Syntax of Regular Expressions} in the GNU Emacs Manual.
@end ifhtml
5707 Support for ERE, Extended Regular Expressions is not
5708 available, but will be added in GNU M4 2.0.
5710 If @var{replacement} is omitted, @code{regexp} expands to the index of
5711 the first match of @var{regexp} in @var{string}. If @var{regexp} does
5712 not match anywhere in @var{string}, it expands to -1.
5714 If @var{replacement} is supplied, and there was a match, @code{regexp}
5715 changes the expansion to this argument, with @samp{\@var{n}} substituted
5716 by the text matched by the @var{n}th parenthesized sub-expression of
5717 @var{regexp}, up to nine sub-expressions. The escape @samp{\&} is
5718 replaced by the text of the entire regular expression matched. For
5719 all other characters, @samp{\} treats the next character literally. A
5720 warning is issued if there were fewer sub-expressions than the
5721 @samp{\@var{n}} requested, or if there is a trailing @samp{\}. If there
5722 was no match, @code{regexp} expands to the empty string.
5724 The macro @code{regexp} is recognized only with parameters.
@example
regexp(`GNUs not Unix', `\<[a-z]\w+')
@result{}5
regexp(`GNUs not Unix', `\<Q\w*')
@result{}-1
regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
@result{}*** Unix *** nix ***
regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
@result{}
@end example
5738 Here are some more examples on the handling of backslash:
@example
regexp(`abc', `\(b\)', `\\\10\a')
@result{}\b0a
regexp(`abc', `b', `\1\')
@error{}m4:stdin:2: Warning: sub-expression 1 not present
@error{}m4:stdin:2: Warning: trailing \ ignored in replacement
@result{}
regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
@error{}m4:stdin:3: Warning: sub-expression 4 not present
@error{}m4:stdin:3: Warning: sub-expression 5 not present
@error{}m4:stdin:3: Warning: sub-expression 6 not present
@result{}c
@end example
5754 Omitting @var{regexp} evokes a warning, but still produces output;
5755 contrast this with an empty @var{regexp} argument.
@example
regexp(`abc')
@error{}m4:stdin:1: Warning: too few arguments to builtin `regexp'
@result{}0
regexp(`abc', `')
@result{}0
regexp(`abc', `', `\\def')
@result{}\def
@end example
5768 @section Extracting substrings
5770 @cindex extracting substrings
5771 @cindex substrings, extracting
5772 Substrings are extracted with @code{substr}:
5774 @deffn Builtin substr (@var{string}, @var{from}, @ovar{length})
5775 Expands to the substring of @var{string}, which starts at index
5776 @var{from}, and extends for @var{length} characters, or to the end of
5777 @var{string}, if @var{length} is omitted. The starting index of a string
5778 is always 0. The expansion is empty if there is an error parsing
5779 @var{from} or @var{length}, if @var{from} is beyond the end of
5780 @var{string}, or if @var{length} is negative.
5782 The macro @code{substr} is recognized only with parameters.
@example
substr(`gnus, gnats, and armadillos', `6')
@result{}gnats, and armadillos
substr(`gnus, gnats, and armadillos', `6', `5')
@result{}gnats
@end example
5792 Omitting @var{from} evokes a warning, but still produces output.
@example
substr(`abc')
@error{}m4:stdin:1: Warning: too few arguments to builtin `substr'
@result{}abc
substr(`abc',)
@error{}m4:stdin:2: empty string treated as 0 in builtin `substr'
@result{}abc
@end example
5804 @section Translating characters
5806 @cindex translating characters
5807 @cindex characters, translating
5808 Character translation is done with @code{translit}:
5810 @deffn Builtin translit (@var{string}, @var{chars}, @ovar{replacement})
5811 Expands to @var{string}, with each character that occurs in
5812 @var{chars} translated into the character from @var{replacement} with the same index.
5815 If @var{replacement} is shorter than @var{chars}, the excess characters
5816 of @var{chars} are deleted from the expansion; if @var{chars} is
5817 shorter, the excess characters in @var{replacement} are silently
5818 ignored. If @var{replacement} is omitted, all characters in
5819 @var{string} that are present in @var{chars} are deleted from the
5820 expansion. If a character appears more than once in @var{chars}, only
5821 the first instance is used in making the translation. Only a single
5822 translation pass is made, even if characters in @var{replacement} also
5823 appear in @var{chars}.
5825 As a GNU extension, both @var{chars} and @var{replacement} can
5826 contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
5827 letters) or @samp{0-9} (meaning all digits). To include a dash @samp{-}
5828 in @var{chars} or @var{replacement}, place it first or last in the
5829 entire string, or as the last character of a range. Back-to-back ranges
5830 can share a common endpoint. It is not an error for the last character
5831 in the range to be `larger' than the first. In that case, the range
5832 runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
5833 The expansion of a range is dependent on the underlying encoding of
5834 characters, so using ranges is not always portable between machines.
5836 The macro @code{translit} is recognized only with parameters.
@example
translit(`GNUs not Unix', `A-Z')
@result{}s not nix
translit(`GNUs not Unix', `a-z', `A-Z')
@result{}GNUS NOT UNIX
translit(`GNUs not Unix', `A-Z', `z-a')
@result{}tmfs not fnix
translit(`+,-12345', `+--1-5', `<;>a-c-a')
@result{}<;>abcba
translit(`abcdef', `aabdef', `bcged')
@result{}bgced
@end example
5852 In the @sc{ascii} encoding, the first example deletes all uppercase
5853 letters, the second converts lowercase to uppercase, and the third
5854 `mirrors' all uppercase letters, while converting them to lowercase.
5855 The first two cases are by far the most common, even though they are not
5856 portable to @sc{ebcdic} or other encodings. The fourth example shows a
5857 range ending in @samp{-}, as well as back-to-back ranges. The final
5858 example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
5859 resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
5860 @samp{e} are swapped, and the @samp{f} is discarded.
5863 @comment No need to fight 8-bit characters, as it is difficult to get
5864 @comment rendering right in both info and dvi, and examples like this
5865 @comment do not work correctly with UTF-8 anyway since m4 is byte-oriented.
5868 translit(`«abc~', `~-»')
5872 @comment Stress test short arguments, since they use a different code path.
5875 translit(`abcdeabcde', `a')
5877 translit(`abcdeabcde', `ab')
5879 translit(`abcdeabcde', `a', `f')
5881 translit(`abcdeabcde', `a', `f')
5883 translit(`abcdeabcde', `a', `fg')
5885 translit(`abcdeabcde', `ab', `f')
5887 translit(`abcdeabcde', `ab', `fg')
5889 translit(`abcdeabcde', `ab', `ba')
5891 translit(`abcdeabcde', `e', `f')
5893 translit(`abc', `', `cde')
5895 translit(`', `a', `bc')
5900 Omitting @var{chars} evokes a warning, but still produces output.
@example
translit(`abc')
@error{}m4:stdin:1: Warning: too few arguments to builtin `translit'
@result{}abc
@end example
5909 @section Substituting text by regular expression
5911 @cindex basic regular expressions
5912 @cindex regular expressions
5913 @cindex expressions, regular
5914 @cindex pattern substitution
5915 @cindex substitution by regular expression
5916 @cindex GNU extensions
5917 Global substitution in a string is done by @code{patsubst}:
5919 @deffn Builtin patsubst (@var{string}, @var{regexp}, @ovar{replacement})
5920 Searches @var{string} for matches of @var{regexp}, and substitutes
5921 @var{replacement} for each match. The syntax for regular expressions
5922 is the same as in GNU Emacs (@pxref{Regexp}).
5924 The parts of @var{string} that are not covered by any match of
5925 @var{regexp} are copied to the expansion. Whenever a match is found, the
5926 search proceeds from the end of the match, so a character from
5927 @var{string} will never be substituted twice. If @var{regexp} matches a
5928 string of zero length, the start position for the search is incremented,
5929 to avoid infinite loops.
5931 When a replacement is to be made, @var{replacement} is inserted into
5932 the expansion, with @samp{\@var{n}} substituted by the text matched by
5933 the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
5934 nine sub-expressions. The escape @samp{\&} is replaced by the text of
5935 the entire regular expression matched. For all other characters,
5936 @samp{\} treats the next character literally. A warning is issued if
5937 there were fewer sub-expressions than the @samp{\@var{n}} requested, or
5938 if there is a trailing @samp{\}.
5940 The @var{replacement} argument can be omitted, in which case the text
5941 matched by @var{regexp} is deleted.
5943 The macro @code{patsubst} is recognized only with parameters.
5947 patsubst(`GNUs not Unix', `^', `OBS: ')
5948 @result{}OBS: GNUs not Unix
5949 patsubst(`GNUs not Unix', `\<', `OBS: ')
5950 @result{}OBS: GNUs OBS: not OBS: Unix
5951 patsubst(`GNUs not Unix', `\w*', `(\&)')
5952 @result{}(GNUs)() (not)() (Unix)()
5953 patsubst(`GNUs not Unix', `\w+', `(\&)')
5954 @result{}(GNUs) (not) (Unix)
5955 patsubst(`GNUs not Unix', `[A-Z][a-z]+')
5956 @result{}GN not@w{ }
5957 patsubst(`GNUs not Unix', `not', `NOT\')
5958 @error{}m4:stdin:6: Warning: trailing \ ignored in replacement
5959 @result{}GNUs NOT Unix
5962 Here is a slightly more realistic example, which capitalizes individual
5963 words or whole sentences, by substituting calls of the macros
5964 @code{upcase} and @code{downcase} into the strings.
5966 @deffn Composite upcase (@var{text})
5967 @deffnx Composite downcase (@var{text})
5968 @deffnx Composite capitalize (@var{text})
5969 Expand to @var{text}, but with capitalization changed: @code{upcase}
5970 changes all letters to upper case, @code{downcase} changes all letters
5971 to lower case, and @code{capitalize} changes the first character of each
5972 word to upper case and the remaining characters to lower case.
5975 First, an example of their usage, using implementations distributed in
5976 @file{m4-@value{VERSION}/@/examples/@/capitalize.m4}.
@example
$ @kbd{m4 -I examples}
include(`capitalize.m4')
@result{}
upcase(`GNUs not Unix')
@result{}GNUS NOT UNIX
downcase(`GNUs not Unix')
@result{}gnus not unix
capitalize(`GNUs not Unix')
@result{}Gnus Not Unix
@end example
5991 Now for the implementation. There is a helper macro @code{_capitalize}
5992 which puts only its first word in mixed case. Then @code{capitalize}
5993 merely parses out the words, and replaces them with an invocation of
5994 @code{_capitalize}. (As presented here, the @code{capitalize} macro has
5995 some subtle flaws. You should try to see if you can find and correct
5996 them; or @pxref{Improved capitalize, , Answers}).
6000 $ @kbd{m4 -I examples}
6001 undivert(`capitalize.m4')dnl
6002 @result{}divert(`-1')
6003 @result{}# upcase(text)
6004 @result{}# downcase(text)
6005 @result{}# capitalize(text)
6006 @result{}# change case of text, simple version
6007 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
6008 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
6009 @result{}define(`_capitalize',
6010 @result{} `regexp(`$1', `^\(\w\)\(\w*\)',
6011 @result{} `upcase(`\1')`'downcase(`\2')')')
6012 @result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
6013 @result{}divert`'dnl
6016 While @code{regexp} replaces the whole input with the replacement as
6017 soon as there is a match, @code{patsubst} replaces each
6018 @emph{occurrence} of a match and preserves non-matching pieces:
@example
define(`patreg',
`patsubst($@@)
regexp($@@)')dnl
patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
@result{}bar FOO baz FOO
@result{}FOO
patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
@result{}bab abb 212
@result{}bab
@end example
6032 Omitting @var{regexp} evokes a warning, but still produces output;
6033 contrast this with an empty @var{regexp} argument.
@example
patsubst(`abc')
@error{}m4:stdin:1: Warning: too few arguments to builtin `patsubst'
@result{}abc
patsubst(`abc', `')
@result{}abc
patsubst(`abc', `', `\\-')
@result{}\-a\-b\-c\-
@end example
6046 @section Formatting strings (printf-like)
6048 @cindex formatted output
6049 @cindex output, formatted
6050 @cindex GNU extensions
6051 Formatted output can be made with @code{format}:
6053 @deffn Builtin format (@var{format-string}, @dots{})
6054 Works much like the C function @code{printf}. The first argument
6055 @var{format-string} can contain @samp{%} specifications which are
6056 satisfied by additional arguments, and the expansion of @code{format} is
6057 the formatted string.
6059 The macro @code{format} is recognized only with parameters.
6062 Its use is best described by a few examples:
6064 @comment This test is a bit fragile, if someone tries to port to a
6065 @comment platform without infinity.
@example
define(`foo', `The brown fox jumped over the lazy dog')
@result{}
format(`The string "%s" uses %d characters', foo, len(foo))
@result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
format(`%*.*d', `-1', `-1', `1')
@result{}1
format(`%.0f', `56789.9876')
@result{}56790
len(format(`%-*X', `5000', `1'))
@result{}5000
ifelse(format(`%010F', `infinity'), `       INF', `success',
       format(`%010F', `infinity'), `  INFINITY', `success',
       format(`%010F', `infinity'))
@result{}success
ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success',
       format(`%.1A', `1.999'), `0X2.0P+0', `success',
       format(`%.1A', `1.999'))
@result{}success
format(`%g', `0xa.P+1')
@result{}20
@end example
6089 Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
6090 example shows how @code{format} can be used to produce tabular output.
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
forloop(`i', `1', `10', `format(`%6d squared is %10d
', i, eval(i**2))')
@result{}     1 squared is          1
@result{}     2 squared is          4
@result{}     3 squared is          9
@result{}     4 squared is         16
@result{}     5 squared is         25
@result{}     6 squared is         36
@result{}     7 squared is         49
@result{}     8 squared is         64
@result{}     9 squared is         81
@result{}    10 squared is        100
@result{}
@end example
6112 The builtin @code{format} is modeled after the ANSI C @samp{printf}
6113 function, and supports these @samp{%} specifiers: @samp{c}, @samp{s},
6114 @samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A},
6115 @samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and
6116 @samp{%}; it supports field widths and precisions, and the flags
6117 @samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}. For
6118 integer specifiers, the width modifiers @samp{hh}, @samp{h}, and
6119 @samp{l} are recognized, and for floating point specifiers, the width
6120 modifier @samp{l} is recognized. Items not yet supported include
6121 positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C}
6122 specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L} and @samp{ll}
6123 modifiers, and any platform extensions available in the native
6124 @code{printf}. For more details on the functioning of @code{printf},
6125 see the C Library Manual, or the POSIX specification (for
6126 example, @samp{%a} is supported even on platforms that haven't yet
6127 implemented C99 hexadecimal floating point output natively).
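
As a short sketch of some of the supported flags, widths, and
specifiers listed above (the argument values are arbitrary, and the
@samp{|} characters serve only to make the padding visible):

@example
format(`%05d|%-5s|%+d|%x', `42', `ab', `7', `255')
@result{}00042|ab   |+7|ff
@end example

@noindent
Here @samp{0} pads the number with zeros, @samp{-} left-justifies the
string within its field, and @samp{+} forces a sign.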
6129 Unrecognized specifiers result in a warning. It is anticipated that a
6130 future release of GNU @code{m4} will support more specifiers,
6131 and give better warnings when various problems such as overflow are
6132 encountered. Likewise, escape sequences are not yet recognized.
@example
format(`%p', `0')
@error{}m4:stdin:1: Warning: unrecognized specifier in `%p'
@result{}p
@end example
6141 @comment Expose a crash with a bad format string fixed in 1.4.15.
6142 @comment Unfortunately, 8-bit bytes are hard to check for; but the
6143 @comment exit status is enough to sniff the crash in broken versions.
6145 @comment xerr: ignore
6147 format(`%'format(`%c', `128'))
6153 @chapter Macros for doing arithmetic
6156 @cindex integer arithmetic
6157 Integer arithmetic is included in @code{m4}, with a C-like syntax. As
6158 convenient shorthands, there are builtins for simple increment and
6159 decrement operations.
6162 * Incr:: Decrement and increment operators
6163 * Eval:: Evaluating integer expressions
6167 @section Decrement and increment operators
6169 @cindex decrement operator
6170 @cindex increment operator
6171 Increment and decrement of integers are supported using the builtins
6172 @code{incr} and @code{decr}:
6174 @deffn Builtin incr (@var{number})
6175 @deffnx Builtin decr (@var{number})
6176 Expand to the numerical value of @var{number}, incremented
6177 or decremented, respectively, by one. Except for the empty string, the
6178 expansion is empty if @var{number} could not be parsed.
6180 The macros @code{incr} and @code{decr} are recognized only with parameters.
@example
incr(`4')
@result{}5
decr(`7')
@result{}6
incr()
@error{}m4:stdin:3: empty string treated as 0 in builtin `incr'
@result{}1
decr()
@error{}m4:stdin:4: empty string treated as 0 in builtin `decr'
@result{}-1
@end example
6198 @section Evaluating integer expressions
6200 @cindex integer expression evaluation
6201 @cindex evaluation, of integer expressions
6202 @cindex expressions, evaluation of integer
6203 Integer expressions are evaluated with @code{eval}:
6205 @deffn Builtin eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
6206 Expands to the value of @var{expression}. The expansion is empty
6207 if a problem is encountered while parsing the arguments. If specified,
6208 @var{radix} and @var{width} control the format of the output.
6210 Calculations are done with 32-bit signed numbers. Overflow silently
6211 results in wraparound. A warning is issued if division by zero is
6212 attempted, or if @var{expression} could not be parsed.
6214 Expressions can contain the following operators, listed in order of
6215 decreasing precedence.
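@table @samp
@item ()
Parentheses
@item +  -  ~  !
Unary plus and minus, and bitwise and logical negation
@item **
Exponentiation
@item *  /  %
Multiplication, division, and modulo
@item +  -
Addition and subtraction
@item <<  >>
Shift left or right
@item >  >=  <  <=
Relational operators
@item ==  !=
Equality operators
@item &
Bitwise and
@item ^
Bitwise exclusive-or
@item |
Bitwise or
@item &&
Logical and
@item ||
Logical or
@end table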
6246 The macro @code{eval} is recognized only with parameters.
6249 All binary operators, except exponentiation, are left associative. C
6250 operators that perform variable assignment, such as @samp{+=} or
6251 @samp{--}, are not implemented, since @code{eval} only operates on
6252 constants, not variables. Attempting to use them results in an error.
6253 However, since traditional implementations treated @samp{=} as an
6254 undocumented alias for @samp{==} as opposed to an assignment operator,
6255 this usage is supported as a special case. Be aware that a future
6256 version of GNU M4 may support assignment semantics as an
6257 extension when POSIX mode is not requested, and that using
6258 @samp{=} to check equality is not portable.
6263 @error{}m4:stdin:1: Warning: recommend ==, not =, for equality operator
6266 @error{}m4:stdin:2: invalid operator in eval: ++0
6269 @error{}m4:stdin:3: invalid operator in eval: 0 |= 1
6273 Note that some older @code{m4} implementations use @samp{^} as an
6274 alternate operator for exponentiation, although POSIX
6275 requires the C behavior of bitwise exclusive-or. The precedence of the
6276 negation operators, @samp{~} and @samp{!}, was traditionally lower than
6277 equality. The unary operators could not be used reliably more than once
6278 on the same term without intervening parentheses. The traditional
6279 precedence of the equality operators @samp{==} and @samp{!=} was
6280 identical instead of lower than the relational operators such as
6281 @samp{<}, even through GNU M4 1.4.8. Starting with version
6282 1.4.9, GNU M4 correctly follows POSIX precedence
6283 rules. M4 scripts designed to be portable between releases must be
6284 aware that parentheses may be required to enforce C precedence rules.
6285 Likewise, division by zero, even in the unused branch of a
6286 short-circuiting operator, is not always well-defined in other
6289 Following are some examples where the current version of M4 follows C
6290 precedence rules, but where older versions and some other
6291 implementations of @code{m4} require explicit parentheses to get the
6297 eval(`(1 == 2) > 0')
6307 eval(`+ + - ~ ! ~ 0')
6312 @error{}m4:stdin:9: divide by zero in eval: 0 || 1 / 0
6317 @error{}m4:stdin:11: modulo by zero in eval: 2 && 1 % 0
6321 @cindex GNU extensions
6322 As a GNU extension, the operator @samp{**} performs integral
6323 exponentiation. The operator is right-associative, and if evaluated,
6324 the exponent must be non-negative, and at least one of the arguments
6325 must be non-zero, or a warning is issued.
6330 eval(`(2 ** 3) ** 2')
6338 @error{}m4:stdin:5: divide by zero in eval: 0 ** 0
6340 @error{}m4:stdin:6: negative exponent in eval: 4 ** -2
6344 Within @var{expression} (but not @var{radix} or @var{width}), numbers
6345 without a special prefix are decimal. A simple @samp{0} prefix
6346 introduces an octal number. @samp{0x} introduces a hexadecimal number.
6347 As GNU extensions, @samp{0b} introduces a binary number, and
6348 @samp{0r} introduces a number expressed in any radix between 1 and 36:
6349 the prefix should be immediately followed by the decimal expression of
6350 the radix, a colon, then the digits making the number. For radix 1,
6351 leading zeros are ignored, and all remaining digits must be @samp{1};
6352 for all other radices, the digits are @samp{0}, @samp{1}, @samp{2},
6353 @dots{}. Beyond @samp{9}, the digits are @samp{a}, @samp{b} @dots{} up
6354 to @samp{z}. Lower and upper case letters can be used interchangeably
6355 in number prefixes and as number digits.
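
The prefixes described above can be sketched as follows.  This is only
an illustration of the rules as stated, assuming the GNU extensions are
enabled; the specific values are arbitrary.

@example
eval(`0x1f + 010')
@result{}39
eval(`0b101')
@result{}5
eval(`0r36:z + 0R16:FF')
@result{}290
@end example

Here @samp{0x1f} is 31 and @samp{010} is octal 8; @samp{0r36:z} is 35,
and the upper case spelling @samp{0R16:FF} is 255.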
6357 Parentheses may be used to group subexpressions whenever needed. For the
6358 relational operators, a true relation returns @code{1}, and a false
6359 relation returns @code{0}.
6361 Here are a few examples of use of @code{eval}.
6372 eval(index(`Hello world', `llo') >= 0)
6374 eval(`0r1:0111 + 0b100 + 0r3:12')
6376 define(`square', `eval(`($1) ** 2')')
6380 square(square(`5')` + 1')
6382 define(`foo', `666')
6385 @error{}m4:stdin:11: bad expression in eval: foo / 6
6391 As the last two lines show, @code{eval} does not handle macro
6392 names, even if they expand to a valid expression (or part of a valid
6393 expression). Therefore all macros must be expanded before they are
6394 passed to @code{eval}.
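
A minimal sketch of two ways to make sure macros are expanded before
@code{eval} sees them follows.  The macro names @code{width} and
@code{height} are hypothetical.

@example
define(`width', `8')
@result{}
define(`height', `5')
@result{}
eval(width` * 'height)
@result{}40
eval(width * height)
@result{}40
@end example

In the first call only the operator is quoted, so the macro names are
expanded while the argument is collected.  In the second call nothing is
quoted, which has the same effect here but also exposes every other word
of the expression to macro expansion.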
6396 Some calculations are not portable to other implementations, since they
6397 have undefined semantics in C, but GNU @code{m4} has
6398 well-defined behavior on overflow. When shifting, an out-of-range shift
6399 amount is implicitly brought into the range 0 through 31, as if by
6400 a bit-wise and with @samp{0x1f}.
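
The masking of the shift amount can be illustrated with a small sketch,
assuming the 32-bit behavior described above: a shift by 33 acts like a
shift by 1.

@example
eval(`1 << 33')
@result{}2
eval(`1 << 1')
@result{}2
eval(`-4 >> 33')
@result{}-2
@end example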
6403 define(`max_int', eval(`0x7fffffff'))
6405 define(`min_int', incr(max_int))
6411 ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
6412 @result{}overflow occurred
6414 @result{}-2147483648
6415 eval(`0x80000000 % -1')
6423 If @var{radix} is specified, it specifies the radix to be used in the
6424 expansion. The default radix is 10; this is also the case if
6425 @var{radix} is the empty string. A warning results if the radix is
6426 outside the range of 1 through 36, inclusive. The result of @code{eval}
6427 is always taken to be signed. No radix prefix is output, and for
6428 radices greater than 10, the digits are lower case. The @var{width}
6429 argument specifies the minimum output width, excluding any negative
6430 sign. The result is zero-padded to extend the expansion to the
6431 requested width. A warning results if the width is negative. If
6432 @var{radix} or @var{width} is out of bounds, the expansion of
6433 @code{eval} is empty.
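
As a short sketch of these rules, the following shows hexadecimal and
binary output, zero padding, and an empty @var{radix} falling back to
the default; the values are arbitrary.

@example
eval(`255', `16')
@result{}ff
eval(`255', `16', `4')
@result{}00ff
eval(`-5', `2', `8')
@result{}-00000101
eval(`10', `', `3')
@result{}010
@end example

In the third call, the eight requested digits do not include the
negative sign.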
6442 eval(`666', `6', `10')
6444 eval(`-666', `6', `10')
6445 @result{}-0000003030
6448 `0r1:'eval(`10', `1', `11')
6449 @result{}0r1:01111111111
6453 @error{}m4:stdin:9: radix 37 in builtin `eval' out of range
6456 @error{}m4:stdin:10: negative width to builtin `eval'
6459 @error{}m4:stdin:11: empty string treated as 0 in builtin `eval'
6463 @node Shell commands
6464 @chapter Macros for running shell commands
6466 @cindex UNIX commands, running
6467 @cindex executing shell commands
6468 @cindex running shell commands
6469 @cindex shell commands, running
6470 @cindex commands, running shell
6471 There are a few builtin macros in @code{m4} that allow you to run shell
6472 commands from within @code{m4}.
6474 Note that the definition of a valid shell command is system dependent.
6475 On UNIX systems, this is the typical @command{/bin/sh}. But on other
6476 systems, such as native Windows, the shell understands a different
6477 command syntax.  Some examples in this chapter assume
6478 @command{/bin/sh}, and also demonstrate how to quit early with a known
6479 exit value if this is not the case.
6482 * Platform macros:: Determining the platform
6483 * Syscmd:: Executing simple commands
6484 * Esyscmd:: Reading the output of commands
6485 * Sysval:: Exit status
6486 * Mkstemp:: Making temporary files
6489 @node Platform macros
6490 @section Determining the platform
6492 @cindex platform macros
6493 Sometimes it is desirable for an input file to know which platform
6494 @code{m4} is running on. GNU @code{m4} provides several
6495 macros that are predefined to expand to the empty string; checking for
6496 their existence will confirm platform details.
6498 @deffn {Optional builtin} __gnu__
6499 @deffnx {Optional builtin} __os2__
6500 @deffnx {Optional builtin} os2
6501 @deffnx {Optional builtin} __unix__
6502 @deffnx {Optional builtin} unix
6503 @deffnx {Optional builtin} __windows__
6504 @deffnx {Optional builtin} windows
6505 Each of these macros is conditionally defined as needed to describe the
6506 environment of @code{m4}. If defined, each macro expands to the empty
6507 string. For now, these macros silently ignore all arguments, but in a
6508 future release of M4, they might warn if arguments are present.
6511 When GNU extensions are in effect (that is, when you did not
6512 use the @option{-G} option, @pxref{Limits control, , Invoking m4}),
6513 GNU @code{m4} will define the macro @code{@w{__gnu__}} to
6514 expand to the empty string.
6522 Extensions are ifdef(`__gnu__', `active', `inactive')
6523 @result{}Extensions are active
6526 @comment options: -G
6532 @result{}__gnu__(ignored)
6533 Extensions are ifdef(`__gnu__', `active', `inactive')
6534 @result{}Extensions are inactive
6537 On UNIX systems, GNU @code{m4} will define @code{@w{__unix__}}
6538 by default, or @code{unix} when the @option{-G} option is specified.
6540 On native Windows systems, GNU @code{m4} will define
6541 @code{@w{__windows__}} by default, or @code{windows} when the
6542 @option{-G} option is specified.
6544 On OS/2 systems, GNU @code{m4} will define @code{@w{__os2__}}
6545 by default, or @code{os2} when the @option{-G} option is specified.
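
One way to use these macros is to collapse them into a single name, as
in the following sketch.  The macro @code{platform} is hypothetical, and
the sketch assumes the @option{-G} option is not in use.

@example
define(`platform',
  `ifdef(`__unix__', `unix',
    `ifdef(`__windows__', `windows',
      `ifdef(`__os2__', `os2', `unknown')')')')
@result{}
Running on platform
@end example

The expansion of the last line depends on the system; on a UNIX system
it would contain @samp{unix}.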
6547 If GNU @code{m4} does not provide a platform macro for your system,
6548 please report that as a bug.
6551 define(`provided', `0')
6553 ifdef(`__unix__', `define(`provided', incr(provided))')
6555 ifdef(`__windows__', `define(`provided', incr(provided))')
6557 ifdef(`__os2__', `define(`provided', incr(provided))')
6564 @section Executing simple commands
6566 Any shell command can be executed, using @code{syscmd}:
6568 @deffn Builtin syscmd (@var{shell-command})
6569 Executes @var{shell-command} as a shell command.
6571 The expansion of @code{syscmd} is void, @emph{not} the output from
6572 @var{shell-command}! Output or error messages from @var{shell-command}
6573 are not read by @code{m4}. @xref{Esyscmd}, if you need to process the
6576 Prior to executing the command, @code{m4} flushes its buffers.
6577 The default standard input, output and error of @var{shell-command} are
6578 the same as those of @code{m4}.
6580 By default, the @var{shell-command} will be used as the argument to the
6581 @option{-c} option of the @command{/bin/sh} shell (or the version of
6582 @command{sh} specified by @samp{command -p getconf PATH}, if your system
6583 supports that). If you prefer a different shell, the
6584 @command{configure} script can be given the option
6585 @option{--with-syscmd-shell=@var{location}} to set the location of an
6586 alternative shell at GNU @code{m4} installation; the
6587 alternative shell must still support @option{-c}.
6589 The macro @code{syscmd} is recognized only with parameters.
6593 define(`foo', `FOO')
6600 Note how the output of the command keeps its trailing newline, and is
6601 followed by the newline that appeared after the macro.
6603 The following is an example of @var{shell-command} using the same
6604 standard input as @code{m4}:
6608 $ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
6613 @comment If the user types the example below with stdin being an
6614 @comment interactive terminal, then cat will hang waiting for additional
6615 @comment input after m4 has exited. But the testsuite is using a pipe
6616 @comment for stdin. Hence, we have two versions - the one we feed the
6617 @comment testsuite below, and the one we display to the user above that
6618 @comment more accurately shows what the testsuite is really doing but
6619 @comment which the testsuite cannot parse.
6622 m4wrap(`syscmd(`cat')')
6628 It tells @code{m4} to read all of its input before executing the wrapped
6629 text, then hand a valid (albeit emptied) pipe as standard input for the
6630 @code{cat} subcommand. Therefore, you should be careful when using
6631 standard input (either by specifying no files, or by passing @samp{-} as
6632 a file name on the command line, @pxref{Command line files, , Invoking
6633 m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd}
6634 that consume data from standard input. When standard input is a
6635 seekable file, the subprocess will pick up with the next character not
6636 yet processed by @code{m4}; when it is a pipe or other non-seekable
6637 file, there is no guarantee how much data will already be buffered by
6638 @code{m4} and thus unavailable to the child.
6641 @section Reading the output of commands
6643 @cindex GNU extensions
6644 If you want @code{m4} to read the output of a shell command, use
6647 @deffn Builtin esyscmd (@var{shell-command})
6648 Expands to the standard output of the shell command
6649 @var{shell-command}.
6651 Prior to executing the command, @code{m4} flushes its buffers.
6652 The default standard input and standard error of @var{shell-command} are
6653 the same as those of @code{m4}. The error output of @var{shell-command}
6654 is not a part of the expansion: it will appear along with the error
6655 output of @code{m4}.
6657 By default, the @var{shell-command} will be used as the argument to the
6658 @option{-c} option of the @command{/bin/sh} shell (or the version of
6659 @command{sh} specified by @samp{command -p getconf PATH}, if your system
6660 supports that). If you prefer a different shell, the
6661 @command{configure} script can be given the option
6662 @option{--with-syscmd-shell=@var{location}} to set the location of an
6663 alternative shell at GNU @code{m4} installation; the
6664 alternative shell must still support @option{-c}.
6666 The macro @code{esyscmd} is recognized only with parameters.
6670 define(`foo', `FOO')
6677 Note how the expansion of @code{esyscmd} keeps the trailing newline of
6678 the command, as well as using the newline that appeared after the macro.
6680 Just as with @code{syscmd}, care must be exercised when sharing standard
6681 input between @code{m4} and the child process of @code{esyscmd}.
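
Because the trailing newline of the command is part of the expansion, it
is often convenient to strip it.  The following is a minimal sketch that
assumes a @command{/bin/sh}-style shell providing @command{echo}; the
macro name @code{greeting} is hypothetical.

@example
define(`greeting', translit(esyscmd(`echo hello'), `
'))
@result{}
[greeting]
@result{}[hello]
@end example

Here @code{translit} is called with only two arguments, so it deletes
every newline from the command output.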
6684 @section Exit status
6686 @cindex UNIX commands, exit status from
6687 @cindex exit status from shell commands
6688 @cindex shell commands, exit status from
6689 @cindex commands, exit status from shell
6690 @cindex status of shell commands
6691 To see whether a shell command succeeded, use @code{sysval}:
6693 @deffn Builtin sysval
6694 Expands to the exit status of the last shell command run with
6695 @code{syscmd} or @code{esyscmd}. Expands to 0 if no command has been
6704 ifelse(sysval, `0', `zero', `non-zero')
6716 ifelse(sysval, `0', `zero', `non-zero')
6718 esyscmd(`echo dnl && exit 127')
6728 @code{sysval} results in 127 if there was a problem executing the
6729 command, for example, if the system-imposed argument length is exceeded,
6730 or if there were not enough resources to fork. It is not possible to
6731 distinguish between failed execution and successful execution that had
6732 an exit status of 127, unless there was output from the child process.
6734 On UNIX platforms, where it is possible to detect when command execution
6735 is terminated by a signal, rather than a normal exit, the result is the
6736 signal number shifted left by eight bits.
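
The signal number can be recovered by shifting @code{sysval} right by
eight bits.  The following sketch assumes a UNIX shell in which
@command{kill} is a builtin and signal 9 really terminates the shell,
which is not the case on every platform.

@example
syscmd(`kill -9 $$')
@result{}
eval(sysval` >> 8')
@result{}9
@end example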
6738 @comment This test has difficulties being portable, even on platforms
6739 @comment where syscmd invokes /bin/sh. Kill is not portable with signal
6740 @comment names. According to autoconf, the only portable signal numbers
6741 @comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM). But
6742 @comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
6743 @comment exits normally rather than letting the signal terminate it).
6744 @comment Also, TERM is flaky, as it can also kill the running m4 on
6745 @comment systems where /bin/sh does not create its own process group.
6746 @comment And PIPE is unreliable, since people tend to run with it
6747 @comment ignored, with m4 inheriting that choice. That leaves KILL as
6748 @comment the only signal we can reliably test, but even that is tricky:
6749 @comment on Haiku, 'kill -9' actually causes a process to die with
6750 @comment signal 15 named KILLTHR on that platform.
6752 dnl This test assumes kill is a shell builtin, and that signals are
6755 `errprint(` skipping: syscmd does not have unix semantics
6757 changequote(`[', `]')
6759 syscmd([/bin/sh -c 'kill -9 $$'; st=$?; test $st = 137 || test $st = 265])
6761 ifelse(sysval, [0], , [errprint([ skipping: shell does not send signal 9
6763 syscmd([kill -9 $$])
6771 esyscmd([kill -9 $$])
6778 @section Making temporary files
6780 @cindex temporary file names
6781 @cindex files, names of temporary
6782 Commands specified to @code{syscmd} or @code{esyscmd} might need a
6783 temporary file, for output or for some other purpose. There is a
6784 builtin macro, @code{mkstemp}, for making a temporary file:
6786 @deffn Builtin mkstemp (@var{template})
6787 @deffnx Builtin maketemp (@var{template})
6788 Expands to the quoted name of a new, empty file, made from the string
6789 @var{template}, which should end with the string @samp{XXXXXX}. The six
6790 @samp{X} characters are then replaced with random characters matching
6791 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
6792 name unique. If fewer than six @samp{X} characters are found at the end
6793 of @var{template}, the result will be longer than the template.  The
6794 created file will have access permissions as if by @kbd{chmod =rw,go=},
6795 meaning that the current umask of the @code{m4} process is taken into
6796 account, and at most only the current user can read and write the file.
6798 The traditional behavior, standardized by POSIX, is that
6799 @code{maketemp} merely replaces the trailing @samp{X} with the process
6800 id, without creating a file or quoting the expansion, and without
6801 ensuring that the resulting
6802 string is a unique file name. In part, this means that using the same
6803 @var{template} twice in the same input file will result in the same
6804 expansion. This behavior is a security hole, as it is very easy for
6805 another process to guess the name that will be generated, and thus
6806 interfere with a subsequent use of @code{syscmd} trying to manipulate
6807 that file name. Hence, POSIX has recommended that all new
6808 implementations of @code{m4} provide the secure @code{mkstemp} builtin,
6809 and that users of @code{m4} check for its existence.
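
Once @code{mkstemp} is known to exist, a typical use might look like the
following sketch.  It assumes a UNIX shell and a writable @file{/tmp}
directory; the macro name @code{tmpfile} and the template are
hypothetical.

@example
define(`tmpfile', mkstemp(`/tmp/m4-XXXXXX'))dnl
syscmd(`echo scratch > 'defn(`tmpfile'))dnl
esyscmd(`cat 'defn(`tmpfile'))dnl
@result{}scratch
syscmd(`rm 'defn(`tmpfile'))dnl
@end example

Using @code{defn} rather than a bare @code{tmpfile} keeps the file name
from being rescanned for macro names.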
6811 The expansion is void and an error issued if a temporary file could
6814 The macros @code{mkstemp} and @code{maketemp} are recognized only with
6818 If you try this next example, you will most likely get different output
6819 for the two file names, since the replacement characters are randomly
6825 define(`tmp', `oops')
6827 maketemp(`/tmp/fooXXXXXX')
6828 @result{}/tmp/fooa07346
6829 ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
6830 `define(`mkstemp', defn(`maketemp'))dnl
6831 errprint(`warning: potentially insecure maketemp implementation
6838 @cindex GNU extensions
6839 Unless you use the @option{--traditional} command line option (or
6840 @option{-G}, @pxref{Limits control, , Invoking m4}), the GNU
6841 version of @code{maketemp} is secure. This means that using the same
6842 template in multiple calls will generate multiple files.  However, we
6843 recommend that you use the new @code{mkstemp} macro, introduced in
6844 GNU M4 1.4.8, which is secure even in traditional mode. Also,
6845 as of M4 1.4.11, the secure implementation quotes the resulting file
6846 name, so that you are guaranteed to know what file was created even if
6847 the random file name happens to match an existing macro. Notice that
6848 this example is careful to use @code{defn} to avoid unintended expansion
6853 define(`foo', `errprint(`oops')')
6855 syscmd(`rm -f foo-??????')sysval
6857 define(`file1', maketemp(`foo-XXXXXX'))dnl
6858 ifelse(esyscmd(`echo \` foo-?????? \''), ` foo-?????? ',
6859 `no file', `created')
6861 define(`file2', maketemp(`foo-XX'))dnl
6862 define(`file3', mkstemp(`foo-XXXXXX'))dnl
6863 ifelse(len(defn(`file1')), len(defn(`file2')),
6864 `same length', `different')
6865 @result{}same length
6866 ifelse(defn(`file1'), defn(`file2'), `same', `different file')
6867 @result{}different file
6868 ifelse(defn(`file2'), defn(`file3'), `same', `different file')
6869 @result{}different file
6870 ifelse(defn(`file1'), defn(`file3'), `same', `different file')
6871 @result{}different file
6872 syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3'))
6879 @c Not worth documenting, but make sure we don't leave trailing NUL in
6883 syscmd(`rm -rf foodir')sysval
6885 syscmd(`mkdir foodir')sysval
6887 len(mkstemp(`foodir/fooXXXXX'))
6889 syscmd(`rm -r foodir')sysval
6893 @c Likewise, and ensure that traditional mode leaves the result unquoted
6894 @c without creating a file.
6896 @comment options: -G
6898 syscmd(`rm -f foo-*')sysval
6900 len(maketemp(`foo-XXXXX'))
6901 @error{}m4:stdin:2: recommend using mkstemp instead
6903 define(`abc', `def')
6907 @error{}m4:stdin:4: recommend using mkstemp instead
6908 syscmd(`test -f foo-*')ifelse(sysval, `0', `0', `1')
6914 @chapter Miscellaneous builtin macros
6916 This chapter describes various builtins that do not really belong in
6917 any of the previous chapters.
6920 * Errprint:: Printing error messages
6921 * Location:: Printing current location
6922 * M4exit:: Exiting from @code{m4}
6926 @section Printing error messages
6928 @cindex printing error messages
6929 @cindex error messages, printing
6930 @cindex messages, printing error
6931 @cindex standard error, output to
6932 You can print error messages using @code{errprint}:
6934 @deffn Builtin errprint (@var{message}, @dots{})
6935 Prints @var{message} and the rest of the arguments to standard error,
6936 separated by spaces. Standard error is used, regardless of the
6937 @option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
6939 The expansion of @code{errprint} is void.
6940 The macro @code{errprint} is recognized only with parameters.
6944 errprint(`Invalid arguments to forloop
6946 @error{}Invalid arguments to forloop
6948 errprint(`1')errprint(`2',`3
6954 A trailing newline is @emph{not} printed automatically, so it should be
6955 supplied as part of the argument, as in the example. Unfortunately, the
6956 exact output of @code{errprint} is not very portable to other @code{m4}
6957 implementations: POSIX requires that all arguments be printed,
6958 but some implementations of @code{m4} only print the first.
6959 Furthermore, some BSD implementations always append a newline
6960 for each @code{errprint} call, regardless of whether the last argument
6961 already had one, and POSIX is silent on whether this is
6965 @section Printing current location
6967 @cindex location, input
6968 @cindex input location
6969 To make it possible to specify the location of an error, three
6970 utility builtins exist:
6972 @deffn Builtin __file__
6973 @deffnx Builtin __line__
6974 @deffnx Builtin __program__
6975 Expand to the quoted name of the current input file, the
6976 current input line number in that file, and the quoted name of the
6977 current invocation of @code{m4}.
6981 errprint(__program__:__file__:__line__: `input error
6983 @error{}m4:stdin:1: input error
6987 Line numbers start at 1 for each file. If the file was found due to the
6988 @option{-I} option or @env{M4PATH} environment variable, that is
6989 reflected in the file name. The syncline option (@option{-s},
6990 @pxref{Preprocessor features, , Invoking m4}), and the
6991 @samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debug Levels}),
6992 also use this notion of current file and line. Redefining the three
6993 location macros has no effect on syncline, debug, warning, or error
6996 This example reuses the file @file{incl.m4} mentioned earlier
7001 $ @kbd{m4 -I examples}
7002 define(`foo', ``$0' called at __file__:__line__')
7005 @result{}foo called at stdin:2
7007 @result{}Include file start
7008 @result{}foo called at examples/incl.m4:2
7009 @result{}Include file end
7013 The location of macros invoked during the rescanning of macro expansion
7014 text corresponds to the location in the file where the expansion was
7015 triggered, regardless of how many newline characters the expansion text
7016 contains. As of GNU M4 1.4.8, the location of text wrapped
7017 with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
7018 @code{m4wrap} was invoked. Previous versions, however, behaved as
7019 though wrapped text came from line 0 of the file ``''.
7022 define(`echo', `$@@')
7024 define(`foo', `echo(__line__
7034 foo(errprint(__line__
7052 The @code{@w{__program__}} macro behaves like @samp{$0} in shell
7053 terminology. If you invoke @code{m4} through an absolute path or a link
7054 with a different spelling, rather than by relying on a @env{PATH} search
7055 for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
7056 The intent is that you can use it to produce error messages with the
7057 same formatting that @code{m4} produces internally. It can also be used
7058 within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
7059 @code{m4} that is currently running, rather than whatever version of
7060 @code{m4} happens to be first in @env{PATH}. It was first introduced in
7064 @section Exiting from @code{m4}
7066 @cindex exiting from @code{m4}
7067 @cindex status, setting @code{m4} exit
7068 If you need to exit from @code{m4} before the entire input has been
7069 read, you can use @code{m4exit}:
7071 @deffn Builtin m4exit (@dvar{code, 0})
7072 Causes @code{m4} to exit, with exit status @var{code}. If @var{code} is
7073 left out, the exit status is zero. If @var{code} cannot be parsed, or
7074 is outside the range of 0 to 255, the exit status is one. No further
7075 input is read, and all wrapped and diverted text is discarded.
7079 m4wrap(`This text is lost due to `m4exit'.')
7081 divert(`1') So is this.
7084 m4exit And this is never read.
7087 A common use of this is to abort processing:
7089 @deffn Composite fatal_error (@var{message})
7090 Abort processing with an error message and non-zero status. Prefix
7091 @var{message} with details about where the error occurred, and print the
7092 resulting string to standard error.
7097 define(`fatal_error',
7098 `errprint(__program__:__file__:__line__`: fatal error: $*
7101 fatal_error(`this is a BAD one, buster')
7102 @error{}m4:stdin:4: fatal error: this is a BAD one, buster
7105 After this macro call, @code{m4} will exit with exit status 1. This macro
7106 is only intended for error exits, since the normal exit procedures are
7107 not followed, i.e., diverted text is not undiverted, and saved text
7108 (@pxref{M4wrap}) is not reread. (This macro could be made more robust
7109 to earlier versions of @code{m4}. You should try to see if you can find
7110 weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
7112 Note that it is still possible for the exit status to be different from
7113 what was requested by @code{m4exit}. If @code{m4} detects some other
7114 error, such as a write error on standard output, the exit status will be
7115 non-zero even if @code{m4exit} requested zero.
7117 If standard input is seekable, then the file will be positioned at the
7118 next unread character. If it is a pipe or other non-seekable file,
7119 then there are no guarantees how much data @code{m4} might have read
7120 into buffers, and thus discarded.
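
As a further sketch of exiting with a specific status, the following
hypothetical macro @code{require} stops processing with exit status 2
when a named macro is not defined.  It is an illustration only, not
part of GNU @code{m4}.

@example
define(`require',
`ifdef(`$1', `', `errprint(`missing macro: $1
')m4exit(`2')')')
@result{}
require(`define')
@result{}
require(`no_such_macro')
@error{}missing macro: no_such_macro
@end example

After the second call, @code{m4} exits immediately, so no further output
is produced.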
7123 @chapter Fast loading of frozen state
7125 Some bigger @code{m4} applications may be built over a common base
7126 containing hundreds of definitions and other costly initializations.
7127 Usually, the common base is kept in one or more declarative files,
7128 which files are listed on each @code{m4} invocation prior to the
7129 user's input file, or else each input file uses @code{include}.
7131 Reading the common base of a big application, over and over again, may
7132 be time consuming. GNU @code{m4} offers some machinery to
7133 speed up the start of an application using lengthy common bases.
7136 * Using frozen files:: Using frozen files
7137 * Frozen file format:: Frozen file format
7140 @node Using frozen files
7141 @section Using frozen files
7143 @cindex fast loading of frozen files
7144 @cindex frozen files for fast loading
7145 @cindex initialization, frozen state
7146 @cindex dumping into frozen file
7147 @cindex reloading a frozen file
7148 @cindex GNU extensions
7149 Suppose a user has a library of @code{m4} initializations in
7150 @file{base.m4}, which is then used with multiple input files:
7154 $ @kbd{m4 base.m4 input1.m4}
7155 $ @kbd{m4 base.m4 input2.m4}
7156 $ @kbd{m4 base.m4 input3.m4}
7159 Rather than spending time parsing the fixed contents of @file{base.m4}
7160 every time, the user might instead execute:
7164 $ @kbd{m4 -F base.m4f base.m4}
7168 once, and further execute, as often as needed:
7172 $ @kbd{m4 -R base.m4f input1.m4}
7173 $ @kbd{m4 -R base.m4f input2.m4}
7174 $ @kbd{m4 -R base.m4f input3.m4}
7178 with the varying input. The first call, containing the @option{-F}
7179 option, only reads and executes file @file{base.m4}, defining
7180 various application macros and computing other initializations.
7181 Once the input file @file{base.m4} has been completely processed, GNU
7182 @code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
7183 file which contains a kind of snapshot of the @code{m4} internal state.
7185 Later calls, containing the @option{-R} option, are able to reload
7186 the internal state of @code{m4}, from @file{base.m4f},
7187 @emph{prior} to reading any other input files.  This means that,
7188 instead of starting with a fresh copy of @code{m4}, input will be
7189 read after having effectively recovered the effect of a prior run.
7190 In our example, the effect is the same as if file @file{base.m4} had
7191 been read anew. However, this effect is achieved a lot faster.
7193 Only one frozen file may be created or read in any one @code{m4}
7194 invocation. It is not possible to recover two frozen files at once.
7195 However, frozen files may be updated incrementally, by using
7196 @option{-R} and @option{-F} options simultaneously. For example, if
7197 some care is taken, the command:
7201 $ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
7205 could be broken down in the following sequence, accumulating the same
7210 $ @kbd{m4 -F file1.m4f file1.m4}
7211 $ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
7212 $ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
7213 $ @kbd{m4 -R file3.m4f file4.m4}
7216 Some care is necessary because not every effort has been made for
7217 this to work in all cases. In particular, the trace attribute of
7218 macros is not handled, nor the current setting of @code{changeword}.
7219 Currently, @code{m4wrap} and @code{sysval} also have problems.
7220 Also, interactions for some options of @code{m4}, being used in one call
7221 and not in the next, have not been fully analyzed yet. On the other
7222 hand, you may be confident that stacks of @code{pushdef} definitions
7223 are handled correctly, as well as undefined or renamed builtins, and
7224 changed strings for quotes or comments. And future releases of
7225 GNU M4 will improve on the utility of frozen files.
7228 @c This example is not worth putting in the manual, but caused core
7229 @c dumps in all versions prior to 1.4.11.
7231 @comment options: -F /dev/null
7233 traceon(`undefined')dnl
7236 @c Make sure freezing is successful.
7240 `errprint(` skipping: syscmd does not have unix semantics
7242 changequote(`[', `]')dnl
7243 syscmd([echo 'changequote([,])pushdef([divnum],[hi])dnl' \
7244 | ']__program__[' -F in.m4f \
7245 && echo 'divnum popdef([divnum])divnum' \
7246 | ']__program__[' -R in.m4f \
7247 && rm in.m4f])status sysval
7252 @c Detect inability to freeze.
7253 @c Some systems harden /, and fail with EACCES rather than ENOENT.
7255 @comment options: -F /none/such
7256 @comment xerr: ignore
7259 $ @kbd{m4 -F /none/such}
7261 @error{}m4: cannot open `/none/such': No such file or directory
7265 When an @code{m4} run is to be frozen, the automatic undiversion
7266 which takes place at end of execution is inhibited. Instead, all
7267 positively numbered diversions are saved into the frozen file.
7268 The active diversion number is also transmitted.
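
The handling of diversions can be sketched as follows, assuming a UNIX
shell; the file name @file{state.m4f} is hypothetical.  The first run
leaves text in diversion 1 and produces no output; the second run
resumes with diversion 1 still active, and everything is undiverted when
it exits.

@example
$ @kbd{echo 'divert(1)saved text' | m4 -F state.m4f}
$ @kbd{echo 'more text' | m4 -R state.m4f}
@result{}saved text
@result{}more text
@end example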
A frozen file to be reloaded need not reside in the current directory.
It is looked up the same way as an @code{include} file (@pxref{Search
Path}).
7274 If the frozen file was generated with a newer version of @code{m4}, and
7275 contains directives that an older @code{m4} cannot parse, attempting to
7276 load the frozen file with option @option{-R} will cause @code{m4} to
7277 exit with status 63 to indicate version mismatch.
7279 @node Frozen file format
7280 @section Frozen file format
7282 @cindex frozen file format
7283 @cindex file format, frozen file
7284 Frozen files are sharable across architectures. It is safe to write
7285 a frozen file on one machine and read it on another, given that the
7286 second machine uses the same or newer version of GNU @code{m4}.
It is conventional, but not required, to give a frozen file the suffix
@file{.m4f}.
7290 These are simple (editable) text files, made up of directives,
7291 each starting with a capital letter and ending with a newline
7292 (@key{NL}). Wherever a directive is expected, the character
7293 @samp{#} introduces a comment line; empty lines are also ignored if they
7294 are not part of an embedded string.
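
As a rough sketch of this layout, a small hand-written frozen file
could look like the following. It is illustrative only, not the output
of any particular @code{m4} release, and the macro name @samp{greet} is
made up for the example. The numbers after each directive letter give
the lengths of the strings on the following line. The file declares
format version 1, defines the text macro @samp{greet} as
@samp{hello world}, makes the @code{define} builtin available, and
selects diversion 0 with an empty string; the quote and comment
directives are omitted, so the default delimiters apply.

@example
# Hand-written sketch of a frozen file; `greet' is a hypothetical macro.
V1
T5,11
greethello world
F6,6
definedefine
D0,0

@end example

Because @code{define} is the only builtin named in an @samp{F}
directive here, no other builtin would be reachable after reloading
such a file.
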
In the following descriptions, each @var{len} refers to the length of
the corresponding string @var{str} in the next line of input. Numbers
are always expressed in decimal. There are no escape characters. The
directives are:
7301 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
Uses @var{str1} and @var{str2} as the begin-comment and
end-comment strings. If omitted, then @samp{#} and @key{NL} are the
comment delimiters.
7306 @item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
Selects diversion @var{number}, making it current, then copies
@var{str} into the current diversion. @var{number} may be a negative
7309 number for a non-existing diversion. To merely specify an active
7310 selection, use this command with an empty @var{str}. With 0 as the
7311 diversion @var{number}, @var{str} will be issued on standard output
7312 at reload time. GNU @code{m4} will not produce the @samp{D}
7313 directive with non-zero length for diversion 0, but this can be done
7314 with manual edits. This directive may
7315 appear more than once for the same diversion, in which case the
7316 diversion is the concatenation of the various uses. If omitted, then
7317 diversion 0 is current.
7319 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
7320 Defines, through @code{pushdef}, a definition for @var{str1}
7321 expanding to the function whose builtin name is @var{str2}. If the
7322 builtin does not exist (for example, if the frozen file was produced by
7323 a copy of @code{m4} compiled with changeword support, but the version
7324 of @code{m4} reloading was compiled without it), the reload is silent,
7325 but any subsequent use of the definition of @var{str1} will result in
7326 a warning. This directive may appear more than once for the same name,
7327 and its order, along with @samp{T}, is important. If omitted, you will
7328 have no access to any builtins.
7330 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
Uses @var{str1} and @var{str2} as the begin-quote and end-quote
strings. If omitted, then @samp{`} and @samp{'} are the quote
delimiters.
7335 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
Defines, through @code{pushdef}, a definition for @var{str1}
expanding to the text given by @var{str2}. This directive may appear
more than once for the same name, and its order, along with @samp{F}, is
important.
7341 @item V @var{number} @key{NL}
Confirms the format of the file. @code{m4} @value{VERSION} only creates
and understands frozen files where @var{number} is 1. This directive
must be the first non-comment in the file, and may not appear more than
once.
7349 @chapter Compatibility with other versions of @code{m4}
7351 @cindex compatibility
This chapter describes many of the differences between this
implementation of @code{m4} and other implementations found under
UNIX, such as System V Release 4, Solaris, and BSD flavors.
7355 In particular, it lists the known differences and extensions to
7356 POSIX. However, the list is not necessarily comprehensive.
7358 At the time of this writing, POSIX 2001 (also known as IEEE
7359 Std 1003.1-2001) is the latest standard, although a new version of
7360 POSIX is under development and includes several proposals for
7361 modifying what @code{m4} is required to do. The requirements for
@code{m4} are shared between SUSv3 and POSIX, and
can be viewed at
7364 @uref{https://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}.
7367 * Extensions:: Extensions in GNU M4
7368 * Incompatibilities:: Facilities in System V m4 not in GNU M4
7369 * Other Incompatibilities:: Other incompatibilities
7373 @section Extensions in GNU M4
7375 @cindex GNU extensions
7377 This version of @code{m4} contains a few facilities that do not exist
7378 in System V @code{m4}. These extra facilities are all suppressed by
7379 using the @option{-G} command line option (@pxref{Limits control, ,
7380 Invoking m4}), unless overridden by other command line options.
7384 In the @code{$@var{n}} notation for macro arguments, @var{n} can contain
7385 several digits, while the System V @code{m4} only accepts one digit.
7386 This allows macros in GNU @code{m4} to take any number of
7387 arguments, and not only nine (@pxref{Arguments}).
7389 This means that @code{define(`foo', `$11')} is ambiguous between
7390 implementations. To portably choose between grabbing the first
7391 parameter and appending 1 to the expansion, or grabbing the eleventh
7392 parameter, you can do the following:
7397 dnl First argument, concatenated with 1
7398 define(`_1', `$1')define(`first1', `_1($@@)1')
7400 dnl Eleventh argument, portable
7401 define(`_9', `$9')define(`eleventh', `_9(shift(shift($@@)))')
7403 dnl Eleventh argument, GNU style
7404 define(`Eleventh', `$11')
7406 first1(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
7408 eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
7410 Eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
7415 Also see the @code{argn} macro (@pxref{Shift}).
7418 The @code{divert} (@pxref{Divert}) macro can manage more than 9
7419 diversions. GNU @code{m4} treats all positive numbers as valid
7420 diversions, rather than discarding diversions greater than 9.
7423 Files included with @code{include} and @code{sinclude} are sought in a
7424 user specified search path, if they are not found in the working
7425 directory. The search path is specified by the @option{-I} option and the
7426 @env{M4PATH} environment variable (@pxref{Search Path}).
7429 Arguments to @code{undivert} can be non-numeric, in which case the named
7430 file will be included uninterpreted in the output (@pxref{Undivert}).
7433 Formatted output is supported through the @code{format} builtin, which
7434 is modeled after the C library function @code{printf} (@pxref{Format}).
7437 Searches and text substitution through basic regular expressions are
7438 supported by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
7439 (@pxref{Patsubst}) builtins. Some BSD implementations use
7440 extended regular expressions instead.
7443 The output of shell commands can be read into @code{m4} with
7444 @code{esyscmd} (@pxref{Esyscmd}).
There is indirect access to any builtin macro with @code{builtin}
(@pxref{Builtin}).
7451 Macros can be called indirectly through @code{indir} (@pxref{Indir}).
7454 The name of the program, the current input file, and the current input
7455 line number are accessible through the builtins @code{@w{__program__}},
7456 @code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
7459 The format of the output from @code{dumpdef} and macro tracing can be
7460 controlled with @code{debugmode} (@pxref{Debug Levels}).
7463 The destination of trace and debug output can be controlled with
7464 @code{debugfile} (@pxref{Debug Output}).
7467 The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
7468 creating a new file with a unique name on every invocation, rather than
7469 following the insecure behavior of replacing the trailing @samp{X}
7470 characters with the @code{m4} process id.
7473 POSIX only requires support for the command line options
7474 @option{-s}, @option{-D}, and @option{-U}, so all other options accepted
7475 by GNU M4 are extensions. @xref{Invoking m4}, for a
7476 description of these options.
7478 The debugging and tracing facilities in GNU @code{m4} are much
7479 more extensive than in most other versions of @code{m4}.
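
A few of the extensions listed above can be exercised together in a
short session. This is a minimal sketch that assumes GNU @code{m4}
running in its default mode (without @option{-G}); it uses a diversion
number above 9, the @code{format} builtin, and the indirect call
mechanisms @code{indir} and @code{builtin}.

@example
divert(`10')saved in diversion 10
divert`'dnl
format(`%05d', `42')
@result{}00042
indir(`len', `abc')
@result{}3
builtin(`len', `abcd')
@result{}4
undivert(`10')dnl
@result{}saved in diversion 10
@end example

None of these lines is portable to implementations that provide only
the POSIX feature set.
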
7482 @node Incompatibilities
7483 @section Facilities in System V @code{m4} not in GNU @code{m4}
7485 The version of @code{m4} from System V contains a few facilities that
7486 have not been implemented in GNU @code{m4} yet. Additionally,
7487 POSIX requires some behaviors that GNU @code{m4} has not
7488 implemented yet. Relying on these behaviors is non-portable, as a
7489 future release of GNU @code{m4} may change.
7493 POSIX requires support for multiple arguments to @code{defn},
7494 without any clarification on how @code{defn} behaves when one of the
7495 multiple arguments names a builtin. System V @code{m4} and some other
7496 implementations allow mixing builtins and text macros into a single
7497 macro. GNU @code{m4} only supports joining multiple text
7498 arguments, although a future implementation may lift this restriction to
behave more like System V@. The only portable way to join text macros
with builtins is via helper macros and implicit concatenation of macro
results.
7504 POSIX requires an application to exit with non-zero status if
7505 it wrote an error message to stderr. This has not yet been consistently
7506 implemented for the various builtins that are required to issue an error
7507 (such as @code{eval} (@pxref{Eval}) when an argument cannot be parsed).
7510 Some traditional implementations only allow reading standard input
7511 once, but GNU @code{m4} correctly handles multiple instances
7512 of @samp{-} on the command line.
7515 POSIX requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
7516 (first-in, first-out) order, but GNU @code{m4} currently uses
7517 LIFO order. Furthermore, POSIX states that only the first
7518 argument to @code{m4wrap} is saved for later evaluation, but
7519 GNU @code{m4} saves and processes all arguments, with output
7520 separated by spaces.
7523 POSIX states that builtins that require arguments, but are
7524 called without arguments, have undefined behavior. Traditional
7525 implementations simply behave as though empty strings had been passed.
7526 For example, @code{a`'define`'b} would expand to @code{ab}. But
7527 GNU @code{m4} ignores certain builtins if they have missing
7528 arguments, giving @code{adefineb} for the above example.
7531 Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
by undefining the entire stack of previous definitions, as if doing
7533 @code{undefine(`f')} first. GNU @code{m4} replaces just the top
7534 definition on the stack, as if doing @code{popdef(`f')} followed by
7535 @code{pushdef(`f',`1')}. POSIX allows either behavior.
7538 POSIX 2001 requires @code{syscmd} (@pxref{Syscmd}) to evaluate
7539 command output for macro expansion, but this was a mistake that is
7540 anticipated to be corrected in the next version of POSIX.
7541 GNU @code{m4} follows traditional behavior in @code{syscmd}
7542 where output is not rescanned, and provides the extension @code{esyscmd}
7543 that does scan the output.
7546 At one point, POSIX required @code{changequote(@var{arg})}
7547 (@pxref{Changequote}) to use newline as the close quote, but this was a
7548 bug, and the next version of POSIX is anticipated to state
7549 that using empty strings or just one argument is unspecified.
7550 Meanwhile, the GNU @code{m4} behavior of treating an empty
7551 end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
7552 repeating the start-quote delimiter, and BSD treats it as leaving the
7553 previous end-quote delimiter unchanged. For predictable results, never
7554 call changequote with just one argument, or with empty strings for
7558 At one point, POSIX required @code{changecom(@var{arg},)}
7559 (@pxref{Changecom}) to make it impossible to end a comment, but this is
7560 a bug, and the next version of POSIX is anticipated to state
7561 that using empty strings is unspecified. Meanwhile, the GNU
7562 @code{m4} behavior of treating an empty end-comment delimiter as newline
7563 is not portable, as BSD treats it as leaving the previous end-comment
7564 delimiter unchanged. It is also impossible in BSD implementations to
7565 disable comments, even though that is required by POSIX. For
predictable results, never call changecom with empty strings for
arguments.
7570 Most implementations of @code{m4} give macros a higher precedence than
7571 comments when parsing, meaning that if the start delimiter given to
7572 @code{changecom} (@pxref{Changecom}) starts with a macro name, comments
are effectively disabled. POSIX does not specify what the
precedence is, so the parser in this version of GNU @code{m4}
recognizes comments, then macros, then quoted strings.
7578 Traditional implementations allow argument collection, but not string
7579 and comment processing, to span file boundaries. Thus, if @file{a.m4}
7580 contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
7581 @kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
7582 gives an error message that the end of file was encountered inside a
7583 macro with GNU @code{m4}. On the other hand, traditional
7584 implementations do end of file processing for files included with
7585 @code{include} or @code{sinclude} (@pxref{Include}), while GNU
7586 @code{m4} seamlessly integrates the content of those files. Thus
@code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of
giving an error.
7591 Traditional @code{m4} treats @code{traceon} (@pxref{Trace}) without
7592 arguments as a global variable, independent of named macro tracing.
7593 Also, once a macro is undefined, named tracing of that macro is lost.
7594 On the other hand, when GNU @code{m4} encounters
7595 @code{traceon} without
7596 arguments, it turns tracing on for all existing definitions at the time,
7597 but does not trace future definitions; @code{traceoff} without arguments
7598 turns tracing off for all definitions regardless of whether they were
7599 also traced by name; and tracing by name, such as with @option{-tfoo} at
7600 the command line or @code{traceon(`foo')} in the input, is an attribute
7601 that is preserved even if the macro is currently undefined.
7603 Additionally, while POSIX requires trace output, it makes no
7604 demands on the formatting of that output. Parsing trace output is not
7605 guaranteed to be reliable, even between different releases of
7606 GNU M4; however, the intent is that any future changes in
7607 trace output will only occur under the direction of additional
7608 @code{debugmode} flags (@pxref{Debug Levels}).
7611 POSIX requires @code{eval} (@pxref{Eval}) to treat all
7612 operators with the same precedence as C@. However, earlier versions of
7613 GNU @code{m4} followed the traditional behavior of other
7614 @code{m4} implementations, where bitwise and logical negation (@samp{~}
7615 and @samp{!}) have lower precedence than equality operators; and where
7616 equality operators (@samp{==} and @samp{!=}) had the same precedence as
7617 relational operators (such as @samp{<}). Use explicit parentheses to
7618 ensure proper precedence. As extensions to POSIX,
7619 GNU @code{m4} gives well-defined semantics to operations that
7620 C leaves undefined, such as when overflow occurs, when shifting negative
7621 numbers, or when performing division by zero. POSIX also
7622 requires @samp{=} to cause an error, but many traditional
7623 implementations allowed it as an alias for @samp{==}.
7626 POSIX 2001 requires @code{translit} (@pxref{Translit}) to
7627 treat each character of the second and third arguments literally.
7628 However, it is anticipated that the next version of POSIX will
allow the GNU @code{m4} behavior of treating @samp{-} as a
range operator.
7633 POSIX requires @code{m4} to honor the locale environment
7634 variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
7635 @env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
7636 implemented in GNU @code{m4}.
7639 POSIX states that only unquoted leading newlines and blanks
7640 (that is, space and tab) are ignored when collecting macro arguments.
7641 However, this appears to be a bug in POSIX, since most
7642 traditional implementations also ignore all whitespace (formfeed,
7643 carriage return, and vertical tab). GNU @code{m4} follows
7644 tradition and ignores all leading unquoted whitespace.
7647 @cindex @env{POSIXLY_CORRECT}
7648 A strictly-compliant POSIX client is not allowed to use
7649 command-line arguments not specified by POSIX. However, since
7650 this version of M4 ignores @env{POSIXLY_CORRECT} and enables the option
7651 @code{--gnu} by default (@pxref{Limits control, , Invoking m4}), a
7652 client desiring to be strictly compliant has no way to disable
7653 GNU extensions that conflict with POSIX when
7654 directly invoking the compiled @code{m4}. A future version of
GNU M4 will honor the environment variable @env{POSIXLY_CORRECT},
7656 implicitly enabling @option{--traditional} if it is set, in order to
7657 allow a strictly-compliant client. In the meantime, a client needing
7658 strict POSIX compliance can use the workaround of invoking a
7659 shell script wrapper, where the wrapper then adds @option{--traditional}
7660 to the arguments passed to the compiled @code{m4}.
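
Two of the behaviors described in this list are easy to observe
directly. The following sketch assumes GNU @code{m4} in its default
mode, with @samp{a}, @samp{b}, and @samp{f} not previously defined. The
first two lines show @code{define} replacing only the top of a
@code{pushdef} stack, and the last line shows a builtin that requires
arguments being left alone when called without any.

@example
define(`f', `bottom')pushdef(`f', `top')define(`f', `replaced')f
@result{}replaced
popdef(`f')f
@result{}bottom
a`'define`'b
@result{}adefineb
@end example

According to the descriptions above, a traditional implementation
would instead leave @samp{f} undefined after the @code{popdef}, and
would produce @samp{ab} for the last line.
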
7663 @node Other Incompatibilities
7664 @section Other incompatibilities
7666 There are a few other incompatibilities between this implementation of
7667 @code{m4}, and the System V version.
7671 GNU @code{m4} implements sync lines differently from System V
7672 @code{m4}, when text is being diverted. GNU @code{m4} outputs
7673 the sync lines when the text is being diverted, and System V @code{m4}
7674 when the diverted text is being brought back.
7676 The problem is which lines and file names should be attached to text
7677 that is being, or has been, diverted. System V @code{m4} regards all
7678 the diverted text as being generated by the source line containing the
7679 @code{undivert} call, whereas GNU @code{m4} regards the
7680 diverted text as being generated at the time it is diverted.
7682 The sync line option is used mostly when using @code{m4} as
7683 a front end to a compiler. If a diverted line causes a compiler error,
7684 the error messages should most probably refer to the place where the
7685 diversion was made, and not where it was inserted again.
7687 @comment options: -s
7692 @result{}#line 3 "stdin"
7695 @result{}#line 2 "stdin"
7697 @result{}#line 1 "stdin"
7701 The current @code{m4} implementation has a limitation that the syncline
7702 output at the start of each diversion occurs no matter what, even if the
7703 previous diversion did not end with a newline. This goes contrary to
7704 the claim that synclines appear on a line by themselves, so this
7705 limitation may be corrected in a future version of @code{m4}. In the
7706 meantime, when using @option{-s}, it is wisest to make sure all
7707 diversions end with newline.
GNU @code{m4} makes no attempt at prohibiting self-referential
definitions.
7721 There is nothing inherently wrong with defining @samp{x} to
7722 return @samp{x}. The wrong thing is to expand @samp{x} unquoted,
7723 because that would cause an infinite rescan loop.
7724 In @code{m4}, one might use macros to hold strings, as we do for
7725 variables in other programming languages, further checking them with:
7729 ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
In cases like this one, prohibiting a macro from holding its own name
would be a useless limitation. Of course, this leaves GNU @code{m4}
users more rope with which to hang themselves! Rescanning hangs can be
avoided through careful programming, much as endless loops are avoided
in traditional programming languages.
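
As a minimal sketch of the point above, the following defines @samp{x}
to hold its own name and then inspects the definition through
@code{defn}, never expanding @samp{x} unquoted. The result strings
@samp{matched} and @samp{different} are arbitrary, and deliberately
avoid the bare word @samp{x}, since the result of @code{ifelse} is
rescanned.

@example
define(`x', `x')
@result{}
ifelse(defn(`x'), `x', `matched', `different')
@result{}matched
@end example
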
7741 @chapter Correct version of some examples
Some of the examples in this manual are buggy or not very robust, for
demonstration purposes. Improved versions of these composite macros are
presented here.
7748 * Improved exch:: Solution for @code{exch}
7749 * Improved forloop:: Solution for @code{forloop}
7750 * Improved foreach:: Solution for @code{foreach}
7751 * Improved copy:: Solution for @code{copy}
7752 * Improved m4wrap:: Solution for @code{m4wrap}
7753 * Improved cleardivert:: Solution for @code{cleardivert}
7754 * Improved capitalize:: Solution for @code{capitalize}
7755 * Improved fatal_error:: Solution for @code{fatal_error}
7759 @section Solution for @code{exch}
7761 The @code{exch} macro (@pxref{Arguments}) as presented requires clients
7762 to double quote their arguments. A nicer definition, which lets
7763 clients follow the rule of thumb of one level of quoting per level of
parentheses, involves adding quotes in the definition of @code{exch}, as
follows:
7768 define(`exch', ``$2', `$1'')
7770 define(exch(`expansion text', `macro'))
7773 @result{}expansion text
7776 @node Improved forloop
7777 @section Solution for @code{forloop}
7779 The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
7780 into an infinite loop if given an iterator that is not parsed as a macro
7781 name. It does not do any sanity checking on its numeric bounds, and
7782 only permits decimal numbers for bounds. Here is an improved version,
7783 shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
7784 version also optimizes overhead by calling four macros instead of six
7785 per iteration (excluding those in @var{text}), by not dereferencing the
7786 @var{iterator} in the helper @code{@w{_forloop}}.
7790 $ @kbd{m4 -d -I examples}
7791 undivert(`forloop2.m4')dnl
7792 @result{}divert(`-1')
7793 @result{}# forloop(var, from, to, stmt) - improved version:
7794 @result{}# works even if VAR is not a strict macro name
7795 @result{}# performs sanity check that FROM is larger than TO
7796 @result{}# allows complex numerical expressions in TO and FROM
7797 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
7798 @result{} `pushdef(`$1')_$0(`$1', eval(`$2'),
7799 @result{} eval(`$3'), `$4')popdef(`$1')')')
7800 @result{}define(`_forloop',
7801 @result{} `define(`$1', `$2')$4`'ifelse(`$2', `$3', `',
7802 @result{} `$0(`$1', incr(`$2'), `$3', `$4')')')
7803 @result{}divert`'dnl
7804 include(`forloop2.m4')
7806 forloop(`i', `2', `1', `no iteration occurs')
7808 forloop(`', `1', `2', ` odd iterator name')
7809 @result{} odd iterator name odd iterator name
7810 forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
7811 @result{} 0xa 0xb 0xc
7812 forloop(`i', `a', `b', `non-numeric bounds')
7813 @error{}m4:stdin:6: bad expression in eval (bad input): (a) <= (b)
One other change to notice is that the improved version used @samp{_$0}
rather than @samp{_forloop} to invoke the helper routine. In general,
this is a good practice to follow, because then the set of macros can be
uniformly transformed. The following example shows a transformation
that doubles the current quoting and appends a suffix @samp{2} to each
transformed macro. If @code{forloop} refers to the literal
@samp{_forloop}, then @code{forloop2} invokes @code{_forloop} instead of
the intended @code{_forloop2}, and the mixing of quoting paradigms leads
to an infinite recursion loop in this example.
7827 @comment options: -L9
7831 $ @kbd{m4 -d -L 9 -I examples}
7832 define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
7834 define(`double', `define(`$1'`2',
7835 arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
7837 double(`forloop')double(`_forloop')defn(`forloop2')
7838 @result{}ifelse(eval(``($2) <= ($3)''), ``1'',
7839 @result{} ``pushdef(``$1'')_$0(``$1'', eval(``$2''),
7840 @result{} eval(``$3''), ``$4'')popdef(``$1'')'')
7841 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
7843 changequote(`[', `]')changequote([``], [''])
7845 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
7847 changequote`'include(`forloop.m4')
7849 double(`forloop')double(`_forloop')defn(`forloop2')
7850 @result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'')
7851 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
7853 changequote(`[', `]')changequote([``], [''])
7855 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
7856 @error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
7859 One more optimization is still possible. Instead of repeatedly
7860 assigning a variable then invoking or dereferencing it, it is possible
7861 to pass the current iterator value as a single argument. Coupled with
7862 @code{curry} if other arguments are needed (@pxref{Composition}), or
7863 with helper macros if the argument is needed in more than one place in
7864 the expansion, the output can be generated with three, rather than four,
7865 macros of overhead per iteration. Notice how the file
7866 @file{m4-@value{VERSION}/@/examples/@/forloop3.m4} rearranges the
7867 arguments of the helper @code{_forloop} to take two arguments that are
7868 placed around the current value. By splitting a balanced set of
parentheses across multiple arguments, the helper macro can now be
7870 shared by @code{forloop} and the new @code{forloop_arg}.
7874 $ @kbd{m4 -I examples}
7875 include(`forloop3.m4')
7877 undivert(`forloop3.m4')dnl
7878 @result{}divert(`-1')
7879 @result{}# forloop_arg(from, to, macro) - invoke MACRO(value) for
7880 @result{}# each value between FROM and TO, without define overhead
7881 @result{}define(`forloop_arg', `ifelse(eval(`($1) <= ($2)'), `1',
7882 @result{} `_forloop(`$1', eval(`$2'), `$3(', `)')')')
7883 @result{}# forloop(var, from, to, stmt) - refactored to share code
7884 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
7885 @result{} `pushdef(`$1')_forloop(eval(`$2'), eval(`$3'),
7886 @result{} `define(`$1',', `)$4')popdef(`$1')')')
7887 @result{}define(`_forloop',
7888 @result{} `$3`$1'$4`'ifelse(`$1', `$2', `',
7889 @result{} `$0(incr(`$1'), `$2', `$3', `$4')')')
7890 @result{}divert`'dnl
7891 forloop(`i', `1', `3', ` i')
7893 define(`echo', `$@@')
7895 forloop_arg(`1', `3', ` echo')
7899 forloop_arg(`1', `3', `curry(`pushdef', `a')')
7911 Of course, it is possible to make even more improvements, such as
7912 adding an optional step argument, or allowing iteration through
7913 descending sequences. GNU Autoconf provides some of these
7914 additional bells and whistles in its @code{m4_for} macro.
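
As an illustration of the first of these ideas, here is a minimal
sketch of a loop with a positive step argument, modeled on the
structure of @file{forloop2.m4}. The names @code{forstep} and
@code{@w{_forstep}} are hypothetical; they are not shipped with
GNU M4, and this is not how Autoconf implements @code{m4_for}. The
arguments are the iterator name, the first value, the upper bound, the
step, and the text to expand. The sketch produces no iterations unless
the step is positive and the first value does not exceed the bound.

@example
define(`forstep', `ifelse(eval(`($4) > 0 && ($2) <= ($3)'), `1',
  `pushdef(`$1')_$0(`$1', eval(`$2'), eval(`$3'), eval(`$4'),
    `$5')popdef(`$1')')')
@result{}
define(`_forstep',
  `define(`$1', `$2')$5`'ifelse(eval(`$2 + $4 <= $3'), `1',
    `$0(`$1', eval(`$2 + $4'), `$3', `$4', `$5')')')
@result{}
forstep(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
@end example

The helper stops by testing whether the next value would pass the
bound, rather than testing for equality with the bound, since a step
larger than one may never land exactly on it.
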
7916 @node Improved foreach
7917 @section Solution for @code{foreach}
7919 The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
7920 presented earlier each have flaws. First, we will examine and fix the
7921 quadratic behavior of @code{foreachq}:
7925 $ @kbd{m4 -I examples}
7926 include(`foreachq.m4')
7928 traceon(`shift')debugmode(`aq')
7930 foreachq(`x', ``1', `2', `3', `4'', `x
7933 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7934 @error{}m4trace: -2- shift(`1', `2', `3', `4')
7936 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7937 @error{}m4trace: -3- shift(`2', `3', `4')
7938 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7939 @error{}m4trace: -2- shift(`2', `3', `4')
7941 @error{}m4trace: -5- shift(`1', `2', `3', `4')
7942 @error{}m4trace: -4- shift(`2', `3', `4')
7943 @error{}m4trace: -3- shift(`3', `4')
7944 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7945 @error{}m4trace: -3- shift(`2', `3', `4')
7946 @error{}m4trace: -2- shift(`3', `4')
7948 @error{}m4trace: -6- shift(`1', `2', `3', `4')
7949 @error{}m4trace: -5- shift(`2', `3', `4')
7950 @error{}m4trace: -4- shift(`3', `4')
7951 @error{}m4trace: -3- shift(`4')
7954 @cindex quadratic behavior, avoiding
7955 @cindex avoiding quadratic behavior
7956 Each successive iteration was adding more quoted @code{shift}
7957 invocations, and the entire list contents were passing through every
7958 iteration. In general, when recursing, it is a good idea to make the
7959 recursion use fewer arguments, rather than adding additional quoted
7960 uses of @code{shift}. By doing so, @code{m4} uses less memory, invokes
7961 fewer macros, is less likely to run into machine limits, and most
7962 importantly, performs faster. The fixed version of @code{foreachq} can
7963 be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
7967 $ @kbd{m4 -I examples}
7968 include(`foreachq2.m4')
7970 undivert(`foreachq2.m4')dnl
7971 @result{}include(`quote.m4')dnl
7972 @result{}divert(`-1')
7973 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
7974 @result{}# quoted list, improved version
7975 @result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')')
7976 @result{}define(`_arg1q', ``$1'')
7977 @result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
7978 @result{}define(`_foreachq', `ifelse(`$2', `', `',
7979 @result{} `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
7980 @result{}divert`'dnl
7981 traceon(`shift')debugmode(`aq')
7983 foreachq(`x', ``1', `2', `3', `4'', `x
7986 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7988 @error{}m4trace: -3- shift(`2', `3', `4')
7990 @error{}m4trace: -3- shift(`3', `4')
7994 Note that the fixed version calls unquoted helper macros in
7995 @code{@w{_foreachq}} to trim elements immediately; those helper macros
7996 in turn must re-supply the layer of quotes lost in the macro invocation.
7997 Contrast the use of @code{@w{_arg1q}}, which quotes the first list
7998 element, with @code{@w{_arg1}} of the earlier implementation that
7999 returned the first list element directly. Additionally, by calling the
8000 helper method immediately, the @samp{defn(`@var{iterator}')} no longer
8001 contains unexpanded macros.
8003 The astute m4 programmer might notice that the solution above still uses
8004 more memory and macro invocations, and thus more time, than strictly
8005 necessary. Note that @samp{$2}, which contains an arbitrarily long
8006 quoted list, is expanded and rescanned three times per iteration of
8007 @code{_foreachq}. Furthermore, every iteration of the algorithm
8008 effectively unboxes then reboxes the list, which costs a couple of macro
8009 invocations. It is possible to rewrite the algorithm for a bit more
8010 speed by swapping the order of the arguments to @code{_foreachq} in
8011 order to operate on an unboxed list in the first place, and by using the
8012 fixed-length @samp{$#} instead of an arbitrary length list as the key to
8013 end recursion. The result is an overhead of six macro invocations per
8014 loop (excluding any macros in @var{text}), instead of eight. This
8015 alternative approach is available as
@file{m4-@value{VERSION}/@/examples/@/foreachq3.m4}:
8020 $ @kbd{m4 -I examples}
8021 include(`foreachq3.m4')
8023 undivert(`foreachq3.m4')dnl
8024 @result{}divert(`-1')
8025 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
8026 @result{}# quoted list, alternate improved version
8027 @result{}define(`foreachq', `ifelse(`$2', `', `',
8028 @result{} `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
8029 @result{}define(`_foreachq', `ifelse(`$#', `3', `',
8030 @result{} `define(`$1', `$4')$2`'$0(`$1', `$2',
8031 @result{} shift(shift(shift($@@))))')')
8032 @result{}divert`'dnl
8033 traceon(`shift')debugmode(`aq')
8035 foreachq(`x', ``1', `2', `3', `4'', `x
8038 @error{}m4trace: -4- shift(`x', `x
8039 @error{}', `', `1', `2', `3', `4')
8040 @error{}m4trace: -3- shift(`x
8041 @error{}', `', `1', `2', `3', `4')
8042 @error{}m4trace: -2- shift(`', `1', `2', `3', `4')
8044 @error{}m4trace: -4- shift(`x', `x
8045 @error{}', `1', `2', `3', `4')
8046 @error{}m4trace: -3- shift(`x
8047 @error{}', `1', `2', `3', `4')
8048 @error{}m4trace: -2- shift(`1', `2', `3', `4')
8050 @error{}m4trace: -4- shift(`x', `x
8051 @error{}', `2', `3', `4')
8052 @error{}m4trace: -3- shift(`x
8053 @error{}', `2', `3', `4')
8054 @error{}m4trace: -2- shift(`2', `3', `4')
8056 @error{}m4trace: -4- shift(`x', `x
8057 @error{}', `3', `4')
8058 @error{}m4trace: -3- shift(`x
8059 @error{}', `3', `4')
8060 @error{}m4trace: -2- shift(`3', `4')
8063 In the current version of M4, every instance of @samp{$@@} is rescanned
8064 as it is encountered. Thus, the @file{foreachq3.m4} alternative uses
8065 much less memory than @file{foreachq2.m4}, and executes as much as 10%
8066 faster, since each iteration encounters fewer @samp{$@@}. However, the
8067 implementation of rescanning every byte in @samp{$@@} is quadratic in
8068 the number of bytes scanned (for example, making the broken version in
8069 @file{foreachq.m4} cubic, rather than quadratic, in behavior). A future
8070 release of M4 will improve the underlying implementation by reusing
8071 results of previous scans, so that both styles of @code{foreachq} can
8072 become linear in the number of bytes scanned. Notice how the
8073 implementation injects an empty argument prior to expanding @samp{$2}
8074 within @code{foreachq}; the helper macro @code{_foreachq} then ignores
8075 the third argument altogether, and ends recursion when there are three
8076 arguments left because there was nothing left to pass through
8077 @code{shift}. Thus, each iteration only needs one @code{ifelse}, rather
8078 than the two conditionals used in the version from @file{foreachq2.m4}.
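
The counting trick can be seen in isolation in the following minimal
sketch. The macros @code{each} and @code{_each} are hypothetical names
used only for illustration; they are not shipped in the examples
directory. The wrapper injects an empty first argument, and the helper
stops when @samp{$#} shows that only one argument remains:

@example
define(`each', `ifelse(`$1', `', `', `_$0(`', $1)')')dnl
define(`_each', `ifelse(`$#', `1', `', `<$2>$0(shift($@@))')')dnl
each(``a', `b', `c'')
@result{}<a><b><c>
@end example

The end-of-recursion test compares only the short string produced by
@samp{$#}, never the remaining list itself.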
8080 @cindex nine arguments, more than
8081 @cindex more than nine arguments
8082 @cindex arguments, more than nine
8083 So far, all of the implementations of @code{foreachq} presented have
8084 been quadratic with M4 1.4.x. But @code{forloop} is linear, because
8085 each iteration parses a constant number of arguments. So, it is
8086 possible to design a variant that uses @code{forloop} to do the
8087 iteration, then uses @samp{$@@} only once at the end, giving a linear
8088 result even with older M4 implementations. This implementation relies
8089 on the GNU extension that @samp{$10} expands to the tenth
8090 argument rather than the first argument concatenated with @samp{0}. The
8091 trick is to define an intermediate macro that repeats the text
8092 @code{define(`$1', `$@var{n}')$2`'}, with @samp{n} set to successive
8093 integers corresponding to each argument. The helper macro
8094 @code{_foreachq_} is needed in order to generate the literal sequences
8095 such as @samp{$1} into the intermediate macro, rather than expanding
8096 them as the arguments of @code{_foreachq}. With this approach, no
8097 @code{shift} calls are even needed! Even though there are seven macros
8098 of overhead per iteration instead of six in @file{foreachq3.m4}, the
8099 linear scaling is apparent at relatively small list sizes. However,
8100 this approach will need adjustment when a future version of M4 follows
8101 POSIX by no longer treating @samp{$10} as the tenth argument;
8102 the anticipation is that @samp{$@{10@}} can be used instead, although
8103 that alternative syntax is not yet supported.
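
Two details of this approach can be tried on their own. The sketch
below assumes GNU M4 1.4.x; @code{tenth} and @code{mk} are hypothetical
macros used only for illustration. The first shows @samp{$10} naming
the tenth argument, and the second shows how @samp{$$1} in a macro body
produces a literal @samp{$} followed by the expansion of @samp{$1}, in
the manner of @code{_foreachq_}:

@example
define(`tenth', `$10')dnl
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
define(`mk', ``<$$1>'')dnl
mk(`3')
@result{}<$3>
@end example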
8107 $ @kbd{m4 -I examples}
8108 include(`foreachq4.m4')
8110 undivert(`foreachq4.m4')dnl
8111 @result{}include(`forloop2.m4')dnl
8112 @result{}divert(`-1')
8113 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
8114 @result{}# quoted list, version based on forloop
8115 @result{}define(`foreachq',
8116 @result{}`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
8117 @result{}define(`_foreachq',
8118 @result{}`pushdef(`$1', forloop(`$1', `3', `$#',
8119 @result{} `$0_(`1', `2', indir(`$1'))')`popdef(
8120 @result{} `$1')')indir(`$1', $@@)')
8121 @result{}define(`_foreachq_',
8122 @result{}``define(`$$1', `$$3')$$2`''')
8123 @result{}divert`'dnl
8124 traceon(`shift')debugmode(`aq')
8126 foreachq(`x', ``1', `2', `3', `4'', `x
8134 For yet another approach, the improved version of @code{foreach},
8135 available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
8136 overquotes the arguments to @code{@w{_foreach}} to begin with, using
8137 @code{dquote_elt}. Then @code{@w{_foreach}} can just use
8138 @code{@w{_arg1}} to remove the extra layer of quoting that was added up
8143 $ @kbd{m4 -I examples}
8144 include(`foreach2.m4')
8146 undivert(`foreach2.m4')dnl
8147 @result{}include(`quote.m4')dnl
8148 @result{}divert(`-1')
8149 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
8150 @result{}# parenthesized list, improved version
8151 @result{}define(`foreach', `pushdef(`$1')_$0(`$1',
8152 @result{} (dquote(dquote_elt$2)), `$3')popdef(`$1')')
8153 @result{}define(`_arg1', `$1')
8154 @result{}define(`_foreach', `ifelse(`$2', `(`')', `',
8155 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
8156 @result{}divert`'dnl
8157 traceon(`shift')debugmode(`aq')
8159 foreach(`x', `(`1', `2', `3', `4')', `x
8161 @error{}m4trace: -4- shift(`1', `2', `3', `4')
8162 @error{}m4trace: -4- shift(`2', `3', `4')
8163 @error{}m4trace: -4- shift(`3', `4')
8165 @error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
8167 @error{}m4trace: -3- shift(``2'', ``3'', ``4'')
8169 @error{}m4trace: -3- shift(``3'', ``4'')
8171 @error{}m4trace: -3- shift(``4'')
8174 It is likewise possible to write a variant of @code{foreach} that
8175 performs in linear time on M4 1.4.x; the easiest method is probably
8176 writing a version of @code{foreach} that unboxes its list, then invokes
8177 @code{_foreachq} as previously defined in @file{foreachq4.m4}.
8179 In summary, recursion over list elements is trickier than it appeared at
8180 first glance, but provides a powerful idiom within @code{m4} processing.
8181 As a final demonstration, both list styles are now able to handle
8182 several scenarios that would wreak havoc on one or both of the original
8183 implementations. This points out one other difference between the
8184 list styles. @code{foreach} evaluates unquoted list elements only once,
8185 in preparation for calling @code{@w{_foreach}}, and similarly for
8186 @code{foreachq} as provided by @file{foreachq3.m4} or
8187 @file{foreachq4.m4}. But
8188 @code{foreachq}, as provided by @file{foreachq2.m4},
8189 evaluates unquoted list elements twice while visiting the first list
8190 element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}. When
8191 deciding which list style to use, one must take into account whether
8192 repeating the side effects of unquoted list elements will have any
8193 detrimental effects.
8197 $ @kbd{m4 -I examples}
8198 include(`foreach2.m4')
8200 include(`foreachq2.m4')
8203 foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
8205 dnl 1-element list of empty element
8206 foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
8208 dnl 2-element list of empty elements
8209 foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
8210 @result{}<><> / <><>
8211 dnl 1-element list of a comma
8212 foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
8214 dnl 2-element list of unbalanced parentheses
8215 foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
8216 @result{}<(><)> / <(><)>
8217 define(`ab', `oops')dnl using defn(`iterator')
8218 foreach(`x', `(`a', `b')', `defn(`x')') /dnl
8219 foreachq(`x', ``a', `b'', `defn(`x')')
8221 define(`active', `ACT, IVE')
8225 dnl list of unquoted macros; expansion occurs before recursion
8226 foreach(`x', `(active, active)', `<x>
8228 @error{}m4trace: -4- active -> `ACT, IVE'
8229 @error{}m4trace: -4- active -> `ACT, IVE'
8234 foreachq(`x', `active, active', `<x>
8236 @error{}m4trace: -3- active -> `ACT, IVE'
8237 @error{}m4trace: -3- active -> `ACT, IVE'
8239 @error{}m4trace: -3- active -> `ACT, IVE'
8240 @error{}m4trace: -3- active -> `ACT, IVE'
8244 dnl list of quoted macros; expansion occurs during recursion
8245 foreach(`x', `(`active', `active')', `<x>
8247 @error{}m4trace: -1- active -> `ACT, IVE'
8249 @error{}m4trace: -1- active -> `ACT, IVE'
8251 foreachq(`x', ``active', `active'', `<x>
8253 @error{}m4trace: -1- active -> `ACT, IVE'
8255 @error{}m4trace: -1- active -> `ACT, IVE'
8257 dnl list of double-quoted macro names; no expansion
8258 foreach(`x', `(``active'', ``active'')', `<x>
8262 foreachq(`x', ```active'', ``active''', `<x>
8269 @comment Not worth putting in the manual, but make sure that foreach
8270 @comment implementations behave, and that final implementation is
8273 @comment boxed recursion
8276 @comment options: -Dlimit=10 -Dverbose
8278 $ @kbd {m4 -I examples -Dlimit=10 -Dverbose}
8279 include(`loop.m4')dnl
8280 @result{} 1 2 3 4 5 6 7 8 9 10
8283 @comment unboxed recursion
8286 @comment options: -Dlimit=10 -Dverbose -Dalt
8288 $ @kbd {m4 -I examples -Dlimit=10 -Dverbose -Dalt}
8289 include(`loop.m4')dnl
8290 @result{} 1 2 3 4 5 6 7 8 9 10
8293 @comment foreach via forloop recursion
8296 @comment options: -Dlimit=10 -Dverbose -Dalt=4
8298 $ @kbd {m4 -I examples -Dlimit=10 -Dverbose -Dalt=4}
8299 include(`loop.m4')dnl
8300 @result{} 1 2 3 4 5 6 7 8 9 10
8304 @comment options: -Dlimit=2500 -Dalt=4
8306 $ @kbd {m4 -I examples -Dlimit=2500 -Dalt=4}
8307 include(`loop.m4')dnl
8311 @comment options: -Dlimit=10000 -Dalt=4
8313 $ @kbd {m4 -I examples -Dlimit=10000 -Dalt=4}
8314 define(`foo', `divert`'len(popdef(`_foreachq')_foreachq($@@))')dnl
8315 define(`debug', `pushdef(`_foreachq', defn(`foo'))')
8317 include(`loop.m4')dnl
8324 @section Solution for @code{copy}
8326 The macro @code{copy} presented above
8327 is unable to handle builtin tokens with M4 1.4.x, because it tries to
8328 pass the builtin token through the macro @code{curry}, where it is
8329 silently flattened to an empty string (@pxref{Composition}). Rather
8330 than using the problematic @code{curry} to work around the limitation
8331 that @code{stack_foreach} expects to invoke a macro that takes exactly
8332 one argument, we can write a new macro that lets us form the exact
8333 two-argument @code{pushdef} call sequence needed, so that we are no
8334 longer passing a builtin token through a text macro.
8336 @deffn Composite stack_foreach_sep (@var{macro}, @var{pre}, @var{post}, @
8338 @deffnx Composite stack_foreach_sep_lifo (@var{macro}, @var{pre}, @
8339 @var{post}, @var{sep})
8340 For each of the @code{pushdef} definitions associated with @var{macro},
8341 expand the sequence @samp{@var{pre}`'definition`'@var{post}}.
8342 Additionally, expand @var{sep} between definitions.
8343 @code{stack_foreach_sep} visits the oldest definition first, while
8344 @code{stack_foreach_sep_lifo} visits the current definition first. The
8345 expansion may dereference @var{macro}, but should not modify it. There
8346 are a few special macros, such as @code{defn}, which cannot be used as
8347 the @var{macro} parameter.
8350 Note that @code{stack_foreach(`@var{macro}', `@var{action}')} is
8351 equivalent to @code{stack_foreach_sep(`@var{macro}', `@var{action}(',
8352 `)')}. By supplying explicit parentheses, split among the @var{pre} and
8353 @var{post} arguments to @code{stack_foreach_sep}, it is now possible to
8354 construct macro calls with more than one argument, without passing
8355 builtin tokens through a macro call. It is likewise possible to
8356 directly reference the stack definitions without a macro call, by
8357 leaving @var{pre} and @var{post} empty. Thus, in addition to fixing
8358 @code{copy} on builtin tokens, it also executes with fewer macro
8361 The new macro also adds a separator that is only output after the first
8362 iteration of the helper @code{_stack_reverse_sep}, implemented by
8363 prepending the original @var{sep} to @var{pre} and omitting a @var{sep}
8364 argument in subsequent iterations. Note that the empty string that
8365 separates @var{sep} from @var{pre} is provided as part of the fourth
8366 argument when originally calling @code{_stack_reverse_sep}, and not by
8367 writing @code{$4`'$3} as the third argument in the recursive call; while
8368 the other approach would give the same output, it does so at the expense
8369 of increasing the argument size on each iteration of
8370 @code{_stack_reverse_sep}, which results in quadratic instead of linear
8371 execution time. The improved stack walking macros are available in
8372 @file{m4-@value{VERSION}/@/examples/@/stack_sep.m4}:
8376 $ @kbd{m4 -I examples}
8377 include(`stack_sep.m4')
8379 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
8381 `stack_foreach_sep(`$1', `pushdef(`$2',', `)')')')dnl
8382 pushdef(`a', `1')pushdef(`a', defn(`divnum'))
8392 pushdef(`c', `1')pushdef(`c', `2')
8394 stack_foreach_sep_lifo(`c', `', `', `, ')
8396 undivert(`stack_sep.m4')dnl
8397 @result{}divert(`-1')
8398 @result{}# stack_foreach_sep(macro, pre, post, sep)
8399 @result{}# Invoke PRE`'defn`'POST with a single argument of each definition
8400 @result{}# from the definition stack of MACRO, starting with the oldest, and
8401 @result{}# separated by SEP between definitions.
8402 @result{}define(`stack_foreach_sep',
8403 @result{}`_stack_reverse_sep(`$1', `tmp-$1')'dnl
8404 @result{}`_stack_reverse_sep(`tmp-$1', `$1', `$2`'defn(`$1')$3', `$4`'')')
8405 @result{}# stack_foreach_sep_lifo(macro, pre, post, sep)
8406 @result{}# Like stack_foreach_sep, but starting with the newest definition.
8407 @result{}define(`stack_foreach_sep_lifo',
8408 @result{}`_stack_reverse_sep(`$1', `tmp-$1', `$2`'defn(`$1')$3', `$4`'')'dnl
8409 @result{}`_stack_reverse_sep(`tmp-$1', `$1')')
8410 @result{}define(`_stack_reverse_sep',
8411 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0(
8412 @result{} `$1', `$2', `$4$3')')')
8413 @result{}divert`'dnl
8417 @comment Not worth putting in the manual, but make sure that
8418 @comment stack_foreach_sep has linear performance.
8422 $ @kbd {m4 -I examples}
8423 include(`forloop3.m4')include(`stack_sep.m4')dnl
8424 forloop(`i', `1', `10000', `pushdef(`s', i)')
8426 define(`colon', `:')define(`dash', `-')
8428 len(stack_foreach_sep(`s', `dash', `', `colon'))
8433 @node Improved m4wrap
8434 @section Solution for @code{m4wrap}
8436 The replacement @code{m4wrap} versions presented above, designed to
8437 guarantee FIFO or LIFO order regardless of the underlying M4
8438 implementation, share a bug when dealing with wrapped text that looks
8439 like parameter expansion. Note how the invocation of
8440 @code{m4wrap@var{n}} interprets these parameters, while using the
8441 builtin preserves them for their intended use.
8445 $ @kbd{m4 -I examples}
8446 include(`wraplifo.m4')
8448 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
8451 builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
8455 @result{}bar:-a-a,b-2-
8456 @result{}m4wrap0:---0-
8459 Additionally, the computation of @code{_m4wrap_level} and creation of
8460 multiple @code{m4wrap@var{n}} placeholders in the original examples is
8461 more expensive in time and memory than strictly necessary. Notice how
8462 the improved version grabs the wrapped text via @code{defn} to avoid
8463 parameter expansion, then undefines @code{_m4wrap_text}, before
8464 stripping a level of quotes with @code{_arg1} to expand the text. That
8465 way, each level of wrapping reuses the single placeholder, which starts
8466 each nesting level in an undefined state.
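
The effect of @code{defn} on parameter references can be checked with a
small sketch; the macro name @code{saved} is hypothetical. Invoking the
macro substitutes its arguments, while @code{defn} returns the quoted
definition with @samp{$1} intact:

@example
define(`saved', `<$1>')dnl
saved(`x')
@result{}<x>
defn(`saved')
@result{}<$1>
@end example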
8468 Finally, it is worth emulating the GNU M4 extension of saving
8469 all arguments to @code{m4wrap}, separated by a space, rather than saving
8470 just the first argument. This is done with the @code{joinall} macro
8471 documented previously (@pxref{Shift}). The improved LIFO example is
8472 shipped as @file{m4-@value{VERSION}/@/examples/@/wraplifo2.m4}, and can
8473 easily be converted to a FIFO solution by swapping the adjacent
8474 invocations of @code{joinall} and @code{defn}.
8478 $ @kbd{m4 -I examples}
8479 include(`wraplifo2.m4')
8481 undivert(`wraplifo2.m4')dnl
8482 @result{}dnl Redefine m4wrap to have LIFO semantics, improved example.
8483 @result{}include(`join.m4')dnl
8484 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
8485 @result{}define(`_arg1', `$1')dnl
8486 @result{}define(`m4wrap',
8487 @result{}`ifdef(`_$0_text',
8488 @result{} `define(`_$0_text', joinall(` ', $@@)defn(`_$0_text'))',
8489 @result{} `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
8490 @result{}define(`_$0_text', joinall(` ', $@@))')')dnl
8491 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
8495 m4wrap(`nested', `', `$@@
8500 @result{}foo:-a-a,b-2-
8504 @node Improved cleardivert
8505 @section Solution for @code{cleardivert}
8507 The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
8508 called without arguments to clear all pending diversions. That is
8509 because using @code{undivert} with an empty string for an argument is different
8510 from using it with no arguments at all. Compare the earlier definition
8511 with one that takes the number of arguments into account:
8514 define(`cleardivert',
8515 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
8525 define(`cleardivert',
8526 `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
8527 `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
8538 @node Improved capitalize
8539 @section Solution for @code{capitalize}
8541 The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
8542 not allow clients to follow the quoting rule of thumb. Consider the
8543 three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
8544 difference between calling @code{capitalize} with the expansion of a
8545 macro, expanding the result of a case change, and changing the case of a
8546 double-quoted string:
8550 $ @kbd{m4 -I examples}
8551 include(`capitalize.m4')dnl
8552 define(`active', `act1, ive')dnl
8553 define(`Active', `Act2, Ive')dnl
8554 define(`ACTIVE', `ACT3, IVE')dnl
8565 downcase(``ACTIVE'')
8569 capitalize(`active')
8571 capitalize(``active'')
8572 @result{}_capitalize(`active')
8577 capitalize(`active')
8581 First, when @code{capitalize} is called with more than one argument, it
8582 throws away later arguments, whereas @code{upcase} and
8583 @code{downcase} use @samp{$*} to collect them all. The fix is simple:
8584 use @samp{$*} consistently.
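
The difference between @samp{$1} and @samp{$*} is easy to see in a
sketch with two hypothetical macros, @code{first_only} and
@code{all_args}, which are not part of the shipped examples:

@example
define(`first_only', `[$1]')dnl
define(`all_args', `[$*]')dnl
first_only(`a', `b')
@result{}[a]
all_args(`a', `b')
@result{}[a,b]
@end example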
8586 Next, with single-quoting, @code{capitalize} outputs a single character,
8587 a set of quotes, then the rest of the characters, making it impossible
8588 to invoke @code{Active} after the fact, and allowing the alternate macro
8589 @code{A} to interfere. Here, the solution is to use additional quoting
8590 in the helper macros, then pass the final over-quoted output string
8591 through @code{_arg1} to remove the extra quoting and finally invoke the
8592 concatenated portions as a single string.
8594 Finally, when passed a double-quoted string, the nested macro
8595 @code{_capitalize} is never invoked because it ended up nested inside
8596 quotes. This one is the toughest to fix. In short, we have no idea how
8597 many levels of quotes are in effect on the substring being altered by
8598 @code{patsubst}. If the replacement string cannot be expressed entirely
8599 in terms of literal text and backslash substitutions, then we need a
8600 mechanism to guarantee that the helper macros are invoked outside of
8601 quotes. In other words, this sounds like a job for @code{changequote}
8602 (@pxref{Changequote}). By changing the active quoting characters, we
8603 can guarantee that replacement text injected by @code{patsubst} always
8604 occurs in the middle of a string that has exactly one level of
8605 over-quoting using alternate quotes; so the replacement text closes the
8606 quoted string, invokes the helper macros, then reopens the quoted
8607 string. In turn, that means the replacement text has unbalanced quotes,
8608 necessitating another round of @code{changequote}.
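
As a minimal sketch of the underlying mechanism (the macro name
@code{q} is hypothetical), once the quote strings are changed, the
default quote characters become ordinary text until they are restored:

@example
changequote(`<<[', `]>>')dnl
define(<<[q]>>, <<[`quoted']>>)dnl
q
@result{}`quoted'
changequote(<<[`]>>, <<[']>>)dnl
@end example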
8610 In the fixed version below (also shipped as
8611 @file{m4-@value{VERSION}/@/examples/@/capitalize2.m4}), @code{capitalize}
8612 uses the alternate quotes of @samp{<<[} and @samp{]>>} (the longer
8613 strings are chosen so as to be less likely to appear in the text being
8614 converted). The helpers @code{_to_alt} and @code{_from_alt} merely
8615 reduce the number of characters required to perform a
8616 @code{changequote}, since the definition changes twice. The outermost
8617 pair means that @code{patsubst} and @code{_capitalize_alt} are invoked
8618 with alternate quoting; the innermost pair is used so that the third
8619 argument to @code{patsubst} can contain an unbalanced
8620 @samp{]>>}/@samp{<<[} pair. Note that @code{upcase} and @code{downcase}
8621 must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since
8622 they contain nested quotes but are invoked with the alternate quoting
8627 $ @kbd{m4 -I examples}
8628 include(`capitalize2.m4')dnl
8629 define(`active', `act1, ive')dnl
8630 define(`Active', `Act2, Ive')dnl
8631 define(`ACTIVE', `ACT3, IVE')dnl
8632 define(`A', `OOPS')dnl
8633 capitalize(active; `active'; ``active''; ```actIVE''')
8634 @result{}Act1,Ive; Act2, Ive; Active; `Active'
8635 undivert(`capitalize2.m4')dnl
8636 @result{}divert(`-1')
8637 @result{}# upcase(text)
8638 @result{}# downcase(text)
8639 @result{}# capitalize(text)
8640 @result{}# change case of text, improved version
8641 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
8642 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
8643 @result{}define(`_arg1', `$1')
8644 @result{}define(`_to_alt', `changequote(`<<[', `]>>')')
8645 @result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
8646 @result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
8647 @result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
8648 @result{}define(`_capitalize_alt',
8649 @result{} `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
8650 @result{} <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
8651 @result{}define(`capitalize',
8652 @result{} `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
8653 @result{} _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
8654 @result{}divert`'dnl
8657 @node Improved fatal_error
8658 @section Solution for @code{fatal_error}
8660 The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
8661 of GNU M4 earlier than 1.4.8, where invoking
8662 @code{@w{__file__}} (@pxref{Location}) inside @code{m4wrap} would result
8663 in an empty string, and @code{@w{__line__}} resulted in @samp{0} even
8664 though all files start at line 1. Furthermore, versions earlier than
8665 1.4.6 did not support the @code{@w{__program__}} macro. If you want
8666 @code{fatal_error} to work across the entire 1.4.x release series, a
8667 better implementation would be:
8671 define(`fatal_error',
8672 `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
8673 `:ifelse(__line__, `0', `',
8674 `__file__:__line__:')` fatal error: $*
8677 m4wrap(`divnum(`demo of internal message')
8678 fatal_error(`inside wrapped text')')
8681 @error{}m4:stdin:6: Warning: excess arguments to builtin `divnum' ignored
8683 @error{}m4:stdin:6: fatal error: inside wrapped text
8686 @c ========================================================== Appendices
8688 @node Copying This Package
8689 @appendix How to make copies of the overall M4 package
8690 @cindex License, code
8692 This appendix covers the license for copying the source code of the
8693 overall M4 package. This manual is under a different set of
8694 restrictions, covered later (@pxref{Copying This Manual}).
8697 * GNU General Public License:: License for copying the M4 package
8700 @node GNU General Public License
8701 @appendixsec License for copying the M4 package
8702 @cindex GPL, GNU General Public License
8703 @cindex GNU General Public License
8704 @cindex General Public License (GPL), GNU
8705 @include gpl-3.0.texi
8707 @node Copying This Manual
8708 @appendix How to make copies of this manual
8709 @cindex License, manual
8711 This appendix covers the license for copying this manual. Note that
8712 some of the longer examples in this manual are also distributed in the
8713 directory @file{m4-@value{VERSION}/@/examples/}, where a more
8714 permissive license is in effect when copying just the examples.
8717 * GNU Free Documentation License:: License for copying this manual
8720 @node GNU Free Documentation License
8721 @appendixsec License for copying this manual
8722 @cindex FDL, GNU Free Documentation License
8723 @cindex GNU Free Documentation License
8724 @cindex Free Documentation License (FDL), GNU
8725 @include fdl-1.3.texi
8728 @appendix Indices of concepts and macros
8731 * Macro index:: Index for all @code{m4} macros
8732 * Concept index:: Index for many concepts
8736 @appendixsec Index for all @code{m4} macros
8738 This index covers all @code{m4} builtins, as well as several useful
8739 composite macros. References are exclusively to the places where a
8740 macro is introduced the first time.
8745 @appendixsec Index for many concepts
8754 @c ispell-local-dictionary: "american"
8755 @c indent-tabs-mode: nil
8756 @c whitespace-check-buffer-indent: nil