\input texinfo @c -*- texinfo -*-
|
|
@comment ========================================================
|
|
@comment %**start of header
|
|
@setfilename m4.info
|
|
@include version.texi
|
|
@settitle GNU M4 @value{VERSION} macro processor
|
|
@documentencoding UTF-8
|
|
@set txicodequoteundirected
|
|
@set txicodequotebacktick
|
|
@set txidefnamenospace
|
|
@setchapternewpage odd
|
|
@finalout
|
|
|
|
@c @tabchar{}
|
|
@c ----------
|
|
@c The testsuite expects literal tab output in some examples, but
|
|
@c literal tabs in texinfo lead to formatting issues.
|
|
@macro tabchar
|
|
@ @c
|
|
|
|
@end macro
|
|
|
|
@c @ovar{ARG}
|
|
@c -------------------
|
|
@c The ARG is an optional argument. To be used for macro arguments in
|
|
@c their documentation (@defmac).
|
|
@macro ovar{varname}
|
|
@r{[}@var{\varname\}@r{]}
|
|
@end macro
|
|
|
|
@c @dvar{ARG, DEFAULT}
|
|
@c -------------------
|
|
@c The ARG is an optional argument, defaulting to DEFAULT. To be used
|
|
@c for macro arguments in their documentation (@defmac).
|
|
@macro dvar{varname, default}
|
|
@r{[}@var{\varname\} = @samp{\default\}@r{]}
|
|
@end macro
|
|
|
|
@comment %**end of header
|
|
@comment ========================================================
|
|
|
|
@copying
|
|
|
|
This manual (@value{UPDATED}) is for GNU M4 (version
|
|
@value{VERSION}), a package containing an implementation of the m4 macro
|
|
language.
|
|
|
|
Copyright @copyright{} 1989--1994, 2004--2014, 2016--2017, 2020--2026
|
|
Free Software Foundation, Inc.
|
|
|
|
@quotation
|
|
Permission is granted to copy, distribute and/or modify this document
|
|
under the terms of the GNU Free Documentation License,
|
|
Version 1.3 or any later version published by the Free Software
|
|
Foundation; with no Invariant Sections, no Front-Cover Texts, and no
|
|
Back-Cover Texts. A copy of the license is included in the section
|
|
entitled ``GNU Free Documentation License.''
|
|
@end quotation
|
|
@end copying
|
|
|
|
@dircategory Text creation and manipulation
|
|
@direntry
|
|
* M4: (m4). A powerful macro processor.
|
|
@end direntry
|
|
|
|
@titlepage
|
|
@title GNU M4, version @value{VERSION}
|
|
@subtitle A powerful macro processor
|
|
@subtitle Edition @value{EDITION}, @value{UPDATED}
|
|
@author by Ren@'e Seindal, Fran@,{c}ois Pinard,
|
|
@author Gary V. Vaughan, and Eric Blake
|
|
@author (@email{bug-m4@@gnu.org})
|
|
|
|
@page
|
|
@vskip 0pt plus 1filll
|
|
@insertcopying
|
|
@end titlepage
|
|
|
|
@contents
|
|
|
|
@ifnottex
|
|
@node Top
|
|
@top GNU M4
|
|
@insertcopying
|
|
@end ifnottex
|
|
|
|
GNU @code{m4} is an implementation of the traditional UNIX macro
|
|
processor. It is mostly SVR4 compatible, although it has some
|
|
extensions (for example, handling more than 9 positional parameters
|
|
to macros). @code{m4} also has builtin functions for including
|
|
files, running shell commands, doing arithmetic, etc. Autoconf needs
|
|
GNU @code{m4} for generating @file{configure} scripts, but not for
|
|
running them.
|
|
|
|
GNU @code{m4} was originally written by Ren@'e Seindal, with
|
|
subsequent changes by Fran@,{c}ois Pinard and other volunteers
|
|
on the Internet. All names and email addresses can be found in the
|
|
files @file{m4-@value{VERSION}/@/AUTHORS} and
|
|
@file{m4-@value{VERSION}/@/THANKS} from the GNU M4
|
|
distribution.
|
|
|
|
This is release @value{VERSION}. It is now considered stable: future
|
|
releases in the 1.4.x series are only meant to fix bugs, increase speed,
|
|
or improve documentation. However@dots{}
|
|
|
|
An experimental feature, which would improve @code{m4} usefulness,
|
|
allows for changing the syntax for what is a @dfn{word} in @code{m4}.
|
|
You should use:
|
|
@comment ignore
|
|
@example
|
|
./configure --enable-changeword
|
|
@end example
|
|
@noindent
|
|
if you want this feature compiled in. The current implementation
|
|
slows down @code{m4} considerably and is hardly acceptable. In the
|
|
future, @code{m4} 2.0 will come with a different set of new features
|
|
that provide similar capabilities, but without the inefficiencies, so
|
|
changeword will go away and @emph{you should not count on it}.
|
|
|
|
@menu
|
|
* Preliminaries:: Introduction and preliminaries
|
|
* Invoking m4:: Invoking @code{m4}
|
|
* Syntax:: Lexical and syntactic conventions
|
|
|
|
* Macros:: How to invoke macros
|
|
* Definitions:: How to define new macros
|
|
* Conditionals:: Conditionals, loops, and recursion
|
|
|
|
* Debugging:: How to debug macros and input
|
|
|
|
* Input Control:: Input control
|
|
* File Inclusion:: File inclusion
|
|
* Diversions:: Diverting and undiverting output
|
|
|
|
* Text handling:: Macros for text handling
|
|
* Arithmetic:: Macros for doing arithmetic
|
|
* Shell commands:: Macros for running shell commands
|
|
* Miscellaneous:: Miscellaneous builtin macros
|
|
* Frozen files:: Fast loading of frozen state
|
|
|
|
* Compatibility:: Compatibility with other versions of @code{m4}
|
|
* Answers:: Correct version of some examples
|
|
|
|
* Copying This Package:: How to make copies of the overall M4 package
|
|
* Copying This Manual:: How to make copies of this manual
|
|
* Indices:: Indices of concepts and macros
|
|
|
|
@detailmenu
|
|
--- The Detailed Node Listing ---
|
|
|
|
Introduction and preliminaries
|
|
|
|
* Intro:: Introduction to @code{m4}
|
|
* History:: Historical references
|
|
* Bugs:: Problems and bugs
|
|
* Manual:: Using this manual
|
|
|
|
Invoking @code{m4}
|
|
|
|
* Operation modes:: Command line options for operation modes
|
|
* Preprocessor features:: Command line options for preprocessor features
|
|
* Limits control:: Command line options for limits control
|
|
* Frozen state:: Command line options for frozen state
|
|
* Debugging options:: Command line options for debugging
|
|
* Command line files:: Specifying input files on the command line
|
|
|
|
Lexical and syntactic conventions
|
|
|
|
* Names:: Macro names
|
|
* Quoted strings:: Quoting input to @code{m4}
|
|
* Comments:: Comments in @code{m4} input
|
|
* Other tokens:: Other kinds of input tokens
|
|
* Input processing:: How @code{m4} copies input to output
|
|
|
|
How to invoke macros
|
|
|
|
* Invocation:: Macro invocation
|
|
* Inhibiting Invocation:: Preventing macro invocation
|
|
* Macro Arguments:: Macro arguments
|
|
* Quoting Arguments:: On Quoting Arguments to macros
|
|
* Macro expansion:: Expanding macros
|
|
|
|
How to define new macros
|
|
|
|
* Define:: Defining a new macro
|
|
* Arguments:: Arguments to macros
|
|
* Pseudo Arguments:: Special arguments to macros
|
|
* Undefine:: Deleting a macro
|
|
* Defn:: Renaming macros
|
|
* Pushdef:: Temporarily redefining macros
|
|
|
|
* Indir:: Indirect call of macros
|
|
* Builtin:: Indirect call of builtins
|
|
|
|
Conditionals, loops, and recursion
|
|
|
|
* Ifdef:: Testing if a macro is defined
|
|
* Ifelse:: If-else construct, or multibranch
|
|
* Shift:: Recursion in @code{m4}
|
|
* Forloop:: Iteration by counting
|
|
* Foreach:: Iteration by list contents
|
|
* Stacks:: Working with definition stacks
|
|
* Composition:: Building macros with macros
|
|
|
|
How to debug macros and input
|
|
|
|
* Dumpdef:: Displaying macro definitions
|
|
* Trace:: Tracing macro calls
|
|
* Debug Levels:: Controlling debugging output
|
|
* Debug Output:: Saving debugging output
|
|
|
|
Input control
|
|
|
|
* Dnl:: Deleting whitespace in input
|
|
* Changequote:: Changing the quote characters
|
|
* Changecom:: Changing the comment delimiters
|
|
* Changeword:: Changing the lexical structure of words
|
|
* M4wrap:: Saving text until end of input
|
|
|
|
File inclusion
|
|
|
|
* Include:: Including named files
|
|
* Search Path:: Searching for include files
|
|
|
|
Diverting and undiverting output
|
|
|
|
* Divert:: Diverting output
|
|
* Undivert:: Undiverting output
|
|
* Divnum:: Diversion numbers
|
|
* Cleardivert:: Discarding diverted text
|
|
|
|
Macros for text handling
|
|
|
|
* Len:: Calculating length of strings
|
|
* Index macro:: Searching for substrings
|
|
* Regexp:: Searching for regular expressions
|
|
* Substr:: Extracting substrings
|
|
* Translit:: Translating characters
|
|
* Patsubst:: Substituting text by regular expression
|
|
* Format:: Formatting strings (printf-like)
|
|
|
|
Macros for doing arithmetic
|
|
|
|
* Incr:: Decrement and increment operators
|
|
* Eval:: Evaluating integer expressions
|
|
|
|
Macros for running shell commands
|
|
|
|
* Platform macros:: Determining the platform
|
|
* Syscmd:: Executing simple commands
|
|
* Esyscmd:: Reading the output of commands
|
|
* Sysval:: Exit status
|
|
* Mkstemp:: Making temporary files
|
|
|
|
Miscellaneous builtin macros
|
|
|
|
* Errprint:: Printing error messages
|
|
* Location:: Printing current location
|
|
* M4exit:: Exiting from @code{m4}
|
|
|
|
Fast loading of frozen state
|
|
|
|
* Using frozen files:: Using frozen files
|
|
* Frozen file format:: Frozen file format
|
|
|
|
Compatibility with other versions of @code{m4}
|
|
|
|
* Extensions:: Extensions in GNU M4
|
|
* Incompatibilities:: Facilities in System V m4 not in GNU M4
|
|
* Other Incompatibilities:: Other incompatibilities
|
|
|
|
Correct version of some examples
|
|
|
|
* Improved exch:: Solution for @code{exch}
|
|
* Improved forloop:: Solution for @code{forloop}
|
|
* Improved foreach:: Solution for @code{foreach}
|
|
* Improved copy:: Solution for @code{copy}
|
|
* Improved m4wrap:: Solution for @code{m4wrap}
|
|
* Improved cleardivert:: Solution for @code{cleardivert}
|
|
* Improved capitalize:: Solution for @code{capitalize}
|
|
* Improved fatal_error:: Solution for @code{fatal_error}
|
|
|
|
How to make copies of the overall M4 package
|
|
|
|
* GNU General Public License:: License for copying the M4 package
|
|
|
|
How to make copies of this manual
|
|
|
|
* GNU Free Documentation License:: License for copying this manual
|
|
|
|
Indices of concepts and macros
|
|
|
|
* Macro index:: Index for all @code{m4} macros
|
|
* Concept index:: Index for many concepts
|
|
|
|
@end detailmenu
|
|
@end menu
|
|
|
|
@node Preliminaries
|
|
@chapter Introduction and preliminaries
|
|
|
|
This first chapter explains what GNU @code{m4} is, where @code{m4}
|
|
comes from, how to read and use this documentation, how to call the
|
|
@code{m4} program, and how to report bugs about it. It concludes by
|
|
giving tips for reading the remainder of the manual.
|
|
|
|
The following chapters then detail all the features of the @code{m4}
|
|
language.
|
|
|
|
@menu
|
|
* Intro:: Introduction to @code{m4}
|
|
* History:: Historical references
|
|
* Bugs:: Problems and bugs
|
|
* Manual:: Using this manual
|
|
@end menu
|
|
|
|
@node Intro
|
|
@section Introduction to @code{m4}
|
|
|
|
@cindex overview of @code{m4}
|
|
@code{m4} is a macro processor, in the sense that it copies its
|
|
input to the output, expanding macros as it goes. Macros are either
|
|
builtin or user-defined, and can take any number of arguments.
|
|
Besides just doing macro expansion, @code{m4} has builtin functions
|
|
for including named files, running shell commands, doing integer
|
|
arithmetic, manipulating text in various ways, performing recursion,
|
|
etc. @code{m4} can be used either as a front-end to a compiler,
|
|
or as a macro processor in its own right.
|
|
|
|
The @code{m4} macro processor is widely available on all UNIXes, and has
|
|
been standardized by POSIX.
|
|
Usually, only a small percentage of users are aware of its existence.
|
|
However, those who find it often become committed users. The
|
|
popularity of GNU Autoconf, which requires GNU
|
|
@code{m4} for @emph{generating} @file{configure} scripts, is an incentive
|
|
for many to install it, even though these people will not themselves
|
|
program in @code{m4}. GNU @code{m4} is mostly compatible with the
|
|
System V, Release 4 version, except for some minor differences.
|
|
@xref{Compatibility}, for more details.
|
|
|
|
Some people find @code{m4} to be fairly addictive. They first use
|
|
@code{m4} for simple problems, then take bigger and bigger challenges,
|
|
learning how to write complex sets of @code{m4} macros along the way.
|
|
Once really addicted, users pursue writing of sophisticated @code{m4}
|
|
applications even to solve simple problems, devoting more time to
|
|
debugging their @code{m4} scripts than doing real work. Beware that
|
|
@code{m4} may be dangerous for the health of compulsive programmers.
|
|
|
|
@node History
|
|
@section Historical references
|
|
|
|
@cindex history of @code{m4}
|
|
@cindex GNU M4, history of
|
|
Macro languages were invented early in the history of computing. In the
|
|
1950s Alan Perlis suggested that the macro language be independent of the
|
|
language being processed. Techniques such as conditional and recursive
|
|
macros, and using macros to define other macros, were described by Doug
|
|
McIlroy of Bell Labs in ``Macro Instruction Extensions of Compiler
|
|
Languages'', @emph{Communications of the ACM} 3, 4 (1960), 214--20,
|
|
@url{https://dl.acm.org/doi/10.1145/367177.367223}.
|
|
|
|
An important precursor of @code{m4} was GPM; see C. Strachey,
|
|
@c The title uses lower case and has no space between "macro" and "generator".
|
|
``A general purpose macrogenerator'', @emph{Computer Journal} 8, 3
|
|
(1965), 225--41,
|
|
@url{https://academic.oup.com/comjnl/article/8/3/225/336044}. GPM is
|
|
also succinctly described in David Gries's book @emph{Compiler
|
|
Construction for Digital Computers}, Wiley (1971). Strachey was a
|
|
brilliant programmer: GPM fit into 250 machine instructions!
|
|
|
|
Inspired by GPM while visiting Strachey's Lab in 1968, McIlroy wrote a
|
|
model preprocessor that fit into a page of Snobol 3 code, and McIlroy
|
|
and Robert Morris developed a series of further models at Bell Labs.
|
|
Andrew D. Hall followed up with M6, a general purpose macro processor
|
|
used to port the Fortran source code of the Altran computer algebra
|
|
system; see Hall's ``The M6 Macro Processor'', Computing Science
|
|
Technical Report #2, Bell Labs (1972),
|
|
@url{http://cm.bell-labs.com/cm/cs/cstr/2.pdf}. M6's source code
|
|
consisted of about 600 Fortran statements. Its name was the first of
|
|
the @code{m4} line.
|
|
|
|
The Brian Kernighan and P.J. Plauger book @emph{Software Tools},
|
|
Addison-Wesley (1976), describes and implements a Unix
|
|
macro-processor language, which inspired Dennis Ritchie to write
|
|
@code{m3}, a macro processor for the AP-3 minicomputer.
|
|
|
|
Kernighan and Ritchie then joined forces to develop the original
|
|
@code{m4}, described in ``The M4 Macro Processor'', Bell Laboratories
|
|
(1977), @url{https://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf}.
|
|
It had only 21 builtin macros.
|
|
|
|
While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
|
|
the true intricacies of real life: macros can be recognized without
|
|
being pre-announced, skipping whitespace or end-of-lines is easier,
|
|
more constructs are builtin instead of derived, etc.
|
|
|
|
Originally, the Kernighan and Plauger macro-processor, and then
|
|
@code{m3}, formed the engine for the Rational FORTRAN preprocessor,
|
|
that is, the @code{Ratfor} equivalent of @code{cpp}. Later, @code{m4}
|
|
was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.
|
|
|
|
Ren@'e Seindal released his implementation of @code{m4}, GNU
|
|
@code{m4},
|
|
in 1990, with the aim of removing the artificial limitations in many
|
|
of the traditional @code{m4} implementations, such as maximum line
|
|
length, macro size, or number of macros.
|
|
|
|
The late Professor A. Dain Samples described and implemented a further
|
|
evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
|
|
Language: 2nd edition'', Electronic Announcement on comp.compilers
|
|
newsgroup (1992).
|
|
|
|
Fran@,{c}ois Pinard took over maintenance of GNU @code{m4} in
|
|
1992, until 1994 when he released GNU @code{m4} 1.4, which was
|
|
the stable release for 10 years. It was at this time that GNU
|
|
Autoconf decided to require GNU @code{m4} as its underlying
|
|
engine, since all other implementations of @code{m4} had too many
|
|
limitations.
|
|
|
|
More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2 which
|
|
addressed some long standing bugs in the venerable 1.4 release. Then in
|
|
2005, Gary V. Vaughan collected together the many patches to
|
|
GNU @code{m4} 1.4 that were floating around the net and
|
|
released 1.4.3 and 1.4.4. And in 2006, Eric Blake joined the team and
|
|
prepared patches for the release of 1.4.5, with subsequent releases
|
|
through the intervening years, most recently 1.4.20 in 2025.
|
|
|
|
Meanwhile, development has continued on new features for @code{m4}, such
|
|
as dynamic module loading and additional builtins. When complete,
|
|
GNU @code{m4} 2.0 will start a new series of releases.
|
|
|
|
@node Bugs
|
|
@section Problems and bugs
|
|
|
|
@cindex reporting bugs
|
|
@cindex bug reports
|
|
@cindex suggestions, reporting
|
|
If you have problems with GNU M4 or think you've found a bug,
|
|
please report it. Before reporting a bug, make sure you've actually
|
|
found a real bug. Carefully reread the documentation and see if it
|
|
really says you can do what you're trying to do. If it's not clear
|
|
whether you should be able to do something or not, report that too; it's
|
|
a bug in the documentation!
|
|
|
|
Before reporting a bug or trying to fix it yourself, try to isolate it
|
|
to the smallest possible input file that reproduces the problem. Then
|
|
send us the input file and the exact results @code{m4} gave you. Also
|
|
say what you expected to occur; this will help us decide whether the
|
|
problem was really in the documentation.
|
|
|
|
Once you've got a precise problem, send e-mail to
|
|
@email{bug-m4@@gnu.org}. Please include the version number of @code{m4}
|
|
you are using. You can get this information with the command
|
|
@kbd{m4 --version}. Also provide details about the platform you are
|
|
executing on.
|
|
|
|
Non-bug suggestions are always welcome as well. If you have questions
|
|
about things that are unclear in the documentation or are just obscure
|
|
features, please report them too.
|
|
|
|
@node Manual
|
|
@section Using this manual
|
|
|
|
@cindex examples, understanding
|
|
This manual contains a number of examples of @code{m4} input and output,
|
|
and a simple notation is used to distinguish input, output and error
|
|
messages from @code{m4}. Examples are set out from the normal text, and
|
|
shown in a fixed-width font, like this:
|
|
|
|
@comment ignore
|
|
@example
|
|
This is an example of an example!
|
|
@end example
|
|
|
|
To distinguish input from output, all output from @code{m4} is prefixed
|
|
by the string @samp{@result{}}, and all error messages by the string
|
|
@samp{@error{}}. When showing how command line options affect matters,
|
|
the command line is shown with a prompt @samp{$ @kbd{like this}};
|
|
otherwise, you can assume that a simple @kbd{m4} invocation will work.
|
|
Thus:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{command line to invoke m4}
|
|
Example of input line
|
|
@result{}Output line from m4
|
|
@error{}and an error message
|
|
@end example
|
|
|
|
The sequence @samp{^D} in an example indicates the end of the input
|
|
file. The sequence @samp{@key{NL}} refers to the newline character.
|
|
The majority of these examples are self-contained, and you can run them
|
|
with similar results by invoking @kbd{m4 -d}. In fact, the testsuite
|
|
that is bundled in the GNU M4 package consists of the examples
|
|
in this document! Some of the examples assume that your current
|
|
directory is located where you unpacked the installation, so if you plan
|
|
on following along, you may find it helpful to do this now:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{cd m4-@value{VERSION}}
|
|
@end example
|
|
|
|
As each of the predefined macros in @code{m4} is described, a prototype
|
|
call of the macro will be shown, giving descriptive names to the
|
|
arguments, e.g.,
|
|
|
|
@deffn Composite example (@var{string}, @dvar{count, 1}, @
|
|
@ovar{argument}@dots{})
|
|
This is a sample prototype. There is not really a macro named
|
|
@code{example}, but this documents that if there were, it would be a
|
|
Composite macro, rather than a Builtin. It requires at least one
|
|
argument, @var{string}. Remember that in @code{m4}, there must not be a
|
|
space between the macro name and the opening parenthesis, unless it was
|
|
intended to call the macro without any arguments. The brackets around
|
|
@var{count} and @var{argument} show that these arguments are optional.
|
|
If @var{count} is omitted, the macro behaves as if @var{count} were @samp{1},
|
|
whereas if @var{argument} is omitted, the macro behaves as if it were
|
|
the empty string. A blank argument is not the same as an omitted
|
|
argument. For example, @samp{example(`a')}, @samp{example(`a',`1')},
|
|
and @samp{example(`a',`1',)} would behave identically with @var{count}
|
|
set to @samp{1}; while @samp{example(`a',)} and @samp{example(`a',`')}
|
|
would explicitly pass the empty string for @var{count}. The ellipses
|
|
(@samp{@dots{}}) show that the macro processes additional arguments
|
|
after @var{argument}, rather than ignoring them.
|
|
@end deffn
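
As a rough illustration of the difference between omitted and blank
arguments, the following sketch uses a throwaway macro named
@code{nargs} (a name chosen here purely for illustration; it is not a
builtin) that expands to its argument count by way of @samp{$#}. It
also shows what a space before the opening parenthesis does:

@comment ignore
@example
define(`nargs', `$#')
@result{}
nargs
@result{}0
nargs()
@result{}1
nargs(`a',)
@result{}2
nargs (`a')
@result{}0 (a)
@end example

@noindent
In the last line, the space causes @code{nargs} to be called without
any arguments; the parenthesized text that follows is then copied to the
output as ordinary input, with its quotes removed.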
|
|
|
|
@cindex numbers
|
|
All macro arguments in @code{m4} are strings, but some are given
|
|
special interpretation, e.g., as numbers, file names, regular
|
|
expressions, etc. The documentation for each macro will state how the
|
|
parameters are interpreted, and what happens if the argument cannot be
|
|
parsed according to the desired interpretation. Unless specified
|
|
otherwise, a parameter specified to be a number is parsed as a decimal,
|
|
even if the argument has leading zeros; and parsing the empty string as
|
|
a number results in 0 rather than an error, although a warning will be
|
|
issued.
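
A small sketch of the decimal rule, relying on the behavior of
@code{incr} and @code{eval} as described later in this manual
(@pxref{Arithmetic}): @code{incr} follows the default and reads
@samp{010} as decimal ten, whereas @code{eval} is one of the macros that
specifies otherwise and reads a leading zero as octal.

@comment ignore
@example
incr(`010')
@result{}11
eval(`010')
@result{}8
@end example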
|
|
|
|
This document consistently writes and uses @dfn{builtin}, without a
|
|
hyphen, as if it were an English word. This is how the @code{builtin}
|
|
primitive is spelled within @code{m4}.
|
|
|
|
@node Invoking m4
|
|
@chapter Invoking @code{m4}
|
|
|
|
@cindex command line
|
|
@cindex invoking @code{m4}
|
|
The format of the @code{m4} command is:
|
|
|
|
@comment ignore
|
|
@example
|
|
@code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
|
|
@end example
|
|
|
|
@cindex command line, options
|
|
@cindex options, command line
|
|
@cindex @env{POSIXLY_CORRECT}
|
|
All options begin with @samp{-}, or if long option names are used, with
|
|
@samp{--}. A long option name need not be written completely; any
|
|
unambiguous prefix is sufficient. POSIX requires @code{m4} to
|
|
recognize arguments intermixed with files, even when
|
|
@env{POSIXLY_CORRECT} is set in the environment. Most options take
|
|
effect at startup regardless of their position, but some are documented
|
|
below as taking effect after any files that occurred earlier in the
|
|
command line. The argument @option{--} is a marker to denote the end of
|
|
options.
|
|
|
|
With short options, options that do not take arguments may be combined
|
|
into a single command line argument with subsequent options, options
|
|
with mandatory arguments may be provided either as a single command line
|
|
argument or as two arguments, and options with optional arguments must
|
|
be provided as a single argument. In other words,
|
|
@kbd{m4 -QPDfoo -d a -df} is equivalent to
|
|
@kbd{m4 -Q -P -D foo -d -df -- ./a}, although the latter form is
|
|
considered canonical.
|
|
|
|
With long options, options with mandatory arguments may be provided with
|
|
an equal sign (@samp{=}) in a single argument, or as two arguments, and
|
|
options with optional arguments must be provided as a single argument.
|
|
In other words, @kbd{m4 --def foo --debug a} is equivalent to
|
|
@kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
|
|
considered canonical (not to mention more robust, in case a future
|
|
version of @code{m4} introduces an option named @option{--default}).
|
|
|
|
@code{m4} understands the following options, grouped by functionality.
|
|
|
|
@menu
|
|
* Operation modes:: Command line options for operation modes
|
|
* Preprocessor features:: Command line options for preprocessor features
|
|
* Limits control:: Command line options for limits control
|
|
* Frozen state:: Command line options for frozen state
|
|
* Debugging options:: Command line options for debugging
|
|
* Command line files:: Specifying input files on the command line
|
|
@end menu
|
|
|
|
@node Operation modes
|
|
@section Command line options for operation modes
|
|
|
|
Several options control the overall operation of @code{m4}:
|
|
|
|
@table @code
|
|
@item --help
|
|
Print a help summary on standard output, then immediately exit
|
|
@code{m4} without reading any input files or performing any other
|
|
actions.
|
|
|
|
@item --version
|
|
Print the version number of the program on standard output, then
|
|
immediately exit @code{m4} without reading any input files or
|
|
performing any other actions.
|
|
|
|
@item -E
|
|
@itemx --fatal-warnings
|
|
@cindex errors, fatal
|
|
@cindex fatal errors
|
|
Controls the effect of warnings. If unspecified, then execution
|
|
continues and exit status is unaffected when a warning is printed. If
|
|
specified exactly once, warnings become fatal; when one is issued,
|
|
execution continues, but the exit status will be non-zero. If specified
|
|
multiple times, then execution halts with non-zero status the first time
|
|
a warning is issued. The introduction of behavior levels is new to M4
|
|
1.4.9; for behavior consistent with earlier versions, you should specify
|
|
@option{-E} twice.
|
|
|
|
@item -i
|
|
@itemx --interactive
|
|
@itemx -e
|
|
Makes this invocation of @code{m4} interactive. This means that all
|
|
output will be unbuffered, and interrupts will be ignored. The
|
|
spelling @option{-e} exists for compatibility with other @code{m4}
|
|
implementations, and issues a warning because it may be withdrawn in a
|
|
future version of GNU M4.
|
|
|
|
@item -P
|
|
@itemx --prefix-builtins
|
|
Internally modify @emph{all} builtin macro names so they all start with
|
|
the prefix @samp{m4_}. For example, using this option, one should write
|
|
@samp{m4_define} instead of @samp{define}, and @samp{m4___file__}
|
|
instead of @samp{__file__}. This option has no effect if @option{-R}
|
|
is also specified.
|
|
|
|
@item -Q
|
|
@itemx --quiet
|
|
@itemx --silent
|
|
Suppress warnings, such as missing or superfluous arguments in macro
|
|
calls, or treating the empty string as zero.
|
|
|
|
@item --warn-macro-sequence@r{[}=@var{regexp}@r{]}
|
|
Issue a warning if the regular expression @var{regexp} has a non-empty
|
|
match in any macro definition (either by @code{define} or
|
|
@code{pushdef}). Empty matches are ignored; therefore, supplying the
|
|
empty string as @var{regexp} disables any warning. If the optional
|
|
@var{regexp} is not supplied, then the default regular expression is
|
|
@samp{\$\(@{[^@}]*@}\|[0-9][0-9]+\)} (a literal @samp{$} followed by
|
|
multiple digits or by an open brace), since these sequences will
|
|
change semantics in the default operation of GNU M4 2.0 (due
|
|
to a change in how more than 9 arguments in a macro definition will be
|
|
handled, @pxref{Arguments}). Providing an alternate regular
|
|
expression can provide a useful reverse lookup feature of finding
|
|
where a macro is defined to have a given definition.
|
|
|
|
@item -W @var{regexp}
|
|
@itemx --word-regexp=@var{regexp}
|
|
Use @var{regexp} as an alternative syntax for macro names. This
|
|
experimental option will not be present in all GNU @code{m4}
|
|
implementations (@pxref{Changeword}).
|
|
@end table
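
As a brief sketch of the effect of @option{-P}, the session below uses a
macro named @code{x}, chosen only for illustration. The unprefixed name
@code{define} is no longer recognized as a macro, so it is copied to the
output with its quotes stripped, while the prefixed spelling performs
the definition:

@comment ignore
@example
$ @kbd{m4 -P}
define(`x', `1')
@result{}define(x, 1)
m4_define(`x', `1')
@result{}
x
@result{}1
@end example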
|
|
|
|
@node Preprocessor features
|
|
@section Command line options for preprocessor features
|
|
|
|
@cindex macro definitions, on the command line
|
|
@cindex command line, macro definitions on the
|
|
@cindex preprocessor features
|
|
Several options allow @code{m4} to behave more like a preprocessor.
|
|
Macro definitions and deletions can be made on the command line, the
|
|
search path can be altered, and the output file can track where the
|
|
input came from. These features occur with the following options:
|
|
|
|
@table @code
|
|
@item -D @var{name}@r{[}=@var{value}@r{]}
|
|
@itemx --define=@var{name}@r{[}=@var{value}@r{]}
|
|
This enters @var{name} into the symbol table. If @samp{=@var{value}} is
|
|
missing, the value is taken to be the empty string. The @var{value} can
|
|
be any string, and the macro can be defined to take arguments, just as
|
|
if it was defined from within the input. This option may be given more
|
|
than once; order with respect to file names is significant, and
|
|
redefining the same @var{name} loses the previous value.
|
|
|
|
@item -I @var{directory}
|
|
@itemx --include=@var{directory}
|
|
Make @code{m4} search @var{directory} for included files that are not
|
|
found in the current working directory. @xref{Search Path}, for more
|
|
details. This option may be given more than once.
|
|
|
|
@item -s
|
|
@itemx --synclines
|
|
@cindex synchronization lines
|
|
@cindex location, input
|
|
@cindex input location
|
|
Generate synchronization lines, for use by the C preprocessor or other
|
|
similar tools. Order is significant with respect to file names. This
|
|
option is useful, for example, when @code{m4} is used as a
|
|
front end to a compiler. Source file name and line number information
|
|
is conveyed by directives of the form @samp{#line @var{linenum}
|
|
"@var{file}"}, which are inserted as needed into the middle of the
|
|
output. Such directives mean that the following line originated or was
|
|
expanded from the contents of input file @var{file} at line
|
|
@var{linenum}. The @samp{"@var{file}"} part is often omitted when
|
|
the file name did not change from the previous directive.
|
|
|
|
Synchronization directives are always given on complete lines by
|
|
themselves. When a synchronization discrepancy occurs in the middle of
|
|
an output line, the associated synchronization directive is delayed
|
|
until the next newline that does not occur in the middle of a quoted
|
|
string or comment.
|
|
|
|
@comment options: -s
|
|
@example
|
|
define(`twoline', `1
|
|
2')
|
|
@result{}#line 2 "stdin"
|
|
@result{}
|
|
changecom(`/*', `*/')
|
|
@result{}
|
|
define(`comment', `/*1
|
|
2*/')
|
|
@result{}#line 5
|
|
@result{}
|
|
dnl no line
|
|
hello
|
|
@result{}#line 7
|
|
@result{}hello
|
|
twoline
|
|
@result{}1
|
|
@result{}#line 8
|
|
@result{}2
|
|
comment
|
|
@result{}/*1
|
|
@result{}2*/
|
|
one comment `two
|
|
three'
|
|
@result{}#line 10
|
|
@result{}one /*1
|
|
@result{}2*/ two
|
|
@result{}three
|
|
goodbye
|
|
@result{}#line 12
|
|
@result{}goodbye
|
|
@end example
|
|
|
|
@item -U @var{name}
|
|
@itemx --undefine=@var{name}
|
|
This deletes any predefined meaning @var{name} might have. Obviously,
|
|
only predefined macros can be deleted in this way. This option may be
|
|
given more than once; undefining a @var{name} that does not have a
|
|
definition is silently ignored. Order is significant with respect to
|
|
file names.
|
|
@end table
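
The following sketch combines @option{-D} and @option{-U} in one
invocation. The macro name @code{greeting} is hypothetical, and
@code{len} merely stands in for any predefined macro. Once @code{len}
has been undefined, its name is treated as plain text, so only the
quotes around its argument are removed:

@comment ignore
@example
$ @kbd{m4 -D greeting=hello -U len}
greeting, len(`abc')
@result{}hello, len(abc)
@end example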
|
|
|
|
@node Limits control
|
|
@section Command line options for limits control
|
|
|
|
There are some limits within @code{m4} that can be tuned. For
|
|
compatibility, @code{m4} also accepts some options that control limits
|
|
in other implementations, but which are automatically unbounded (limited
|
|
only by your hardware and operating system constraints) in GNU
|
|
@code{m4}.
|
|
|
|
@table @code
|
|
@item -g
|
|
@itemx --gnu
|
|
Enable all the extensions in this implementation. In this release of
|
|
M4, this option is always on by default; it is currently only useful
|
|
when overriding a prior use of @option{--traditional}. However, having
|
|
GNU behavior as default makes it impossible to write a
|
|
strictly POSIX-compliant client that avoids all incompatible
|
|
GNU M4 extensions, since such a client would have to use the
|
|
non-POSIX command-line option to force full POSIX
|
|
behavior. Thus, a future version of M4 will be changed to implicitly
|
|
use the option @option{--traditional} if the environment variable
|
|
@env{POSIXLY_CORRECT} is set. Projects that intentionally use
|
|
GNU extensions should consider using @option{--gnu} to state
|
|
their intentions, so that the project will not mysteriously break if the
|
|
user upgrades to a newer M4 and has @env{POSIXLY_CORRECT} set in their
|
|
environment.
|
|
|
|
@item -G
|
|
@itemx --traditional
|
|
Suppress all the extensions made in this implementation, compared to the
|
|
System V version. @xref{Compatibility}, for a list of these.
|
|
|
|
@item -H @var{num}
|
|
@itemx --hashsize=@var{num}
|
|
Make the internal hash table for symbol lookup be @var{num} entries big.
|
|
For better performance, the number should be prime, but this is not
|
|
checked. The default is 65537 entries. It should not be necessary to
|
|
increase this value, unless you define an excessive number of macros.
|
|
|
|
@item -L @var{num}
|
|
@itemx --nesting-limit=@var{num}
|
|
@cindex nesting limit
|
|
@cindex limit, nesting
|
|
Artificially limit the nesting of macro calls to @var{num} levels,
|
|
stopping program execution if this limit is ever exceeded. When not
|
|
specified, nesting defaults to unlimited on platforms that can detect
|
|
stack overflow, and to 1024 levels otherwise. A value of zero means
|
|
unlimited; but then heavily nested code could potentially cause a stack
|
|
overflow.
|
|
|
|
The precise effect of this option is more correctly associated
|
|
with textual nesting than dynamic recursion. It has been useful
|
|
when some complex @code{m4} input was generated by mechanical means, and
|
|
also in diagnosing recursive algorithms that do not scale well.
|
|
Most users never need to change this option from its default.
|
|
|
|
@cindex rescanning
|
|
This option does @emph{not} have the ability to break endless
|
|
rescanning loops, since these do not necessarily consume much memory
|
|
or stack space. Through clever usage of rescanning loops, one can
|
|
request complex, time-consuming computations from @code{m4} with useful
|
|
results. Putting limitations in this area would cripple the power of @code{m4}.
|
|
There are many pathological cases: @w{@samp{define(`a', `a')a}} is
|
|
only the simplest example (but @pxref{Compatibility}). Expecting GNU
|
|
@code{m4} to detect these would be a little like expecting a compiler
|
|
system to detect and diagnose endless loops: it is quite a @emph{hard}
|
|
problem in general, if not undecidable!
|
|
|
|
@item -B @var{num}
|
|
@itemx -S @var{num}
|
|
@itemx -T @var{num}
|
|
These options are present for compatibility with System V @code{m4}, but
|
|
do nothing in this implementation. They may disappear in future
releases, and using them currently issues a warning to that effect.
|
|
|
|
@item -N @var{num}
|
|
@itemx --diversions=@var{num}
|
|
These options are present only for compatibility with previous
versions of GNU @code{m4}, where they controlled the number of
diversions that could be in use at the same time. They now do nothing,
because there is no longer a fixed limit. They may disappear in future
releases, and using them currently issues a warning to that effect.
|
|
@end table
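
The distinction between textual nesting and dynamic recursion drawn
above for @option{--nesting-limit} can be sketched as follows. The
macros @code{wrap} and @code{countdown} are hypothetical names chosen
for illustration. Three textually nested calls stay within a limit of
five, and so does a tail-recursive loop of fifty iterations, because
each iteration finishes expanding before the next one starts, so that
at most two calls are in progress at any one time:

@comment options: -L5
@example
$ @kbd{m4 -L5}
define(`wrap', `[$1]')
@result{}
wrap(wrap(wrap(`x')))
@result{}[[[x]]]
define(`countdown',
`ifelse(`$1', `0', `done', `countdown(decr(`$1'))')')
@result{}
countdown(`50')
@result{}done
@end example

@noindent
Calls that are textually nested more deeply than the limit would be
expected to stop @code{m4} with an error instead.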
|
|
|
|
@node Frozen state
|
|
@section Command line options for frozen state
|
|
|
|
GNU @code{m4} comes with a feature of freezing internal state
|
|
(@pxref{Frozen files}). This can be used to speed up @code{m4}
|
|
execution when reusing a common initialization script.
|
|
|
|
@table @code
|
|
@item -F @var{file}
|
|
@itemx --freeze-state=@var{file}
|
|
Once execution is finished, write out the frozen state on the specified
|
|
@var{file}. It is conventional, but not required, for @var{file} to end
|
|
in @samp{.m4f}.
|
|
|
|
@item -R @var{file}
|
|
@itemx --reload-state=@var{file}
|
|
Before execution starts, recover the internal state from the specified
|
|
frozen @var{file}. The options @option{-D}, @option{-U}, and
|
|
@option{-t} take effect after state is reloaded, but before the input
|
|
files are read.
|
|
@end table
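
A typical round trip might look like the following sketch. The file
names @file{common.m4} and @file{common.m4f} and the macro
@code{greeting} are assumptions made for illustration only. The first
run freezes the definitions made by the initialization script, and the
second run reloads them before reading standard input:

@comment ignore
@example
$ @kbd{cat common.m4}
define(`greeting', `hello')dnl
$ @kbd{m4 -F common.m4f common.m4}
$ @kbd{echo greeting | m4 -R common.m4f}
@result{}hello
@end example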
|
|
|
|
@node Debugging options
|
|
@section Command line options for debugging
|
|
|
|
Finally, there are several options for aiding in debugging @code{m4}
|
|
scripts.
|
|
|
|
@table @code
|
|
@item -d@r{[}@var{flags}@r{]}
|
|
@itemx --debug@r{[}=@var{flags}@r{]}
|
|
Set the debug-level according to the flags @var{flags}. The debug-level
|
|
controls the format and amount of information presented by the debugging
|
|
functions. @xref{Debug Levels}, for more details on the format and
|
|
meaning of @var{flags}. If omitted, @var{flags} defaults to @samp{aeq}.
|
|
|
|
@item --debugfile@r{[}=@var{file}@r{]}
|
|
@itemx -o @var{file}
|
|
@itemx --error-output=@var{file}
|
|
Redirect @code{dumpdef} output, debug messages, and trace output to the
|
|
named @var{file}. Warnings, error messages, and @code{errprint} output
|
|
are still printed to standard error. If these options are not used, or
|
|
if @var{file} is unspecified (only possible for @option{--debugfile}),
|
|
debug output goes to standard error; if @var{file} is the empty string,
|
|
debug output is discarded. @xref{Debug Output}, for more details. The
|
|
option @option{--debugfile} may be given more than once, and order is
|
|
significant with respect to file names. The spellings @option{-o} and
|
|
@option{--error-output} are misleading and inconsistent with other
|
|
GNU tools; for now they are silently accepted as synonyms of
|
|
@option{--debugfile} and only recognized once, but in a future version
|
|
of M4, using them will cause a warning to be issued.
|
|
|
|
@ignore
|
|
@comment not worth including in the manual, but provides a good test
|
|
|
|
@comment examples
|
|
@comment options: -Dbar=hello -tbar --debugfile= foo --debugfile -
|
|
@example
|
|
$ @kbd{m4 -d -Iexamples -Dbar=hello -tbar --debugfile= foo --debugfile -}
|
|
@result{}hello
|
|
errprint(`hi
|
|
')dnl
|
|
@error{}hi
|
|
bar
|
|
@error{}m4trace: -1- bar -> `hello'
|
|
@result{}hello
|
|
@end example
|
|
@end ignore
|
|
|
|
@item -l @var{num}
|
|
@itemx --arglength=@var{num}
|
|
Restrict the size of the output generated by macro tracing to @var{num}
|
|
characters per trace line. If unspecified or zero, output is
|
|
unlimited. @xref{Debug Levels}, for more details.
|
|
|
|
@item -t @var{name}
|
|
@itemx --trace=@var{name}
|
|
This enables tracing for the macro @var{name}, at any point where it is
|
|
defined. @var{name} need not be defined when this option is given.
|
|
This option may be given more than once, and order is significant with
|
|
respect to file names. @xref{Trace}, for more details.
|
|
@end table
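
As a small sketch of how these options combine (the macro name
@code{bar} is only an example), the following invocation defines
@code{bar}, enables tracing for it, and uses @option{-d} to select the
default debug flags. The trace line goes to standard error, while the
expansion goes to standard output:

@comment options: -d -Dbar=hello -tbar
@example
$ @kbd{m4 -d -Dbar=hello -tbar}
bar
@error{}m4trace: -1- bar -> `hello'
@result{}hello
@end example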
|
|
|
|
@node Command line files
|
|
@section Specifying input files on the command line
|
|
|
|
@cindex command line, file names on the
|
|
@cindex file names, on the command line
|
|
The remaining arguments on the command line are taken to be input file
|
|
names. If no names are present, standard input is read. A file
|
|
name of @file{-} is taken to mean standard input. It is
|
|
conventional, but not required, for input files to end in @samp{.m4}.
|
|
|
|
The input files are read in the sequence given. Standard input can be
|
|
read more than once, so the file name @file{-} may appear multiple times
|
|
on the command line; this makes a difference when input is from a
|
|
terminal or other special file type. It is an error if an input file
|
|
ends in the middle of argument collection, a comment, or a quoted
|
|
string.
|
|
|
|
The options @option{--define} (@option{-D}), @option{--undefine}
|
|
(@option{-U}), @option{--synclines} (@option{-s}), and @option{--trace}
|
|
(@option{-t}) only take effect after processing input from any file
|
|
names that occur earlier on the command line. For example, assume the
|
|
file @file{foo} contains:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{cat foo}
|
|
bar
|
|
@end example
|
|
|
|
The text @samp{bar} can then be redefined over multiple uses of
|
|
@file{foo}:
|
|
|
|
@comment options: -Dbar=hello foo -Dbar=world foo
|
|
@example
|
|
$ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
|
|
@result{}hello
|
|
@result{}world
|
|
@end example
|
|
|
|
If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
|
|
exit status of @code{m4} will be 0 for success, 1 for general failure
|
|
(such as problems with reading an input file), and 63 for version
|
|
mismatch (@pxref{Using frozen files}).
|
|
|
|
If you need to read a file whose name starts with a @file{-}, you can
|
|
specify it as @samp{./-file}, or use @option{--} to mark the end of
|
|
options.
|
|
|
|
@ignore
|
|
@comment Test that 'm4 file/' detects that file is not a directory; we
|
|
@comment can assume that the current directory contains a Makefile.
|
|
@comment mingw fails with EINVAL rather than ENOTDIR.
|
|
|
|
@comment status: 1
|
|
@comment xerr: ignore
|
|
@comment options: Makefile/
|
|
@example
|
|
@error{}m4: cannot open `Makefile/': Not a directory
|
|
@end example
|
|
|
|
@comment Test that closed stderr does not cause a crash. Not all
|
|
@comment systems have the same message for EBADF.
|
|
|
|
@comment xerr: ignore
|
|
@example
|
|
ifdef(`__unix__', ,
|
|
`errprint(` skipping: syscmd does not have unix semantics
|
|
')m4exit(`77')')dnl
|
|
syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0',
|
|
`errprint(` skipping: system does not allow closing stdout
|
|
')m4exit(`77')')dnl
|
|
changequote(`[', `]')dnl
|
|
syscmd([echo | ']__program__[' >&-])dnl
|
|
@error{}m4: write error: Bad file descriptor
|
|
sysval
|
|
@result{}1
|
|
@end example
|
|
|
|
@example
|
|
ifdef(`__unix__', ,
|
|
`errprint(` skipping: syscmd does not have unix semantics
|
|
')m4exit(`77')')dnl
|
|
syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0',
|
|
`errprint(` skipping: system does not allow closing stdout
|
|
')m4exit(`77')')dnl
|
|
changequote(`[', `]')dnl
|
|
syscmd([echo 'esyscmd(echo hi >&2 && echo err"print(bye
|
|
)d"nl)dnl' > tmp.m4 \
|
|
&& ']__program__[' tmp.m4 <&- >&- \
|
|
&& rm tmp.m4])sysval
|
|
@error{}hi
|
|
@error{}bye
|
|
@result{}0
|
|
@end example
|
|
|
|
@comment Test that we obey POSIX semantics with -D interspersed with
|
|
@comment files, even with POSIXLY_CORRECT (BSD getopt gets it wrong).
|
|
|
|
$ @kbd{m4 }
|
|
@example
|
|
ifdef(`__unix__', ,
|
|
`errprint(` skipping: syscmd does not have unix semantics
|
|
')m4exit(`77')')dnl
|
|
changequote(`[', `]')dnl
|
|
syscmd([POSIXLY_CORRECT=1 ']__program__[' -Dbar=hello foo -Dbar=world foo])dnl
|
|
@result{}hello
|
|
@result{}world
|
|
sysval
|
|
@result{}0
|
|
@end example
|
|
@end ignore
|
|
|
|
@node Syntax
|
|
@chapter Lexical and syntactic conventions
|
|
|
|
@cindex input tokens
|
|
@cindex tokens
|
|
As @code{m4} reads its input, it separates it into @dfn{tokens}. A
|
|
token is either a name, a quoted string, or any single character that
|
|
is not a part of either a name or a string. Input to @code{m4} can also
|
|
contain comments. GNU @code{m4} does not yet understand
|
|
multibyte locales; all operations are byte-oriented rather than
|
|
character-oriented (although if your locale uses a single byte
|
|
encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
|
|
However, @code{m4} is eight-bit clean, so you can
|
|
use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
|
|
comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
|
|
exception of the @sc{nul} character (the zero byte @samp{'\0'}).
|
|
|
|
@menu
|
|
* Names:: Macro names
|
|
* Quoted strings:: Quoting input to @code{m4}
|
|
* Comments:: Comments in @code{m4} input
|
|
* Other tokens:: Other kinds of input tokens
|
|
* Input processing:: How @code{m4} copies input to output
|
|
@end menu
|
|
|
|
@node Names
|
|
@section Macro names
|
|
|
|
@cindex names
|
|
@cindex words
|
|
A name is any sequence of letters, digits, and the character @samp{_}
|
|
(underscore), where the first character is not a digit. @code{m4} will
|
|
use the longest such sequence found in the input. If a name has a
|
|
macro definition, it will be subject to macro expansion
|
|
(@pxref{Macros}). Names are case-sensitive.
|
|
|
|
Examples of legal names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
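
The following sketch shows the effect of the longest-sequence rule and
of case sensitivity. Only @samp{foo} is defined (an arbitrary name
picked for this illustration); @samp{foo1}, @samp{Foo}, and
@samp{_foo} are different names and are copied through unchanged:

@example
define(`foo', `bar')
@result{}
foo foo1 Foo _foo
@result{}bar foo1 Foo _foo
@end example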
|
|
|
|
@node Quoted strings
|
|
@section Quoting input to @code{m4}
|
|
|
|
@cindex quoted string
|
|
@cindex string, quoted
|
|
A quoted string is a sequence of characters surrounded by quote
|
|
strings, defaulting to
|
|
@samp{`} and @samp{'}, where the nested begin and end quotes within the
|
|
string are balanced. The value of a string token is the text, with one
|
|
level of quotes stripped off. Thus
|
|
|
|
@comment ignore
|
|
@example
|
|
`'
|
|
@result{}
|
|
@end example
|
|
|
|
@noindent
|
|
is the empty string, and double-quoting turns into single-quoting.
|
|
|
|
@comment ignore
|
|
@example
|
|
``quoted''
|
|
@result{}`quoted'
|
|
@end example
|
|
|
|
The quote characters can be changed at any time, using the builtin macro
|
|
@code{changequote}. @xref{Changequote}, for more information.
|
|
|
|
@node Comments
|
|
@section Comments in @code{m4} input
|
|
|
|
@cindex comments
|
|
Comments in @code{m4} are normally delimited by the characters @samp{#}
|
|
and newline. All characters between the comment delimiters are ignored,
|
|
but the entire comment (including the delimiters) is passed through to
|
|
the output---comments are @emph{not} discarded by @code{m4}.
|
|
|
|
Comments cannot be nested, so the first newline after a @samp{#} ends
|
|
the comment. The commenting effect of the begin-comment string
|
|
can be inhibited by quoting it.
|
|
|
|
@example
|
|
$ @kbd{m4}
|
|
`quoted text' # `commented text'
|
|
@result{}quoted text # `commented text'
|
|
`quoting inhibits' `#' `comments'
|
|
@result{}quoting inhibits # comments
|
|
@end example
|
|
|
|
The comment delimiters can be changed to any string at any time, using
|
|
the builtin macro @code{changecom}. @xref{Changecom}, for more
|
|
information.
|
|
|
|
@ignore
|
|
@comment Detect regression in 1.4.10b in regards to reparsing comments.
|
|
@comment Not worth including in the manual.
|
|
@example
|
|
define(`e', `$@@')define(`q', ``$@@'')define(`foo', `bar')
|
|
@result{}
|
|
q(e(`one
|
|
',#two ' foo
|
|
))
|
|
@result{}`one
|
|
@result{}',`#two bar
|
|
@result{}''
|
|
changecom(`<', `>')define(`n', `$#')
|
|
@result{}
|
|
n(e(<`>, <'>))
|
|
@result{}1
|
|
len(e(<`>, ,<'>))
|
|
@result{}12
|
|
@end example
|
|
@end ignore
|
|
|
|
@node Other tokens
|
|
@section Other kinds of input tokens
|
|
|
|
@cindex tokens, special
|
|
Any character that is neither a part of a name, nor of a quoted string,
|
|
nor a comment, is a token by itself. When not in the context of macro
|
|
expansion, all of these tokens are just copied to output. However,
|
|
during macro expansion, whitespace characters (space, tab, newline,
|
|
formfeed, carriage return, vertical tab), parentheses (@samp{(} and
|
|
@samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
|
|
roles, explained later.
|
|
|
|
@node Input processing
|
|
@section How @code{m4} copies input to output
|
|
|
|
As @code{m4} reads the input token by token, it copies each token
to the output immediately.
|
|
|
|
The exception is when it finds a word with a macro definition. In that
|
|
case @code{m4} will calculate the macro's expansion, possibly reading
|
|
more input to get the arguments. It then inserts the expansion in front
|
|
of the remaining input. In other words, the resulting text from a macro
|
|
call will be read and parsed into tokens again.
|
|
|
|
@code{m4} expands a macro as soon as possible. If it finds a macro call
|
|
when collecting the arguments to another, it will expand the second call
|
|
first. This process continues until there are no more macro calls to
|
|
expand and all the input has been consumed.
|
|
|
|
For a running example, examine how @code{m4} handles this input:
|
|
|
|
@comment ignore
|
|
@example
|
|
format(`Result is %d', eval(`2**15'))
|
|
@end example
|
|
|
|
@noindent
|
|
First, @code{m4} sees that the token @samp{format} is a macro name, so
|
|
it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
|
|
and @samp{@w{ }}, before encountering another potential macro. Sure
|
|
enough, @samp{eval} is a macro name, so the nested argument collection
|
|
picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
|
|
with the lone argument of @samp{2**15}. The expansion of
|
|
@samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
|
|
tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
|
|
combined with the next @samp{)}, the format macro now has all its
|
|
arguments, as if the user had typed:
|
|
|
|
@comment ignore
|
|
@example
|
|
format(`Result is %d', 32768)
|
|
@end example
|
|
|
|
@noindent
|
|
The format macro expands to @samp{Result is 32768}, and we have another
|
|
round of scanning for the tokens @samp{Result}, @samp{@w{ }},
|
|
@samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
|
|
@samp{8}. None of these are macros, so the final output is
|
|
|
|
@comment ignore
|
|
@example
|
|
@result{}Result is 32768
|
|
@end example
|
|
|
|
As a more complicated example, we will contrast an actual code
|
|
example from the Gnulib project@footnote{Derived from a patch in
|
|
@uref{https://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
|
|
and a followup patch in
|
|
@uref{https://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
|
|
showing both a buggy approach and the desired results. The user desires
|
|
to output a shell assignment statement that takes its argument and turns
|
|
it into a shell variable by converting it to uppercase and prepending a
|
|
prefix. The original attempt looks like this:
|
|
|
|
@example
|
|
changequote([,])dnl
|
|
define([gl_STRING_MODULE_INDICATOR],
|
|
  [
    dnl comment
    GNULIB_]translit([$1],[a-z],[A-Z])[=1
  ])dnl
  gl_STRING_MODULE_INDICATOR([strcase])
@result{}  @w{ }
@result{}        GNULIB_strcase=1
@result{}  @w{ }
|
|
@end example
|
|
|
|
Oops -- the argument did not get capitalized. And although the manual
|
|
is not able to easily show it, both lines that appear empty actually
|
|
contain two trailing spaces. By stepping through the parse, it is easy
|
|
to see what happened. First, @code{m4} sees the token
|
|
@samp{changequote}, which it recognizes as a macro, followed by
|
|
@samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
|
|
argument list. The macro expands to the empty string, but changes the
|
|
quoting characters to something more useful for generating shell code
|
|
(unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
|
|
but unbalanced @samp{[]} tend to be rare). Also in the first line,
|
|
@code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
|
|
macro that consumes the rest of the line, resulting in no output for
|
|
that line.
|
|
|
|
The second line starts a macro definition. @code{m4} sees the token
|
|
@samp{define}, which it recognizes as a macro, followed by a @samp{(},
|
|
@samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}. Because an unquoted
|
|
comma was encountered, the first argument is known to be the expansion
|
|
of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
|
|
Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
|
|
whitespace is discarded as part of argument collection. Then comes a
|
|
rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
|
|
comment@key{NL}@ @ @ @ GNULIB_]}. This is followed by the token
|
|
@samp{translit}, which @code{m4} recognizes as a macro name, so a nested
|
|
macro expansion has started.
|
|
|
|
The arguments to @code{translit} are found by the tokens @samp{(},
|
|
@samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
|
|
@samp{)}. All three string arguments are expanded (or in other words,
|
|
the quotes are stripped), and since neither @samp{$} nor @samp{1} needs
|
|
capitalization, the result of the macro is @samp{$1}. This expansion is
|
|
rescanned, resulting in the two literal characters @samp{$} and
|
|
@samp{1}.
|
|
|
|
Scanning of the outer macro resumes, and picks up with
|
|
@samp{[=1@key{NL}@ @ ]}, and finally @samp{)}. The collected pieces of
|
|
expanded text are concatenated, with the end result that the macro
|
|
@samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
|
|
@samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
|
|
Once again, @samp{dnl} is recognized and avoids a newline in the output.
|
|
|
|
The final line is then parsed, beginning with @samp{ } and @samp{ }
|
|
that are output literally. Then @samp{gl_STRING_MODULE_INDICATOR} is
|
|
recognized as a macro name, with an argument list of @samp{(},
|
|
@samp{[strcase]}, and @samp{)}. Since the definition of the macro
|
|
contains the sequence @samp{$1}, that sequence is replaced with the
|
|
argument @samp{strcase} prior to starting the rescan. The rescan sees
|
|
@samp{@key{NL}} and four spaces, which are output literally, then
|
|
@samp{dnl}, which discards the text @samp{ comment@key{NL}}. Next
|
|
comes four more spaces, also output literally, and the token
|
|
@samp{GNULIB_strcase}, which resulted from the earlier parameter
|
|
substitution. Since that is not a macro name, it is output literally,
|
|
followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
|
|
two more spaces. Finally, the original @samp{@key{NL}} seen after the
|
|
macro invocation is scanned and output literally.
|
|
|
|
Now for a corrected approach. This rearranges the use of newlines and
|
|
whitespace so that less whitespace is output (which, although harmless
|
|
to shell scripts, can be visually unappealing), and fixes the quoting
|
|
issues so that the capitalization occurs when the macro
|
|
@samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is
|
|
defined. It also adds another layer of quoting to the first argument of
|
|
@code{translit}, to ensure that the output will be rescanned as a string
|
|
rather than a potential uppercase macro name needing further expansion.
|
|
|
|
@example
|
|
changequote([,])dnl
|
|
define([gl_STRING_MODULE_INDICATOR],
|
|
  [dnl comment
  GNULIB_[]translit([[$1]], [a-z], [A-Z])=1dnl
])dnl
  gl_STRING_MODULE_INDICATOR([strcase])
@result{}    GNULIB_STRCASE=1
|
|
@end example
|
|
|
|
The parsing of the first line is unchanged. The second line sees the
|
|
name of the macro to define, then sees the discarded @samp{@key{NL}}
|
|
and two spaces, as before. But this time, the next token is
|
|
@samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([[$1]], [a-z],
|
|
[A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
|
|
@samp{)} to end the macro definition and @samp{dnl} to skip the
|
|
newline. No early expansion of @code{translit} occurs, so the entire
|
|
string becomes the definition of the macro.
|
|
|
|
The final line is then parsed, beginning with two spaces that are
|
|
output literally, and an invocation of
|
|
@code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
|
|
Again, the @samp{$1} in the macro definition is substituted prior to
|
|
rescanning. Rescanning first encounters @samp{dnl}, and discards
|
|
@samp{ comment@key{NL}}. Then two spaces are output literally. Next
|
|
comes the token @samp{GNULIB_}, but that is not a macro, so it is
|
|
output literally. The token @samp{[]} is an empty string, so it does
|
|
not affect output. Then the token @samp{translit} is encountered.
|
|
|
|
This time, the arguments to @code{translit} are parsed as @samp{(},
|
|
@samp{[[strcase]]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
|
|
@samp{[A-Z]}, and @samp{)}. The two spaces are discarded, and the
|
|
@code{translit} call yields the desired result @samp{[STRCASE]}. This is
|
|
rescanned, but since it is a string, the quotes are stripped and the
|
|
only output is a literal @samp{STRCASE}.
|
|
Then the scanner sees @samp{=} and @samp{1}, which are output
|
|
literally, followed by @samp{dnl} which discards the rest of the
|
|
definition of @code{gl_STRING_MODULE_INDICATOR}. The newline at the
|
|
end of output is the literal @samp{@key{NL}} that appeared after the
|
|
invocation of the macro.
|
|
|
|
The order in which @code{m4} expands the macros can be further explored
|
|
using the trace facilities of GNU @code{m4} (@pxref{Trace}).
|
|
|
|
@node Macros
|
|
@chapter How to invoke macros
|
|
|
|
This chapter covers macro invocation, macro arguments and how macro
|
|
expansion is treated.
|
|
|
|
@menu
|
|
* Invocation:: Macro invocation
|
|
* Inhibiting Invocation:: Preventing macro invocation
|
|
* Macro Arguments:: Macro arguments
|
|
* Quoting Arguments:: On Quoting Arguments to macros
|
|
* Macro expansion:: Expanding macros
|
|
@end menu
|
|
|
|
@node Invocation
|
|
@section Macro invocation
|
|
|
|
@cindex macro invocation
|
|
@cindex invoking macros
|
|
A macro invocation has one of the forms
|
|
|
|
@comment ignore
|
|
@example
|
|
name
|
|
@end example
|
|
|
|
@noindent
|
|
which is a macro invocation without any arguments, or
|
|
|
|
@comment ignore
|
|
@example
|
|
name(arg1, arg2, @dots{}, arg@var{n})
|
|
@end example
|
|
|
|
@noindent
|
|
which is a macro invocation with @var{n} arguments. Macros can have any
|
|
number of arguments. All arguments are strings, but different macros
|
|
might interpret the arguments in different ways.
|
|
|
|
The opening parenthesis @emph{must} follow the @var{name} directly, with
|
|
no spaces in between. If it does not, the macro is called with no
|
|
arguments at all.
|
|
|
|
For a macro call to have no arguments, the parentheses @emph{must} be
|
|
left out. The macro call
|
|
|
|
@comment ignore
|
|
@example
|
|
name()
|
|
@end example
|
|
|
|
@noindent
|
|
is a macro call with one argument, which is the empty string, not a call
|
|
with no arguments.
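
These rules can be observed with a macro that expands to its argument
count. The name @code{count} is a hypothetical helper defined only for
this sketch. Note how the space before the parenthesis in the third
call causes @code{count} to be invoked without arguments, after which
the parenthesized text is simply copied to the output with its quotes
removed:

@example
define(`count', `$#')
@result{}
count
@result{}0
count()
@result{}1
count (`a', `b')
@result{}0 (a, b)
count(`a', `b')
@result{}2
@end example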
|
|
|
|
@node Inhibiting Invocation
|
|
@section Preventing macro invocation
|
|
|
|
An innovation of the @code{m4} language, compared to some of its
|
|
predecessors (like Strachey's @code{GPM}, for example), is the ability
|
|
to recognize macro calls without resorting to any special, prefixed
|
|
invocation character. While generally useful, this feature might
|
|
sometimes be the source of spurious, unwanted macro calls. So, GNU
|
|
@code{m4} offers several mechanisms or techniques for inhibiting the
|
|
recognition of names as macro calls.
|
|
|
|
@cindex GNU extensions
|
|
@cindex blind macro
|
|
@cindex macro, blind
|
|
First of all, many builtin macros cannot meaningfully be called without
|
|
arguments. As a GNU extension, for any of these macros,
|
|
whenever an opening parenthesis does not immediately follow their name,
|
|
the builtin macro call is not triggered. This solves the most usual
|
|
cases, like for @samp{include} or @samp{eval}. Later in this document,
|
|
the sentence ``This macro is recognized only with parameters'' refers to
|
|
this specific provision of GNU M4, also known as a blind
|
|
builtin macro. For the builtins defined by POSIX that bear
|
|
this disclaimer, POSIX specifically states that invoking those
|
|
builtins without arguments is unspecified, because many other
|
|
implementations simply invoke the builtin as though it were given one
|
|
empty argument instead.
|
|
|
|
@example
|
|
$ @kbd{m4}
|
|
eval
|
|
@result{}eval
|
|
eval(`1')
|
|
@result{}1
|
|
@end example
|
|
|
|
There is also a command line option (@option{--prefix-builtins}, or
|
|
@option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
|
|
builtin macros with a prefix of @samp{m4_} at startup. The option has
|
|
no effect whatsoever on user defined macros. For example, with this option,
|
|
one has to write @code{m4_dnl} and even @code{m4_m4exit}. It also has
|
|
no effect on whether a macro requires parameters.
|
|
|
|
@comment options: -P
|
|
@example
|
|
$ @kbd{m4 -P}
|
|
eval
|
|
@result{}eval
|
|
eval(`1')
|
|
@result{}eval(1)
|
|
m4_eval
|
|
@result{}m4_eval
|
|
m4_eval(`1')
|
|
@result{}1
|
|
@end example
|
|
|
|
Another alternative is to rename problematic macros to a name less
likely to cause conflicts (@pxref{Definitions}).
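
One way to move a builtin to another name, sketched here with the
hypothetical name @code{my_eval}, is to copy its definition with
@code{defn} and then remove the original with @code{undefine}. After
that, the old name is treated as plain text:

@example
define(`my_eval', defn(`eval'))undefine(`eval')
@result{}
eval(`1 + 1')
@result{}eval(1 + 1)
my_eval(`1 + 1')
@result{}2
@end example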
|
|
|
|
If your version of GNU @code{m4} has the @code{changeword} feature
|
|
compiled in, it offers far more flexibility in specifying the
|
|
syntax of macro names, both builtin and user-defined. @xref{Changeword},
|
|
for more information on this experimental feature.
|
|
|
|
Of course, the simplest way to prevent a name from being interpreted
|
|
as a call to an existing macro is to quote it. The remainder of
|
|
this section studies a little more deeply how quoting affects macro
|
|
invocation, and how quoting can be used to inhibit macro invocation.
|
|
|
|
Even if quoting is usually done over the whole macro name, it can also
|
|
be done over only a few characters of this name (provided, of course,
|
|
that the unquoted portions are not also a macro). It is also possible
|
|
to quote the empty string, but this works only @emph{inside} the name.
|
|
For example:
|
|
|
|
@example
|
|
`divert'
|
|
@result{}divert
|
|
`d'ivert
|
|
@result{}divert
|
|
di`ver't
|
|
@result{}divert
|
|
div`'ert
|
|
@result{}divert
|
|
@end example
|
|
|
|
@noindent
|
|
all yield the string @samp{divert}. While in both:
|
|
|
|
@example
|
|
`'divert
|
|
@result{}
|
|
divert`'
|
|
@result{}
|
|
@end example
|
|
|
|
@noindent
|
|
the @code{divert} builtin macro will be called, which expands to the
|
|
empty string.
|
|
|
|
@cindex rescanning
|
|
The output of macro evaluations is always rescanned. In the following
|
|
example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as
|
|
if @code{m4}
|
|
had been given @w{@samp{substr(ab`'cde, `1', `3')}} as input:
|
|
|
|
@example
|
|
define(`cde', `CDE')
|
|
@result{}
|
|
define(`x', `substr(ab')
|
|
@result{}
|
|
define(`y', `cde, `1', `3')')
|
|
@result{}
|
|
x`'y
|
|
@result{}bCD
|
|
@end example
|
|
|
|
@ignore
|
|
@comment Similar, but with argument references, to ensure good test
|
|
@comment coverage.
|
|
@example
|
|
define(`x1', `len(`$1'')
|
|
@result{}
|
|
define(`y1', ``$1')')
|
|
@result{}
|
|
x1(`01234567890123456789')y1(`98765432109876543210')
|
|
@result{}40
|
|
@end example
|
|
@end ignore
|
|
|
|
Unquoted strings on either side of a quoted string are subject to
|
|
being recognized as macro names. In the following example, quoting the
|
|
empty string allows for the second @code{macro} to be recognized as such:
|
|
|
|
@example
|
|
define(`macro', `m')
|
|
@result{}
|
|
macro(`m')macro
|
|
@result{}mmacro
|
|
macro(`m')`'macro
|
|
@result{}mm
|
|
@end example
|
|
|
|
Quoting may prevent recognizing as a macro name the concatenation of a
|
|
macro expansion with the surrounding characters. In this example:
|
|
|
|
@example
|
|
define(`macro', `di$1')
|
|
@result{}
|
|
macro(`v')`ert'
|
|
@result{}divert
|
|
macro(`v')ert
|
|
@result{}
|
|
@end example
|
|
|
|
@noindent
|
|
the first input produces the string @samp{divert}. When the quotes are
removed, the @code{divert} builtin is called instead.
|
|
|
|
@node Macro Arguments
|
|
@section Macro arguments
|
|
|
|
@cindex macros, arguments to
|
|
@cindex arguments to macros
|
|
When a name is seen, and it has a macro definition, it will be expanded
|
|
as a macro.
|
|
|
|
If the name is followed by an opening parenthesis, the arguments will be
|
|
collected before the macro is called. If too few arguments are
|
|
supplied, the missing arguments are taken to be the empty string.
|
|
However, some builtins are documented to behave differently for a
|
|
missing optional argument than for an explicit empty string. If there
|
|
are too many arguments, the excess arguments are ignored. Unquoted
|
|
leading whitespace is stripped off all arguments, but whitespace
|
|
generated by a macro expansion or occurring after a macro that expanded
|
|
to an empty string remains intact. Whitespace includes space, tab,
|
|
newline, carriage return, vertical tab, and formfeed.
|
|
|
|
@example
|
|
define(`macro', `$1')
|
|
@result{}
|
|
macro( unquoted leading space lost)
|
|
@result{}unquoted leading space lost
|
|
macro(` quoted leading space kept')
|
|
@result{} quoted leading space kept
|
|
macro(
|
|
divert `unquoted space kept after expansion')
|
|
@result{} unquoted space kept after expansion
|
|
macro(macro(`
|
|
')`whitespace from expansion kept')
|
|
@result{}
|
|
@result{}whitespace from expansion kept
|
|
macro(`unquoted trailing whitespace kept'
|
|
)
|
|
@result{}unquoted trailing whitespace kept
|
|
@result{}
|
|
@end example
|
|
|
|
@cindex warnings, suppressing
|
|
@cindex suppressing warnings
|
|
Normally @code{m4} will issue warnings if a builtin macro is called
with an inappropriate number of arguments, but these warnings can be
suppressed with the @option{--quiet} command line option (or
@option{--silent}, or @option{-Q}, @pxref{Operation modes, , Invoking m4}).
For user-defined macros, there is no check of the number of arguments
given.
|
|
|
|
@example
|
|
$ @kbd{m4}
|
|
index(`abc')
|
|
@error{}m4:stdin:1: Warning: too few arguments to builtin `index'
|
|
@result{}0
|
|
index(`abc',)
|
|
@result{}0
|
|
index(`abc', `b', `ignored')
|
|
@error{}m4:stdin:3: Warning: excess arguments to builtin `index' ignored
|
|
@result{}1
|
|
@end example
|
|
|
|
@comment options: -Q
|
|
@example
|
|
$ @kbd{m4 -Q}
|
|
index(`abc')
|
|
@result{}0
|
|
index(`abc',)
|
|
@result{}0
|
|
index(`abc', `b', `ignored')
|
|
@result{}1
|
|
@end example
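
User-defined macros, by contrast, are never checked. As a brief sketch
(the macro name @code{pair} is invented for this illustration), such a
macro silently treats missing arguments as empty and discards excess
ones:

@example
define(`pair', `[$1][$2]')
@result{}
pair(`a')
@result{}[a][]
pair(`a', `b', `c')
@result{}[a][b]
@end example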
|
|
|
|
Macros are expanded normally during argument collection, and any
commas, quotes and parentheses that show up in the resulting
expanded text will serve to define the arguments as well. Thus, if
|
|
@var{foo} expands to @samp{, b, c}, the macro call
|
|
|
|
@comment ignore
|
|
@example
|
|
bar(a foo, d)
|
|
@end example
|
|
|
|
@noindent
|
|
is a macro call with four arguments, which are @samp{a }, @samp{b},
|
|
@samp{c} and @samp{d}. To understand why the first argument contains
|
|
whitespace, remember that unquoted leading whitespace is never part
|
|
of an argument, but trailing whitespace always is.
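
The same call can be checked directly. In this sketch, the definitions
of @code{foo} and @code{bar} are assumptions chosen only to make the
argument boundaries visible:

@example
define(`foo', `, b, c')
@result{}
define(`bar', `$# arguments: [$1] [$2] [$3] [$4]')
@result{}
bar(a foo, d)
@result{}4 arguments: [a ] [b] [c] [d]
@end example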
|
|
|
|
It is possible for a macro's definition to change during argument
|
|
collection, in which case the expansion uses the definition that was in
|
|
effect at the time the opening @samp{(} was seen.
|
|
|
|
@example
|
|
define(`f', `1')
|
|
@result{}
|
|
f(define(`f', `2'))
|
|
@result{}1
|
|
f
|
|
@result{}2
|
|
@end example
|
|
|
|
It is an error if the end of file occurs while collecting arguments.
|
|
|
|
@comment status: 1
|
|
@example
|
|
hello world
|
|
@result{}hello world
|
|
define(
|
|
^D
|
|
@error{}m4:stdin:2: ERROR: end of file in argument list
|
|
@end example
|
|
|
|
@node Quoting Arguments
|
|
@section On Quoting Arguments to macros
|
|
|
|
@cindex quoted macro arguments
|
|
@cindex macros, quoted arguments to
|
|
@cindex arguments, quoted macro
|
|
Each argument has unquoted leading whitespace removed. Within each
|
|
argument, all unquoted parentheses must match. For example, if
|
|
@var{foo} is a macro,
|
|
|
|
@comment ignore
|
|
@example
|
|
foo(() (`(') `(')
|
|
@end example
|
|
|
|
@noindent
|
|
is a macro call, with one argument, whose value is @samp{() (() (}.
|
|
Commas separate arguments, except when they occur inside quotes,
|
|
comments, or unquoted parentheses. @xref{Pseudo Arguments}, for
|
|
examples.
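
To see the parenthesis rule in action, the following sketch gives
@code{foo} a definition (assumed here purely for illustration) that
brackets its first argument:

@example
define(`foo', `[$1]')
@result{}
foo(() (`(') `(')
@result{}[() (() (]
@end example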
|
|
|
|
It is common practice to quote all arguments to macros, unless you are
|
|
sure you want the arguments expanded. Thus, in the above
|
|
example with the parentheses, the `right' way to do it is like this:
|
|
|
|
@comment ignore
|
|
@example
|
|
foo(`() (() (')
|
|
@end example
|
|
|
|
@cindex quoting rule of thumb
|
|
@cindex rule of thumb, quoting
|
|
It is, however, in certain cases necessary (because nested expansion
|
|
must occur to create the arguments for the outer macro) or convenient
|
|
(because it uses fewer characters) to leave out quotes for some
|
|
arguments, and there is nothing wrong in doing it. It just makes life a
|
|
bit harder, if you are not careful to follow a consistent quoting style.
|
|
For consistency, this manual follows the rule of thumb that each layer
|
|
of parentheses introduces another layer of single quoting, except when
|
|
showing the consequences of quoting rules. This is done even when the
|
|
quoted string cannot be a macro, such as with integers when you have not
|
|
changed the syntax via @code{changeword} (@pxref{Changeword}).
|
|
|
|
The quoting rule of thumb of one level of quoting per layer of parentheses has a
|
|
nice property: when a macro name appears inside parentheses, you can
|
|
determine when it will be expanded. If it is not quoted, it will be
|
|
expanded prior to the outer macro, so that its expansion becomes the
|
|
argument. If it is single-quoted, it will be expanded after the outer
|
|
macro. And if it is double-quoted, it will be used as literal text
|
|
instead of a macro name.
|
|
|
|
@example
|
|
define(`active', `ACT, IVE')
|
|
@result{}
|
|
define(`show', `$1 $1')
|
|
@result{}
|
|
show(active)
|
|
@result{}ACT ACT
|
|
show(`active')
|
|
@result{}ACT, IVE ACT, IVE
|
|
show(``active'')
|
|
@result{}active active
|
|
@end example
|
|
|
|
@node Macro expansion
|
|
@section Macro expansion
|
|
|
|
@cindex macros, expansion of
|
|
@cindex expansion of macros
|
|
When the arguments, if any, to a macro call have been collected, the
|
|
macro is expanded, and the expansion text is pushed back onto the input
|
|
(unquoted), and reread. The expansion text from one macro call might
|
|
therefore result in more macros being called, if the calls are included,
|
|
completely or partially, in the first macro call's expansion.
|
|
|
|
Taking a very simple example, if @var{foo} expands to @samp{bar}, and
|
|
@var{bar} expands to @samp{Hello}, the input
|
|
|
|
@comment options: -Dbar=Hello -Dfoo=bar
|
|
@example
|
|
$ @kbd{m4 -Dbar=Hello -Dfoo=bar}
|
|
foo
|
|
@result{}Hello
|
|
@end example
|
|
|
|
@noindent
|
|
will expand first to @samp{bar}, and when this is reread and
|
|
expanded, into @samp{Hello}.
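
Rescanning also applies when the expansion contains a complete macro
call with arguments. A minimal sketch, with macro names and text made
up for the illustration:

@example
define(`foo', `bar(`$1')')
@result{}
define(`bar', `Hello, $1!')
@result{}
foo(`world')
@result{}Hello, world!
@end example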
|
|
|
|
@ignore
|
|
@comment not worth documenting, but test that the command line can
|
|
@comment define macros that take parameters
|
|
|
|
@comment options: -Dfoo -Decho=$@
|
|
@example
|
|
$ @kbd{m4 -Dfoo -Decho='$@'}
|
|
foo
|
|
@result{}
|
|
foo(`silently ignored')
|
|
@result{}
|
|
echo(`1', `2')
|
|
@result{}1,2
|
|
@end example
|
|
@end ignore
|
|
|
|
@node Definitions
|
|
@chapter How to define new macros
|
|
|
|
@cindex macros, how to define new
|
|
@cindex defining new macros
|
|
Macros can be defined, redefined and deleted in several different ways.
|
|
Also, it is possible to redefine a macro without losing a previous
|
|
value, and bring back the original value at a later time.
|
|
|
|
@menu
|
|
* Define:: Defining a new macro
|
|
* Arguments:: Arguments to macros
|
|
* Pseudo Arguments:: Special arguments to macros
|
|
* Undefine:: Deleting a macro
|
|
* Defn:: Renaming macros
|
|
* Pushdef:: Temporarily redefining macros
|
|
|
|
* Indir:: Indirect call of macros
|
|
* Builtin:: Indirect call of builtins
|
|
@end menu
|
|
|
|
@node Define
|
|
@section Defining a macro
|
|
|
|
The normal way to define or redefine macros is to use the builtin
|
|
@code{define}:
|
|
|
|
@deffn Builtin define (@var{name}, @ovar{expansion})
|
|
Defines @var{name} to expand to @var{expansion}. If
|
|
@var{expansion} is not given, it is taken to be empty.
|
|
|
|
The expansion of @code{define} is void.
|
|
The macro @code{define} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
The following example defines the macro @var{foo} to expand to the text
|
|
@samp{Hello world.}.
|
|
|
|
@example
|
|
define(`foo', `Hello world.')
|
|
@result{}
|
|
foo
|
|
@result{}Hello world.
|
|
@end example
|
|
|
|
The empty line in the output is there because the newline is not
|
|
a part of the macro definition, and it is consequently copied to
|
|
the output. This can be avoided by use of the macro @code{dnl}.
|
|
@xref{Dnl}, for details.
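
For instance, appending @code{dnl} to the line that defines the macro
discards the trailing newline, so in this small sketch the definition
leaves no blank line in the output:

@example
define(`foo', `Hello world.')dnl
foo
@result{}Hello world.
@end example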
|
|
|
|
The first argument to @code{define} should be quoted; otherwise, if the
|
|
macro is already defined, you will be defining a different macro. This
|
|
example shows the problems with underquoting, since we did not want to
|
|
redefine @code{one}:
|
|
|
|
@example
|
|
define(foo, one)
|
|
@result{}
|
|
define(foo, two)
|
|
@result{}
|
|
one
|
|
@result{}two
|
|
@end example
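
With both arguments quoted, the same sequence behaves as one would
expect; this sketch redefines @code{foo} and leaves @code{one} alone:

@example
define(`foo', `one')
@result{}
define(`foo', `two')
@result{}
one
@result{}one
foo
@result{}two
@end example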
|
|
|
|
@cindex GNU extensions
|
|
GNU @code{m4} normally replaces only the @emph{topmost}
|
|
definition of a macro if it has several definitions from @code{pushdef}
|
|
(@pxref{Pushdef}). Some other implementations of @code{m4} replace all
|
|
definitions of a macro with @code{define}. @xref{Incompatibilities},
|
|
for more details.
|
|
|
|
As a GNU extension, the first argument to @code{define} does
|
|
not have to be a simple word.
|
|
It can be any text string, even the empty string. A macro with a
|
|
non-standard name cannot be invoked in the normal way, as the name is
|
|
not recognized. It can only be referenced by the builtins @code{indir}
|
|
(@pxref{Indir}) and @code{defn} (@pxref{Defn}).
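
A short sketch, using the made-up name @samp{my-macro}, which contains a
character that cannot appear in a normal macro name:

@example
define(`my-macro', `hidden text')
@result{}
my-macro
@result{}my-macro
indir(`my-macro')
@result{}hidden text
defn(`my-macro')
@result{}hidden text
@end example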
|
|
|
|
@cindex arrays
|
|
Arrays and associative arrays can be simulated by using non-standard
|
|
macro names.
|
|
|
|
@deffn Composite array (@var{index})
|
|
@deffnx Composite array_set (@var{index}, @ovar{value})
|
|
Provide access to entries within an array. @code{array} reads the entry
|
|
at location @var{index}, and @code{array_set} assigns @var{value} to
|
|
location @var{index}.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`array', `defn(format(``array[%d]'', `$1'))')
|
|
@result{}
|
|
define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
|
|
@result{}
|
|
array_set(`4', `array element no. 4')
|
|
@result{}
|
|
array_set(`17', `array element no. 17')
|
|
@result{}
|
|
array(`4')
|
|
@result{}array element no. 4
|
|
array(eval(`10 + 7'))
|
|
@result{}array element no. 17
|
|
@end example
|
|
|
|
Change the @samp{%d} to @samp{%s} and it is an associative array.
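
A sketch of that variant follows; the names @code{assoc} and
@code{assoc_set} and the sample keys are chosen only for this
illustration:

@example
define(`assoc', `defn(format(``assoc[%s]'', `$1'))')
@result{}
define(`assoc_set', `define(format(``assoc[%s]'', `$1'), `$2')')
@result{}
assoc_set(`apple', `red')
@result{}
assoc_set(`banana', `yellow')
@result{}
assoc(`banana')
@result{}yellow
assoc(`apple')
@result{}red
@end example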
|
|
|
|
@node Arguments
|
|
@section Arguments to macros
|
|
|
|
@cindex macros, arguments to
|
|
@cindex arguments to macros
|
|
Macros can have arguments. The @var{n}th argument is denoted by
|
|
@code{$n} in the expansion text, and is replaced by the @var{n}th actual
|
|
argument, when the macro is expanded. Replacement of arguments happens
|
|
before rescanning, regardless of how many nesting levels of quoting
|
|
appear in the expansion. Here is an example of a macro with
|
|
two arguments.
|
|
|
|
@deffn Composite exch (@var{arg1}, @var{arg2})
|
|
Expands to @var{arg2} followed by @var{arg1}, effectively exchanging
|
|
their order.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`exch', `$2, $1')
|
|
@result{}
|
|
exch(`arg1', `arg2')
|
|
@result{}arg2, arg1
|
|
@end example
|
|
|
|
This can be used, for example, if you like the arguments to
|
|
@code{define} to be reversed.
|
|
|
|
@example
|
|
define(`exch', `$2, $1')
|
|
@result{}
|
|
define(exch(``expansion text'', ``macro''))
|
|
@result{}
|
|
macro
|
|
@result{}expansion text
|
|
@end example
|
|
|
|
@xref{Quoting Arguments}, for an explanation of the double quotes.
|
|
(You should try to improve this example so that clients of @code{exch}
|
|
do not have to double quote; or @pxref{Improved exch, , Answers}).
|
|
|
|
As a special case, the zeroth argument, @code{$0}, is always the name
|
|
of the macro being expanded.
|
|
|
|
@example
|
|
define(`test', ``Macro name: $0'')
|
|
@result{}
|
|
test
|
|
@result{}Macro name: test
|
|
@end example
|
|
|
|
If you want quoted text to appear as part of the expansion text,
|
|
remember that quotes can be nested in quoted strings. Thus, in
|
|
|
|
@example
|
|
define(`foo', `This is macro `foo'.')
|
|
@result{}
|
|
foo
|
|
@result{}This is macro foo.
|
|
@end example
|
|
|
|
@noindent
|
|
the @samp{foo} in the expansion text is @emph{not} expanded, since it is
|
|
a quoted string, and not a name.
|
|
|
|
@cindex GNU extensions
|
|
@cindex nine arguments, more than
|
|
@cindex more than nine arguments
|
|
@cindex arguments, more than nine
|
|
@cindex positional parameters, more than nine
|
|
GNU @code{m4} allows the number following the @samp{$} to
|
|
consist of one or more digits, allowing macros to have any number of
|
|
arguments. The extension of accepting multiple digits is incompatible
|
|
with POSIX, and differs from traditional implementations
of @code{m4}, which recognize only one digit. Therefore, future
|
|
versions of GNU M4 will phase out this feature. To portably
|
|
access beyond the ninth argument, you can use the @code{argn} macro
|
|
documented later (@pxref{Shift}).
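
As a small sketch of the current GNU behavior (the macro name
@code{tenth} is made up for this illustration), @samp{$10} refers to the
tenth argument; an implementation that recognizes only one digit would
instead substitute the first argument and follow it with a literal
@samp{0}:

@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
@end example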
|
|
|
|
POSIX also states that @samp{$} followed immediately by
|
|
@samp{@{} in a macro definition is implementation-defined. This version
|
|
of M4 passes the literal characters @samp{$@{} through unchanged, but M4
|
|
2.0 will implement an optional feature similar to @command{sh}, where
|
|
@samp{$@{11@}} expands to the eleventh argument, to replace the current
|
|
recognition of @samp{$11}. Meanwhile, if you want to guarantee that you
|
|
will get a literal @samp{$@{} in output when expanding a macro, even
|
|
when you upgrade to M4 2.0, you can use nested quoting to your
|
|
advantage:
|
|
|
|
@example
|
|
define(`foo', `single quoted $`'@{1@} output')
|
|
@result{}
|
|
define(`bar', ``double quoted $'`@{2@} output'')
|
|
@result{}
|
|
foo(`a', `b')
|
|
@result{}single quoted $@{1@} output
|
|
bar(`a', `b')
|
|
@result{}double quoted $@{2@} output
|
|
@end example
|
|
|
|
To help you detect places in your M4 input files whose behavior might
change under M4 2.0, you can use the
|
|
@option{--warn-macro-sequence} command-line option (@pxref{Operation
|
|
modes, , Invoking m4}) with the default regular expression. This will
|
|
add a warning any time a macro definition includes @samp{$} followed by
|
|
multiple digits, or by @samp{@{}. The warning is not enabled by
|
|
default, because it triggers a number of warnings in Autoconf 2.61 (and
|
|
Autoconf uses @option{-E} to treat warnings as errors), and because it
|
|
will still be possible to restore older behavior in M4 2.0.
|
|
|
|
@comment options: --warn-macro-sequence
|
|
@example
|
|
$ @kbd{m4 --warn-macro-sequence}
|
|
define(`foo', `$001 $@{1@} $1')
|
|
@error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$001'
|
|
@error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$@{1@}'
|
|
@result{}
|
|
foo(`bar')
|
|
@result{}bar $@{1@} bar
|
|
@end example
|
|
|
|
@node Pseudo Arguments
|
|
@section Special arguments to macros
|
|
|
|
@cindex special arguments to macros
|
|
@cindex macros, special arguments to
|
|
@cindex arguments to macros, special
|
|
There is a special notation for the number of actual arguments supplied,
|
|
and for all the actual arguments.
|
|
|
|
The number of actual arguments in a macro call is denoted by @code{$#}
|
|
in the expansion text.
|
|
|
|
@deffn Composite nargs (@dots{})
|
|
Expands to a count of the number of arguments supplied.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`nargs', `$#')
|
|
@result{}
|
|
nargs
|
|
@result{}0
|
|
nargs()
|
|
@result{}1
|
|
nargs(`arg1', `arg2', `arg3')
|
|
@result{}3
|
|
nargs(`commas can be quoted, like this')
|
|
@result{}1
|
|
nargs(arg1#inside comments, commas do not separate arguments
|
|
still arg1)
|
|
@result{}1
|
|
nargs((unquoted parentheses, like this, group arguments))
|
|
@result{}1
|
|
@end example
|
|
|
|
Remember that @samp{#} defaults to the comment character; if you forget
|
|
quotes to inhibit the comment behavior, your macro definition may not
|
|
end where you expected.
|
|
|
|
@example
|
|
dnl Attempt to define a macro to just `$#'
|
|
define(underquoted, $#)
|
|
oops)
|
|
@result{}
|
|
underquoted
|
|
@result{}0)
|
|
@result{}oops
|
|
@end example
|
|
|
|
The notation @code{$*} can be used in the expansion text to denote all
|
|
the actual arguments, unquoted, with commas in between. For example:
|
|
|
|
@example
|
|
define(`echo', `$*')
|
|
@result{}
|
|
echo(arg1, arg2, arg3 , arg4)
|
|
@result{}arg1,arg2,arg3 ,arg4
|
|
@end example
|
|
|
|
Often each argument should be quoted, and the notation @code{$@@} handles
|
|
that. It is just like @code{$*}, except that it quotes each argument.
|
|
A simple example of that is:
|
|
|
|
@example
|
|
define(`echo', `$@@')
|
|
@result{}
|
|
echo(arg1, arg2, arg3 , arg4)
|
|
@result{}arg1,arg2,arg3 ,arg4
|
|
@end example
|
|
|
|
Where did the quotes go? They were stripped when the expanded
text was reread by @code{m4}. To show the difference, try
|
|
|
|
@example
|
|
define(`echo1', `$*')
|
|
@result{}
|
|
define(`echo2', `$@@')
|
|
@result{}
|
|
define(`foo', `This is macro `foo'.')
|
|
@result{}
|
|
echo1(foo)
|
|
@result{}This is macro This is macro foo..
|
|
echo1(`foo')
|
|
@result{}This is macro foo.
|
|
echo2(foo)
|
|
@result{}This is macro foo.
|
|
echo2(`foo')
|
|
@result{}foo
|
|
@end example
|
|
|
|
@noindent
|
|
@xref{Trace}, if you do not understand this. As another example of the
|
|
difference, remember that comments encountered in arguments are passed
|
|
untouched to the macro, and that quoting disables comments.
|
|
|
|
@example
|
|
define(`echo1', `$*')
|
|
@result{}
|
|
define(`echo2', `$@@')
|
|
@result{}
|
|
define(`foo', `bar')
|
|
@result{}
|
|
echo1(#foo'foo
|
|
foo)
|
|
@result{}#foo'foo
|
|
@result{}bar
|
|
echo2(#foo'foo
|
|
foo)
|
|
@result{}#foobar
|
|
@result{}bar'
|
|
@end example
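
The difference matters most when one macro forwards its arguments to
another. In this sketch (all three macro names are invented for the
illustration), @code{$*} lets a quoted comma split an argument on
rescanning, while @code{$@@} preserves the original argument
boundaries:

@example
define(`count', `$#')
@result{}
define(`forward_star', `count($*)')
@result{}
define(`forward_at', `count($@@)')
@result{}
forward_star(`a,b', `c')
@result{}3
forward_at(`a,b', `c')
@result{}2
@end example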
|
|
|
|
@ignore
|
|
@comment Not worth putting in the manual, but this example is needed for
|
|
@comment good test coverage of copying large strings across recursion
|
|
@comment levels.
|
|
|
|
@example
|
|
define(`echo', `$@@')dnl
|
|
echo(echo(`01234567890123456789', `01234567890123456789')
|
|
echo(`98765432109876543210', `98765432109876543210'))
|
|
@result{}01234567890123456789,01234567890123456789
|
|
@result{}98765432109876543210,98765432109876543210
|
|
len((echo(`01234567890123456789',
|
|
`01234567890123456789')echo(`98765432109876543210',
|
|
`98765432109876543210')))
|
|
@result{}84
|
|
indir(`echo', indir(`echo', `01234567890123456789',
|
|
`01234567890123456789')
|
|
indir(`echo', `98765432109876543210', `98765432109876543210'))
|
|
@result{}01234567890123456789,01234567890123456789
|
|
@result{}98765432109876543210,98765432109876543210
|
|
define(`argn', `$#')dnl
|
|
define(`echo1', `-$@@-')define(`echo2', `,$@@,')dnl
|
|
echo1(`1', `2', `3') argn(echo1(`1', `2', `3'))
|
|
@result{}-1,2,3- 3
|
|
echo2(`1', `2', `3') argn(echo2(`1', `2', `3'))
|
|
@result{},1,2,3, 5
|
|
@end example
|
|
@end ignore
|
|
|
|
A @samp{$} sign in the expansion text, that is not followed by anything
|
|
@code{m4} understands, is simply copied to the macro expansion, as any
|
|
other text is.
|
|
|
|
@example
|
|
define(`foo', `$$$ hello $$$')
|
|
@result{}
|
|
foo
|
|
@result{}$$$ hello $$$
|
|
@end example
|
|
|
|
@cindex rescanning
|
|
@cindex literal output
|
|
@cindex output, literal
|
|
If you want a macro to expand to something like @samp{$12}, the
|
|
judicious use of nested quoting can put a safe character between the
|
|
@code{$} and the next character, relying on the rescanning to remove the
|
|
nested quote. This will prevent @code{m4} from interpreting the
|
|
@code{$} sign as a reference to an argument.
|
|
|
|
@example
|
|
define(`foo', `no nested quote: $1')
|
|
@result{}
|
|
foo(`arg')
|
|
@result{}no nested quote: arg
|
|
define(`foo', `nested quote around $: `$'1')
|
|
@result{}
|
|
foo(`arg')
|
|
@result{}nested quote around $: $1
|
|
define(`foo', `nested empty quote after $: $`'1')
|
|
@result{}
|
|
foo(`arg')
|
|
@result{}nested empty quote after $: $1
|
|
define(`foo', `nested quote around next character: $`1'')
|
|
@result{}
|
|
foo(`arg')
|
|
@result{}nested quote around next character: $1
|
|
define(`foo', `nested quote around both: `$1'')
|
|
@result{}
|
|
foo(`arg')
|
|
@result{}nested quote around both: arg
|
|
@end example
|
|
|
|
@node Undefine
|
|
@section Deleting a macro
|
|
|
|
@cindex macros, how to delete
|
|
@cindex deleting macros
|
|
@cindex undefining macros
|
|
A macro definition can be removed with @code{undefine}:
|
|
|
|
@deffn Builtin undefine (@var{name}@dots{})
|
|
For each argument, remove the macro @var{name}. The macro names must
be quoted, since they will be expanded otherwise.
|
|
|
|
The expansion of @code{undefine} is void.
|
|
The macro @code{undefine} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
foo bar blah
|
|
@result{}foo bar blah
|
|
define(`foo', `some')define(`bar', `other')define(`blah', `text')
|
|
@result{}
|
|
foo bar blah
|
|
@result{}some other text
|
|
undefine(`foo')
|
|
@result{}
|
|
foo bar blah
|
|
@result{}foo other text
|
|
undefine(`bar', `blah')
|
|
@result{}
|
|
foo bar blah
|
|
@result{}foo bar blah
|
|
@end example
|
|
|
|
Undefining a macro inside that macro's expansion is safe; the macro
|
|
still expands to the definition that was in effect at the @samp{(}.
|
|
|
|
@example
|
|
define(`f', ``$0':$1')
|
|
@result{}
|
|
f(f(f(undefine(`f')`hello world')))
|
|
@result{}f:f:f:hello world
|
|
f(`bye')
|
|
@result{}f(bye)
|
|
@end example
|
|
|
|
@ignore
|
|
@comment This example is not worth putting in the manual, but triggers a
|
|
@comment memory corruption regression during tracing in 1.4.19.
|
|
|
|
@example
|
|
define(`a', `popdef(`a')1')pushdef(`a', `2$*')dnl
|
|
debugmode(`t')a(popdef(`a')a)
|
|
@error{}m4trace: -2- popdef
|
|
@error{}m4trace: -2- a
|
|
@error{}m4trace: -2- popdef
|
|
@error{}m4trace: -1- a
|
|
@result{}21
|
|
@end example
|
|
@end ignore
|
|
|
|
It is not an error for @var{name} to have no macro definition. In that
|
|
case, @code{undefine} does nothing.
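
The following sketch (the macro names are arbitrary) ties this to the
quoting requirement: an unquoted name is expanded before
@code{undefine} sees it, so the first call tries to remove the
nonexistent macro @code{text} and silently does nothing, whereas the
quoted call removes @code{foo}:

@example
define(`foo', `bar')define(`bar', `text')
@result{}
undefine(foo)
@result{}
foo
@result{}text
undefine(`foo')
@result{}
foo
@result{}foo
@end example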
|
|
|
|
@node Defn
|
|
@section Renaming macros
|
|
|
|
@cindex macros, how to rename
|
|
@cindex renaming macros
|
|
@cindex macros, displaying definitions
|
|
@cindex definitions, displaying macro
|
|
It is possible to rename an already defined macro. To do this, you need
|
|
the builtin @code{defn}:
|
|
|
|
@deffn Builtin defn (@var{name}@dots{})
|
|
Expands to the @emph{quoted definition} of each @var{name}. If an
|
|
argument is not a defined macro, the expansion for that argument is
|
|
empty.
|
|
|
|
If @var{name} is a user-defined macro, the quoted definition is simply
|
|
the quoted expansion text. If, instead, there is only one @var{name}
|
|
and it is a builtin, the
|
|
expansion is a special token, which points to the builtin's internal
|
|
definition. This token is only meaningful as the second argument to
|
|
@code{define} (and @code{pushdef}), and is silently converted to an
|
|
empty string in most other contexts. When defining a macro, combining a
|
|
builtin with anything else is not supported; a warning is issued and the
|
|
builtin is omitted from the final definition (other implementations may
|
|
do other unexpected things).
|
|
|
|
The macro @code{defn} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
Its normal use is best understood through an example, which shows how to
|
|
rename @code{undefine} to @code{zap}:
|
|
|
|
@example
|
|
define(`zap', defn(`undefine'))
|
|
@result{}
|
|
zap(`undefine')
|
|
@result{}
|
|
undefine(`zap')
|
|
@result{}undefine(zap)
|
|
@end example
|
|
|
|
In this way, @code{defn} can be used to copy macro definitions, and also
|
|
definitions of builtin macros. Even if the original macro is removed,
|
|
the other name can still be used to access the definition.
|
|
|
|
The fact that macro definitions can be transferred also explains why you
|
|
should use @code{$0}, rather than retyping a macro's name in its
|
|
definition:
|
|
|
|
@example
|
|
define(`foo', `This is `$0'')
|
|
@result{}
|
|
define(`bar', defn(`foo'))
|
|
@result{}
|
|
bar
|
|
@result{}This is bar
|
|
@end example
|
|
|
|
Macros used as string variables should be referenced through @code{defn},
|
|
to avoid unwanted expansion of the text:
|
|
|
|
@example
|
|
define(`string', `The macro dnl is very useful
|
|
')
|
|
@result{}
|
|
string
|
|
@result{}The macro@w{ }
|
|
defn(`string')
|
|
@result{}The macro dnl is very useful
|
|
@result{}
|
|
@end example
|
|
|
|
@cindex rescanning
|
|
However, it is important to remember that @code{m4} rescanning is purely
|
|
textual. If an unbalanced end-quote string occurs in a macro
|
|
definition, the rescan will see that embedded quote as the termination
|
|
of the quoted string, and the remainder of the macro's definition will
|
|
be rescanned unquoted. Thus it is a good idea to avoid unbalanced
|
|
end-quotes in macro definitions or arguments to macros.
|
|
|
|
@example
|
|
define(`foo', a'a)
|
|
@result{}
|
|
define(`a', `A')
|
|
@result{}
|
|
define(`echo', `$@@')
|
|
@result{}
|
|
foo
|
|
@result{}A'A
|
|
defn(`foo')
|
|
@result{}aA'
|
|
echo(foo)
|
|
@result{}AA'
|
|
@end example
|
|
|
|
On the other hand, it is possible to exploit the fact that @code{defn}
|
|
can concatenate multiple macros prior to the rescanning phase, in order
|
|
to join the definitions of macros that, in isolation, have unbalanced
|
|
quotes. This is particularly useful when one has used several macros to
|
|
accumulate text that M4 should rescan as a whole. In the example below,
|
|
note how the use of @code{defn} on @code{l} in isolation opens a string,
|
|
which is not closed until the next line; but used on @code{l} and
|
|
@code{r} together results in nested quoting.
|
|
|
|
@example
|
|
define(`l', `<[>')define(`r', `<]>')
|
|
@result{}
|
|
changequote(`[', `]')
|
|
@result{}
|
|
defn([l])defn([r])
|
|
])
|
|
@result{}<[>]defn([r])
|
|
@result{})
|
|
defn([l], [r])
|
|
@result{}<[>][<]>
|
|
@end example
|
|
|
|
@cindex builtins, special tokens
|
|
@cindex tokens, builtin macro
|
|
Using @code{defn} to generate special tokens for builtin macros outside
|
|
of expected contexts can sometimes trigger warnings. But most of the
|
|
time, such tokens are silently converted to the empty string.
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
defn(`defn')
|
|
@result{}
|
|
define(defn(`divnum'), `cannot redefine a builtin token')
|
|
@error{}m4:stdin:2: Warning: define: invalid macro name ignored
|
|
@result{}
|
|
divnum
|
|
@result{}0
|
|
len(defn(`divnum'))
|
|
@result{}0
|
|
eval(`1'defn(`divnum')) eval(defn(`divnum')8)
|
|
@result{}1 8
|
|
ifelse(defn(`divnum'), defn(`define'), `indistinguishable')
|
|
@result{}indistinguishable
|
|
@end example
|
|
|
|
Also note that @code{defn} with multiple arguments can only join text
|
|
macros, not builtins, although a future version of GNU M4 may
|
|
lift this restriction.
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
define(`a', `A')define(`AA', `b')
|
|
@result{}
|
|
traceon(`defn', `define')
|
|
@result{}
|
|
defn(`a', `divnum', `a')
|
|
@error{}m4:stdin:3: Warning: cannot concatenate builtin `divnum'
|
|
@error{}m4trace: -1- defn(`a', `divnum', `a') -> ``A'`A''
|
|
@result{}AA
|
|
define(`mydivnum', defn(`divnum', `divnum'))mydivnum
|
|
@error{}m4:stdin:4: Warning: cannot concatenate builtin `divnum'
|
|
@error{}m4:stdin:4: Warning: cannot concatenate builtin `divnum'
|
|
@error{}m4trace: -2- defn(`divnum', `divnum')
|
|
@error{}m4trace: -1- define(`mydivnum', `')
|
|
@result{}
|
|
define(`mydivnum+', defn(`divnum')`+'defn(`divnum'))mydivnum+
|
|
@error{}m4trace: -2- defn(`divnum')
|
|
@error{}m4trace: -2- defn(`divnum')
|
|
@error{}m4:stdin:5: Warning: cannot concatenate builtin tokens
|
|
@error{}m4trace: -1- define(`mydivnum+', `+')
|
|
@result{}+
|
|
traceoff(`defn', `define')
|
|
@result{}
|
|
@end example
|
|
|
|
@node Pushdef
|
|
@section Temporarily redefining macros
|
|
|
|
@cindex macros, temporary redefinition of
|
|
@cindex temporary redefinition of macros
|
|
@cindex redefinition of macros, temporary
|
|
@cindex definition stack
|
|
@cindex pushdef stack
|
|
@cindex stack, macro definition
|
|
It is possible to redefine a macro temporarily, reverting to the
|
|
previous definition at a later time. This is done with the builtins
|
|
@code{pushdef} and @code{popdef}:
|
|
|
|
@deffn Builtin pushdef (@var{name}, @ovar{expansion})
|
|
@deffnx Builtin popdef (@var{name}@dots{})
|
|
Analogous to @code{define} and @code{undefine}.
|
|
|
|
These macros work in a stack-like fashion. A macro is temporarily
|
|
redefined with @code{pushdef}, which replaces an existing definition of
|
|
@var{name}, while saving the previous definition, before the new one is
|
|
installed. If there is no previous definition, @code{pushdef} behaves
|
|
exactly like @code{define}.
|
|
|
|
If a macro has several definitions (of which only one is accessible),
|
|
the topmost definition can be removed with @code{popdef}. If there is
|
|
no previous definition, @code{popdef} behaves like @code{undefine}.
|
|
|
|
The expansion of both @code{pushdef} and @code{popdef} is void.
|
|
The macros @code{pushdef} and @code{popdef} are recognized only with
|
|
parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`foo', `Expansion one.')
|
|
@result{}
|
|
foo
|
|
@result{}Expansion one.
|
|
pushdef(`foo', `Expansion two.')
|
|
@result{}
|
|
foo
|
|
@result{}Expansion two.
|
|
pushdef(`foo', `Expansion three.')
|
|
@result{}
|
|
pushdef(`foo', `Expansion four.')
|
|
@result{}
|
|
popdef(`foo')
|
|
@result{}
|
|
foo
|
|
@result{}Expansion three.
|
|
popdef(`foo', `foo')
|
|
@result{}
|
|
foo
|
|
@result{}Expansion one.
|
|
popdef(`foo')
|
|
@result{}
|
|
foo
|
|
@result{}foo
|
|
@end example
|
|
|
|
If a macro with several definitions is redefined with @code{define}, the
|
|
topmost definition is @emph{replaced} with the new definition. If it is
|
|
removed with @code{undefine}, @emph{all} the definitions are removed,
|
|
and not only the topmost one. However, POSIX allows other
|
|
implementations that treat @code{define} as replacing an entire stack
|
|
of definitions with a single new definition, so to be portable to other
|
|
implementations, it may be worth explicitly using @code{popdef} and
|
|
@code{pushdef} rather than relying on the GNU behavior of
|
|
@code{define}.
|
|
|
|
@example
|
|
define(`foo', `Expansion one.')
|
|
@result{}
|
|
foo
|
|
@result{}Expansion one.
|
|
pushdef(`foo', `Expansion two.')
|
|
@result{}
|
|
foo
|
|
@result{}Expansion two.
|
|
define(`foo', `Second expansion two.')
|
|
@result{}
|
|
foo
|
|
@result{}Second expansion two.
|
|
undefine(`foo')
|
|
@result{}
|
|
foo
|
|
@result{}foo
|
|
@end example
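
A portable way to replace only the topmost definition, sketched here
with an arbitrary macro name, is to pop it and then push the
replacement:

@example
define(`foo', `Expansion one.')pushdef(`foo', `Expansion two.')
@result{}
popdef(`foo')pushdef(`foo', `Second expansion two.')
@result{}
foo
@result{}Second expansion two.
popdef(`foo')
@result{}
foo
@result{}Expansion one.
@end example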
|
|
|
|
@cindex local variables
|
|
@cindex variables, local
|
|
Local variables within macros are made with @code{pushdef} and
|
|
@code{popdef}. At the start of the macro a new definition is pushed,
|
|
within the macro it is manipulated and at the end it is popped,
|
|
revealing the former definition.
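
A minimal sketch of this idiom, in which the macro @code{greet} and its
local variable @code{name} are invented for the illustration:

@example
define(`name', `outer')
@result{}
define(`greet', `pushdef(`name', `$1')Hello, name!popdef(`name')')
@result{}
greet(`world')
@result{}Hello, world!
name
@result{}outer
@end example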
|
|
|
|
It is possible to temporarily redefine a builtin with @code{pushdef}
|
|
and @code{defn}.
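
For instance, this sketch temporarily gives @code{index} the behavior
of @code{len}; the choice of builtins is arbitrary:

@example
pushdef(`index', defn(`len'))
@result{}
index(`abc')
@result{}3
popdef(`index')
@result{}
index(`abc', `c')
@result{}2
@end example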
|
|
|
|
@node Indir
|
|
@section Indirect call of macros
|
|
|
|
@cindex indirect call of macros
|
|
@cindex call of macros, indirect
|
|
@cindex macros, indirect call of
|
|
@cindex GNU extensions
|
|
Any macro can be called indirectly with @code{indir}:
|
|
|
|
@deffn Builtin indir (@var{name}, @ovar{args@dots{}})
|
|
Results in a call to the macro @var{name}, which is passed the
|
|
rest of the arguments @var{args}. If @var{name} is not defined, an
|
|
error message is printed, and the expansion is void.
|
|
|
|
The macro @code{indir} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
This can be used to call macros with computed or ``invalid''
|
|
names (@code{define} allows such names to be defined):
|
|
|
|
@example
|
|
define(`$$internal$macro', `Internal macro (name `$0')')
|
|
@result{}
|
|
$$internal$macro
|
|
@result{}$$internal$macro
|
|
indir(`$$internal$macro')
|
|
@result{}Internal macro (name $$internal$macro)
|
|
@end example
|
|
|
|
The point here is that larger macro packages can define private macros
that will not be called by accident. They can @emph{only} be called
through the builtin @code{indir}.
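
A computed name can also be used to dispatch to one of several
handlers. In this sketch, the macro names @code{dispatch},
@code{handler_add} and @code{handler_mul} are made up for the example;
@code{dispatch} builds the handler name from its first argument and
passes the remaining arguments along with @code{shift}
(@pxref{Shift}).

@example
define(`handler_add', `eval($1 + $2)')
@result{}
define(`handler_mul', `eval($1 * $2)')
@result{}
define(`dispatch', `indir(`handler_$1', shift($@@))')
@result{}
dispatch(`add', `3', `4')
@result{}7
dispatch(`mul', `3', `4')
@result{}12
@end example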
|
|
|
|
One other point to observe is that argument collection occurs before
|
|
@code{indir} invokes @var{name}, so if argument collection changes the
|
|
value of @var{name}, that will be reflected in the final expansion.
|
|
This differs from the behavior when invoking macros directly,
|
|
where the definition that was in effect before argument collection is
|
|
used.
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
define(`f', `1')
|
|
@result{}
|
|
f(define(`f', `2'))
|
|
@result{}1
|
|
indir(`f', define(`f', `3'))
|
|
@result{}3
|
|
indir(`f', undefine(`f'))
|
|
@error{}m4:stdin:4: undefined macro `f'
|
|
@result{}
|
|
@end example
|
|
|
|
When handed the result of @code{defn} (@pxref{Defn}) as one of its
|
|
arguments, @code{indir} defers to the invoked @var{name} for whether a
|
|
token representing a builtin is recognized or flattened to the empty
|
|
string.
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
indir(defn(`defn'), `divnum')
|
|
@error{}m4:stdin:1: Warning: indir: invalid macro name ignored
|
|
@result{}
|
|
indir(`define', defn(`defn'), `divnum')
|
|
@error{}m4:stdin:2: Warning: define: invalid macro name ignored
|
|
@result{}
|
|
indir(`define', `foo', defn(`divnum'))
|
|
@result{}
|
|
foo
|
|
@result{}0
|
|
indir(`divert', defn(`foo'))
|
|
@error{}m4:stdin:5: empty string treated as 0 in builtin `divert'
|
|
@result{}
|
|
@end example
|
|
|
|
@node Builtin
|
|
@section Indirect call of builtins
|
|
|
|
@cindex indirect call of builtins
|
|
@cindex call of builtins, indirect
|
|
@cindex builtins, indirect call of
|
|
@cindex GNU extensions
|
|
Builtin macros can be called indirectly with @code{builtin}:
|
|
|
|
@deffn Builtin builtin (@var{name}, @ovar{args@dots{}})
|
|
Results in a call to the builtin @var{name}, which is passed the
|
|
rest of the arguments @var{args}. If @var{name} does not name a
|
|
builtin, an error message is printed, and the expansion is void.
|
|
|
|
The macro @code{builtin} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
This can be used even if @var{name} has been given another definition
|
|
that has covered the original, or been undefined so that no macro
|
|
maps to the builtin.
|
|
|
|
@example
|
|
pushdef(`define', `hidden')
|
|
@result{}
|
|
undefine(`undefine')
|
|
@result{}
|
|
define(`foo', `bar')
|
|
@result{}hidden
|
|
foo
|
|
@result{}foo
|
|
builtin(`define', `foo', defn(`divnum'))
|
|
@result{}
|
|
foo
|
|
@result{}0
|
|
builtin(`define', `foo', `BAR')
|
|
@result{}
|
|
foo
|
|
@result{}BAR
|
|
undefine(`foo')
|
|
@result{}undefine(foo)
|
|
foo
|
|
@result{}BAR
|
|
builtin(`undefine', `foo')
|
|
@result{}
|
|
foo
|
|
@result{}foo
|
|
@end example
|
|
|
|
The @var{name} argument only matches the original name of the builtin,
|
|
even when the @option{--prefix-builtins} option (or @option{-P},
|
|
@pxref{Operation modes, , Invoking m4}) is in effect. This is different
|
|
from @code{indir}, which only tracks current macro names.
|
|
|
|
@comment options: -P
|
|
@example
|
|
$ @kbd{m4 -P}
|
|
m4_builtin(`divnum')
|
|
@result{}0
|
|
m4_builtin(`m4_divnum')
|
|
@error{}m4:stdin:2: undefined builtin `m4_divnum'
|
|
@result{}
|
|
m4_indir(`divnum')
|
|
@error{}m4:stdin:3: undefined macro `divnum'
|
|
@result{}
|
|
m4_indir(`m4_divnum')
|
|
@result{}0
|
|
@end example
|
|
|
|
Note that @code{indir} and @code{builtin} can be used to invoke builtins
|
|
without arguments, even when they normally require parameters to be
|
|
recognized; but doing so provokes a warning, and results in a void expansion.
|
|
|
|
@example
|
|
builtin
|
|
@result{}builtin
|
|
builtin()
|
|
@error{}m4:stdin:2: undefined builtin `'
|
|
@result{}
|
|
builtin(`builtin')
|
|
@error{}m4:stdin:3: Warning: too few arguments to builtin `builtin'
|
|
@result{}
|
|
builtin(`builtin',)
|
|
@error{}m4:stdin:4: undefined builtin `'
|
|
@result{}
|
|
builtin(`builtin', ``'
|
|
')
|
|
@error{}m4:stdin:5: undefined builtin ``'
|
|
@error{}'
|
|
@result{}
|
|
indir(`index')
|
|
@error{}m4:stdin:7: Warning: too few arguments to builtin `index'
|
|
@result{}
|
|
@end example
|
|
|
|
@ignore
|
|
@comment This example is not worth putting in the manual, but it is
|
|
@comment needed for full coverage. Autoconf's m4_include relies heavily
|
|
@comment on this feature.
|
|
|
|
@example
|
|
builtin(`include', `foo')dnl
|
|
@result{}bar
|
|
@end example
|
|
|
|
@comment And this example triggers a regression present in 1.4.10b.
|
|
|
|
@example
|
|
define(`s', `builtin(`shift', $@@)')dnl
|
|
define(`loop', `ifelse(`$2', `', `-', `$1$2: $0(`$1', s(s($@@)))')')dnl
|
|
loop(`1')
|
|
@result{}-
|
|
loop(`1', `2')
|
|
@result{}12: -
|
|
loop(`1', `2', `3')
|
|
@result{}12: 13: -
|
|
loop(`1', `2', `3', `4')
|
|
@result{}12: 13: 14: -
|
|
loop(`1', `2', `3', `4', `5')
|
|
@result{}12: 13: 14: 15: -
|
|
@end example
|
|
@end ignore
|
|
|
|
@node Conditionals
|
|
@chapter Conditionals, loops, and recursion
|
|
|
|
Macros that merely expand to plain text, perhaps with arguments, are not
quite enough. We would like macros to expand to different things, based
on decisions taken at run-time; for that, we need conditionals. We would
also like a loop construct, so that we can do something a number of
times, or while some condition is true.
|
|
|
|
@menu
|
|
* Ifdef:: Testing if a macro is defined
|
|
* Ifelse:: If-else construct, or multibranch
|
|
* Shift:: Recursion in @code{m4}
|
|
* Forloop:: Iteration by counting
|
|
* Foreach:: Iteration by list contents
|
|
* Stacks:: Working with definition stacks
|
|
* Composition:: Building macros with macros
|
|
@end menu
|
|
|
|
@node Ifdef
|
|
@section Testing if a macro is defined
|
|
|
|
@cindex conditionals
|
|
There are two different builtin conditionals in @code{m4}. The first is
|
|
@code{ifdef}:
|
|
|
|
@deffn Builtin ifdef (@var{name}, @var{string-1}, @ovar{string-2})
|
|
If @var{name} is defined as a macro, @code{ifdef} expands to
|
|
@var{string-1}, otherwise to @var{string-2}. If @var{string-2} is
|
|
omitted, it is taken to be the empty string (according to the normal
|
|
rules).
|
|
|
|
The macro @code{ifdef} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
ifdef(`foo', ``foo' is defined', ``foo' is not defined')
|
|
@result{}foo is not defined
|
|
define(`foo', `')
|
|
@result{}
|
|
ifdef(`foo', ``foo' is defined', ``foo' is not defined')
|
|
@result{}foo is defined
|
|
ifdef(`no_such_macro', `yes', `no', `extra argument')
|
|
@error{}m4:stdin:4: Warning: excess arguments to builtin `ifdef' ignored
|
|
@result{}no
|
|
@end example
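
A common use of @code{ifdef} is to supply a default only when no
definition has been provided yet. The following sketch assumes a
helper named @code{set_default}, which is not part of @code{m4}:

@example
define(`set_default', `ifdef(`$1', `', `define(`$1', `$2')')')
@result{}
define(`color', `red')
@result{}
set_default(`color', `blue')
@result{}
set_default(`size', `large')
@result{}
color size
@result{}red large
@end example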
|
|
|
|
@node Ifelse
|
|
@section If-else construct, or multibranch
|
|
|
|
@cindex comparing strings
|
|
@cindex discarding input
|
|
@cindex input, discarding
|
|
The other conditional, @code{ifelse}, is much more powerful. It can be
|
|
used as a way to introduce a long comment, as an if-else construct, or
|
|
as a multibranch, depending on the number of arguments supplied:
|
|
|
|
@deffn Builtin ifelse (@var{comment})
|
|
@deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
|
|
@ovar{not-equal})
|
|
@deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
|
|
@var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal})
|
|
Used with only one argument, @code{ifelse} simply discards it and
|
|
produces no output.
|
|
|
|
If called with three or four arguments, @code{ifelse} expands into
|
|
@var{equal}, if @var{string-1} and @var{string-2} are equal (character
|
|
for character), otherwise it expands to @var{not-equal}. A final fifth
|
|
argument is ignored, after triggering a warning.
|
|
|
|
If called with six or more arguments, and @var{string-1} and
|
|
@var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
|
|
otherwise the first three arguments are discarded and the processing
|
|
starts again.
|
|
|
|
The macro @code{ifelse} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
Using only one argument is a common @code{m4} idiom for introducing a
|
|
block comment, as an alternative to repeatedly using @code{dnl}. This
|
|
special usage is recognized by GNU @code{m4}, so that in this
|
|
case, the warning about missing arguments is never triggered.
|
|
|
|
@example
|
|
ifelse(`some comments')
|
|
@result{}
|
|
ifelse(`foo', `bar')
|
|
@error{}m4:stdin:2: Warning: too few arguments to builtin `ifelse'
|
|
@result{}
|
|
@end example
|
|
|
|
Using three or four arguments provides decision points.
|
|
|
|
@example
|
|
ifelse(`foo', `bar', `true')
|
|
@result{}
|
|
ifelse(`foo', `foo', `true')
|
|
@result{}true
|
|
define(`foo', `bar')
|
|
@result{}
|
|
ifelse(foo, `bar', `true', `false')
|
|
@result{}true
|
|
ifelse(foo, `foo', `true', `false')
|
|
@result{}false
|
|
@end example
|
|
|
|
@cindex macro, blind
|
|
@cindex blind macro
|
|
Notice how the first argument was used unquoted; it is common to compare
|
|
the expansion of a macro with a string. With this macro, you can now
|
|
reproduce the behavior of blind builtins, where the macro is recognized
|
|
only with arguments.
|
|
|
|
@example
|
|
define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
|
|
@result{}
|
|
foo
|
|
@result{}foo
|
|
foo()
|
|
@result{}arguments:1
|
|
foo(`a', `b', `c')
|
|
@result{}arguments:3
|
|
@end example
|
|
|
|
For an example of a way to make defining blind macros easier, see
|
|
@ref{Composition}.
|
|
|
|
@cindex multibranches
|
|
@cindex switch statement
|
|
@cindex case statement
|
|
The macro @code{ifelse} can take more than four arguments. With six or
more arguments, it works like a @code{case} or @code{switch} statement in
traditional programming languages. If @var{string-1} and @var{string-2}
are equal, @code{ifelse} expands into @var{equal-1}; otherwise, the
procedure is repeated with the first three arguments discarded. This
calls for an example:
|
|
|
|
@example
|
|
ifelse(`foo', `bar', `third', `gnu', `gnats')
|
|
@error{}m4:stdin:1: Warning: excess arguments to builtin `ifelse' ignored
|
|
@result{}gnu
|
|
ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
|
|
@result{}
|
|
ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
|
|
@result{}seventh
|
|
ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
|
|
@error{}m4:stdin:4: Warning: excess arguments to builtin `ifelse' ignored
|
|
@result{}7
|
|
@end example
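
In practice, the multibranch form is often wrapped in a macro that
compares one argument against several constants. As a sketch, with
@code{multiplier} being a name invented for this example:

@example
define(`multiplier',
`ifelse(`$1', `k', `1000',
        `$1', `M', `1000000',
        `1')')
@result{}
multiplier(`k')
@result{}1000
multiplier(`M')
@result{}1000000
multiplier(`G')
@result{}1
@end example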
|
|
|
|
@ignore
|
|
@comment Stress tests, not worth documenting.
|
|
|
|
@comment Ensure that references compared to strings work regardless of
|
|
@comment similar prefixes.
|
|
@example
|
|
define(`e', `$@@')define(`long', `01234567890123456789')
|
|
@result{}
|
|
ifelse(long, `01234567890123456789', `yes', `no')
|
|
@result{}yes
|
|
ifelse(`01234567890123456789', long, `yes', `no')
|
|
@result{}yes
|
|
ifelse(long, `01234567890123456789-', `yes', `no')
|
|
@result{}no
|
|
ifelse(`01234567890123456789-', long, `yes', `no')
|
|
@result{}no
|
|
ifelse(e(long), `01234567890123456789', `yes', `no')
|
|
@result{}yes
|
|
ifelse(`01234567890123456789', e(long), `yes', `no')
|
|
@result{}yes
|
|
ifelse(e(long), `01234567890123456789-', `yes', `no')
|
|
@result{}no
|
|
ifelse(`01234567890123456789-', e(long), `yes', `no')
|
|
@result{}no
|
|
ifelse(-e(long), `-01234567890123456789', `yes', `no')
|
|
@result{}yes
|
|
ifelse(-`01234567890123456789', -e(long), `yes', `no')
|
|
@result{}yes
|
|
ifelse(-e(long), `-01234567890123456789-', `yes', `no')
|
|
@result{}no
|
|
ifelse(`-01234567890123456789-', -e(long), `yes', `no')
|
|
@result{}no
|
|
ifelse(-e(long)-, `-01234567890123456789-', `yes', `no')
|
|
@result{}yes
|
|
ifelse(-`01234567890123456789-', -e(long)-, `yes', `no')
|
|
@result{}yes
|
|
ifelse(-e(long)-, `-01234567890123456789', `yes', `no')
|
|
@result{}no
|
|
ifelse(`-01234567890123456789', -e(long)-, `yes', `no')
|
|
@result{}no
|
|
ifelse(`-'e(long), `-01234567890123456789', `yes', `no')
|
|
@result{}yes
|
|
ifelse(-`01234567890123456789', `-'e(long), `yes', `no')
|
|
@result{}yes
|
|
ifelse(`-'e(long), `-01234567890123456789-', `yes', `no')
|
|
@result{}no
|
|
ifelse(`-01234567890123456789-', `-'e(long), `yes', `no')
|
|
@result{}no
|
|
ifelse(`-'e(long)`-', `-01234567890123456789-', `yes', `no')
|
|
@result{}yes
|
|
ifelse(-`01234567890123456789-', `-'e(long)`-', `yes', `no')
|
|
@result{}yes
|
|
ifelse(`-'e(long)`-', `-01234567890123456789', `yes', `no')
|
|
@result{}no
|
|
ifelse(`-01234567890123456789', `-'e(long)`-', `yes', `no')
|
|
@result{}no
|
|
@end example
|
|
@end ignore
|
|
|
|
Naturally, the normal case will be slightly more advanced than these
|
|
examples. A common use of @code{ifelse} is in macros implementing loops
|
|
of various kinds.
|
|
|
|
@node Shift
|
|
@section Recursion in @code{m4}
|
|
|
|
@cindex recursive macros
|
|
@cindex macros, recursive
|
|
There is no direct support for loops in @code{m4}, but macros can be
|
|
recursive. There is no limit on the number of recursion levels, other
|
|
than those enforced by your hardware and operating system.
|
|
|
|
@cindex loops
|
|
Loops can be programmed using recursion and the conditionals described
|
|
previously.
|
|
|
|
There is a builtin macro, @code{shift}, which can, among other things,
|
|
be used for iterating through the actual arguments to a macro:
|
|
|
|
@deffn Builtin shift (@var{arg1}, @dots{})
|
|
Takes any number of arguments, and expands to all its arguments except
|
|
@var{arg1}, separated by commas, with each argument quoted.
|
|
|
|
The macro @code{shift} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
shift
|
|
@result{}shift
|
|
shift(`bar')
|
|
@result{}
|
|
shift(`foo', `bar', `baz')
|
|
@result{}bar,baz
|
|
@end example
|
|
|
|
An example of the use of @code{shift} is this macro:
|
|
|
|
@cindex reversing arguments
|
|
@cindex arguments, reversing
|
|
@deffn Composite reverse (@dots{})
|
|
Takes any number of arguments, and reverses their order.
|
|
@end deffn
|
|
|
|
It is implemented as:
|
|
|
|
@example
|
|
define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
|
|
`reverse(shift($@@)), `$1'')')
|
|
@result{}
|
|
reverse
|
|
@result{}
|
|
reverse(`foo')
|
|
@result{}foo
|
|
reverse(`foo', `bar', `gnats', `and gnus')
|
|
@result{}and gnus, gnats, bar, foo
|
|
@end example
|
|
|
|
While not a very interesting macro, it does show how simple loops can be
|
|
made with @code{shift}, @code{ifelse} and recursion. It also shows
|
|
that @code{shift} is usually used with @samp{$@@}. Another example of
|
|
this is an implementation of a short-circuiting conditional operator.
|
|
|
|
@cindex short-circuiting conditional
|
|
@cindex conditional, short-circuiting
|
|
@deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @
|
|
@ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal})
|
|
Similar to @code{ifelse}, where an equal comparison between the first
|
|
two strings results in the third, otherwise the first three arguments
|
|
are discarded and the process repeats. The difference is that each
|
|
@var{test-<n>} is expanded only when it is encountered. This means that
|
|
every third argument to @code{cond} is normally given one more level of
|
|
quoting than the corresponding argument to @code{ifelse}.
|
|
@end deffn
|
|
|
|
Here is the implementation of @code{cond}, along with a demonstration of
|
|
how it can short-circuit the side effects in @code{side}. Notice how
|
|
all the unquoted side effects happen regardless of how many comparisons
|
|
are made with @code{ifelse}, compared with only the relevant effects
|
|
with @code{cond}.
|
|
|
|
@example
|
|
define(`cond',
|
|
`ifelse(`$#', `1', `$1',
|
|
`ifelse($1, `$2', `$3',
|
|
`$0(shift(shift(shift($@@))))')')')dnl
|
|
define(`side', `define(`counter', incr(counter))$1')dnl
|
|
define(`example1',
|
|
`define(`counter', `0')dnl
|
|
ifelse(side(`$1'), `yes', `one comparison: ',
|
|
side(`$1'), `no', `two comparisons: ',
|
|
side(`$1'), `maybe', `three comparisons: ',
|
|
`side(`default answer: ')')counter')dnl
|
|
define(`example2',
|
|
`define(`counter', `0')dnl
|
|
cond(`side(`$1')', `yes', `one comparison: ',
|
|
`side(`$1')', `no', `two comparisons: ',
|
|
`side(`$1')', `maybe', `three comparisons: ',
|
|
`side(`default answer: ')')counter')dnl
|
|
example1(`yes')
|
|
@result{}one comparison: 3
|
|
example1(`no')
|
|
@result{}two comparisons: 3
|
|
example1(`maybe')
|
|
@result{}three comparisons: 3
|
|
example1(`feeling rather indecisive today')
|
|
@result{}default answer: 4
|
|
example2(`yes')
|
|
@result{}one comparison: 1
|
|
example2(`no')
|
|
@result{}two comparisons: 2
|
|
example2(`maybe')
|
|
@result{}three comparisons: 3
|
|
example2(`feeling rather indecisive today')
|
|
@result{}default answer: 4
|
|
@end example
|
|
|
|
@cindex joining arguments
|
|
@cindex arguments, joining
|
|
@cindex concatenating arguments
|
|
Another common task that requires iteration is joining a list of
|
|
arguments into a single string.
|
|
|
|
@deffn Composite join (@ovar{separator}, @ovar{args@dots{}})
|
|
@deffnx Composite joinall (@ovar{separator}, @ovar{args@dots{}})
|
|
Generate a single-quoted string, consisting of each @var{arg} separated
|
|
by @var{separator}. While @code{joinall} always outputs a
|
|
@var{separator} between arguments, @code{join} avoids the
|
|
@var{separator} for an empty @var{arg}.
|
|
@end deffn
|
|
|
|
Here are some examples of their usage, based on the implementation
|
|
@file{m4-@value{VERSION}/@/examples/@/join.m4} distributed in this
|
|
package:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`join.m4')
|
|
@result{}
|
|
join,join(`-'),join(`-', `'),join(`-', `', `')
|
|
@result{},,,
|
|
joinall,joinall(`-'),joinall(`-', `'),joinall(`-', `', `')
|
|
@result{},,,-
|
|
join(`-', `1')
|
|
@result{}1
|
|
join(`-', `1', `2', `3')
|
|
@result{}1-2-3
|
|
join(`', `1', `2', `3')
|
|
@result{}123
|
|
join(`-', `', `1', `', `', `2', `')
|
|
@result{}1-2
|
|
joinall(`-', `', `1', `', `', `2', `')
|
|
@result{}-1---2-
|
|
join(`,', `1', `2', `3')
|
|
@result{}1,2,3
|
|
define(`nargs', `$#')dnl
|
|
nargs(join(`,', `1', `2', `3'))
|
|
@result{}1
|
|
@end example
|
|
|
|
Examining the implementation shows some interesting points about several
|
|
m4 programming idioms.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
undivert(`join.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# join(sep, args) - join each non-empty ARG into a single
|
|
@result{}# string, with each element separated by SEP
|
|
@result{}define(`join',
|
|
@result{}`ifelse(`$#', `2', ``$2'',
|
|
@result{} `ifelse(`$2', `', `', ``$2'_')$0(`$1', shift(shift($@@)))')')
|
|
@result{}define(`_join',
|
|
@result{}`ifelse(`$#$2', `2', `',
|
|
@result{} `ifelse(`$2', `', `', ``$1$2'')$0(`$1', shift(shift($@@)))')')
|
|
@result{}# joinall(sep, args) - join each ARG, including empty ones,
|
|
@result{}# into a single string, with each element separated by SEP
|
|
@result{}define(`joinall', ``$2'_$0(`$1', shift($@@))')
|
|
@result{}define(`_joinall',
|
|
@result{}`ifelse(`$#', `2', `', ``$1$3'$0(`$1', shift(shift($@@)))')')
|
|
@result{}divert`'dnl
|
|
@end example
|
|
|
|
First, notice that this implementation creates helper macros
|
|
@code{_join} and @code{_joinall}. This division of labor makes it
|
|
easier to output the correct number of @var{separator} instances:
|
|
@code{join} and @code{joinall} are responsible for the first argument,
|
|
without a separator, while @code{_join} and @code{_joinall} are
|
|
responsible for all remaining arguments, always outputting a separator
|
|
when outputting an argument.
|
|
|
|
Next, observe how @code{join} decides to iterate to itself, because the
|
|
first @var{arg} was empty, or to output the argument and swap over to
|
|
@code{_join}. If the argument is non-empty, then the nested
|
|
@code{ifelse} results in an unquoted @samp{_}, which is concatenated
|
|
with the @samp{$0} to form the next macro name to invoke. The
|
|
@code{joinall} implementation is simpler since it does not have to
|
|
suppress empty @var{arg}; it always executes once then defers to
|
|
@code{_joinall}.
|
|
|
|
Another important idiom is the idea that @var{separator} is reused for
|
|
each iteration. Each iteration has one fewer argument, but rather than
|
|
discarding @samp{$1} by iterating with @code{$0(shift($@@))}, the macro
|
|
discards @samp{$2} by using @code{$0(`$1', shift(shift($@@)))}.
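
The same idiom applies to any recursive macro that must keep a fixed
leading argument. Here is a sketch, using a macro name
@code{tag_each} invented for this example, that prefixes every
remaining argument with its first argument:

@example
define(`tag_each',
`ifelse(`$#', `1', `',
        `$#', `2', `$1:$2',
        `$1:$2 $0(`$1', shift(shift($@@)))')')
@result{}
tag_each(`x', `a', `b', `c')
@result{}x:a x:b x:c
tag_each(`x')
@result{}
@end example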
|
|
|
|
Next, notice that it is possible to compare more than one condition in a
|
|
single @code{ifelse} test. The test of @samp{$#$2} against @samp{2}
|
|
allows @code{_join} to iterate for two separate reasons---either there
|
|
are still more than two arguments, or there are exactly two arguments
|
|
but the last argument is not empty.
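
The effect of such a combined test can be seen in isolation. In this
sketch, @code{two_with_empty_last} is a name made up for the example;
the comparison succeeds only when there are exactly two arguments and
the second is empty:

@example
define(`two_with_empty_last', `ifelse(`$#$2', `2', `yes', `no')')
@result{}
two_with_empty_last(`a', `')
@result{}yes
two_with_empty_last(`a', `b')
@result{}no
two_with_empty_last(`a', `', `c')
@result{}no
@end example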
|
|
|
|
Finally, notice that these macros require exactly two arguments to
|
|
terminate recursion, but that they still correctly result in empty
|
|
output when given no @var{args} (i.e., zero or one macro argument). On
|
|
the first pass when there are too few arguments, the @code{shift}
|
|
results in no output, but leaves an empty string to serve as the
|
|
required second argument for the second pass. Put another way,
|
|
@samp{`$1', shift($@@)} is not the same as @samp{$@@}, since only the
|
|
former guarantees at least two arguments.
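
The difference is easy to observe by counting arguments. In the
following sketch, @code{nargs}, @code{as_is} and @code{padded} are
names chosen for the example:

@example
define(`nargs', `$#')dnl
define(`as_is', `nargs($@@)')dnl
define(`padded', `nargs(`$1', shift($@@))')dnl
as_is(`a'), padded(`a')
@result{}1, 2
as_is(`a', `b', `c'), padded(`a', `b', `c')
@result{}3, 3
@end example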
|
|
|
|
@cindex quote manipulation
|
|
@cindex manipulating quotes
|
|
Sometimes, a recursive algorithm requires adding quotes to each element,
|
|
or treating multiple arguments as a single element:
|
|
|
|
@deffn Composite quote (@dots{})
|
|
@deffnx Composite dquote (@dots{})
|
|
@deffnx Composite dquote_elt (@dots{})
|
|
Takes any number of arguments, and adds quoting. With @code{quote},
|
|
only one level of quoting is added, effectively removing whitespace
|
|
after commas and turning multiple arguments into a single string. With
|
|
@code{dquote}, two levels of quoting are added, one around each element,
|
|
and one around the list. And with @code{dquote_elt}, two levels of
|
|
quoting are added around each element.
|
|
@end deffn
|
|
|
|
An actual implementation of these three macros is distributed as
|
|
@file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package. First,
|
|
let's examine their usage:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`quote.m4')
|
|
@result{}
|
|
-quote-dquote-dquote_elt-
|
|
@result{}----
|
|
-quote()-dquote()-dquote_elt()-
|
|
@result{}--`'-`'-
|
|
-quote(`1')-dquote(`1')-dquote_elt(`1')-
|
|
@result{}-1-`1'-`1'-
|
|
-quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
|
|
@result{}-1,2-`1',`2'-`1',`2'-
|
|
define(`n', `$#')dnl
|
|
-n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
|
|
@result{}-1-1-2-
|
|
dquote(dquote_elt(`1', `2'))
|
|
@result{}``1'',``2''
|
|
dquote_elt(dquote(`1', `2'))
|
|
@result{}``1',`2''
|
|
@end example
|
|
|
|
The last two lines show that when given two arguments, @code{dquote}
|
|
results in one string, while @code{dquote_elt} results in two. Now,
|
|
examine the implementation. Note that @code{quote} and
|
|
@code{dquote_elt} make decisions based on their number of arguments, so
|
|
that when called without arguments, they result in nothing instead of a
|
|
quoted empty string; this is so that it is possible to distinguish
|
|
between no arguments and an empty first argument. @code{dquote}, on the
|
|
other hand, results in a string no matter what, since it is still
|
|
possible to tell whether it was invoked without arguments based on the
|
|
resulting string.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
undivert(`quote.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# quote(args) - convert args to single-quoted string
|
|
@result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
|
|
@result{}# dquote(args) - convert args to quoted list of quoted strings
|
|
@result{}define(`dquote', ``$@@'')
|
|
@result{}# dquote_elt(args) - convert args to list of double-quoted strings
|
|
@result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
|
|
@result{} ```$1'',$0(shift($@@))')')
|
|
@result{}divert`'dnl
|
|
@end example
|
|
|
|
It is worth pointing out that @samp{quote(@var{args})} is more efficient
|
|
than @samp{joinall(`,', @var{args})} for producing the same output.
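
For instance, assuming both @file{quote.m4} and @file{join.m4} from the
@file{examples} directory are available, the two invocations below
produce the same string; @code{quote} gets there in a single expansion,
whereas @code{joinall} recurses once per argument:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`quote.m4')
@result{}
include(`join.m4')
@result{}
quote(`a', `b', `c')
@result{}a,b,c
joinall(`,', `a', `b', `c')
@result{}a,b,c
@end example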
|
|
|
|
@cindex nine arguments, more than
|
|
@cindex more than nine arguments
|
|
@cindex arguments, more than nine
|
|
One more useful macro based on @code{shift} allows portably selecting
|
|
an arbitrary argument (usually greater than the ninth argument), without
|
|
relying on the GNU extension of multi-digit arguments
|
|
(@pxref{Arguments}).
|
|
|
|
@deffn Composite argn (@var{n}, @dots{})
|
|
Expands to argument @var{n} out of the remaining arguments. @var{n}
|
|
must be a positive number. Usually invoked as
|
|
@samp{argn(`@var{n}',$@@)}.
|
|
@end deffn
|
|
|
|
It is implemented as:
|
|
|
|
@example
|
|
define(`argn', `ifelse(`$1', 1, ``$2'',
|
|
`argn(decr(`$1'), shift(shift($@@)))')')
|
|
@result{}
|
|
argn(`1', `a')
|
|
@result{}a
|
|
define(`foo', `argn(`11', $@@)')
|
|
@result{}
|
|
foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
|
|
@result{}k
|
|
@end example
|
|
|
|
@node Forloop
|
|
@section Iteration by counting
|
|
|
|
@cindex for loops
|
|
@cindex loops, counting
|
|
@cindex counting loops
|
|
Here is an example of a loop macro that implements a simple for loop.
|
|
|
|
@deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
|
|
Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each integer value from @var{start} to @var{end},
inclusive. For each assignment to @var{iterator}, appends @var{text} to
the expansion of the @code{forloop}. @var{text} may refer to
@var{iterator}. Any definition of @var{iterator} prior to this
invocation is restored.
|
|
@end deffn
|
|
|
|
It can, for example, be used for simple counting:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`forloop.m4')
|
|
@result{}
|
|
forloop(`i', `1', `8', `i ')
|
|
@result{}1 2 3 4 5 6 7 8@w{ }
|
|
@end example
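
The @var{text} argument is rescanned on every iteration, so it may
contain macro calls that use the iterator. This sketch also shows
that an earlier definition of the iterator is left intact:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
define(`i', `outer')
@result{}
forloop(`i', `1', `5', `eval(i * i) ')
@result{}1 4 9 16 25@w{ }
i
@result{}outer
@end example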
|
|
|
|
For-loops can be nested, as in:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`forloop.m4')
|
|
@result{}
|
|
forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
|
|
')
|
|
@result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
|
|
@result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
|
|
@result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
|
|
@result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
|
|
@result{}
|
|
@end example
|
|
|
|
The implementation of the @code{forloop} macro is fairly
|
|
straightforward. The @code{forloop} macro itself is simply a wrapper,
|
|
which saves the previous definition of the first argument, calls the
|
|
internal macro @code{@w{_forloop}}, and re-establishes the saved
|
|
definition of the first argument.
|
|
|
|
The macro @code{@w{_forloop}} expands the fourth argument once, and
|
|
tests to see if the iterator has reached the final value. If it has
|
|
not finished, it increments the iterator (using the predefined macro
|
|
@code{incr}, @pxref{Incr}), and recurses.
|
|
|
|
Here is an actual implementation of @code{forloop}, distributed as
|
|
@file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
undivert(`forloop.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# forloop(var, from, to, stmt) - simple version
|
|
@result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
|
|
@result{}define(`_forloop',
|
|
@result{} `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
|
|
@result{}divert`'dnl
|
|
@end example
|
|
|
|
Notice the careful use of quotes. Certain macro arguments are left
|
|
unquoted, each for its own reason. Try to find out @emph{why} these
|
|
arguments are left unquoted, and see what happens if they are quoted.
|
|
(As presented, these two macros are useful but not very robust for
|
|
general use. They lack even basic error handling for cases like
|
|
@var{start} greater than @var{end}, @var{end} not numeric, or
|
|
@var{iterator} not being a macro name. See if you can improve these
|
|
macros; or @pxref{Improved forloop, , Answers}).
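
As a sketch of one such improvement, a wrapper can refuse to loop when
@var{start} is greater than @var{end}. The name @code{safe_forloop}
is hypothetical, and the check assumes that both bounds are valid
arguments to @code{eval}:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
define(`safe_forloop',
`ifelse(eval(`($2) <= ($3)'), `1', `forloop($@@)')')
@result{}
safe_forloop(`i', `1', `3', `i ')
@result{}1 2 3@w{ }
safe_forloop(`i', `3', `1', `i ')
@result{}
@end example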
|
|
|
|
@node Foreach
|
|
@section Iteration by list contents
|
|
|
|
@cindex for each loops
|
|
@cindex loops, list iteration
|
|
@cindex iterating over lists
|
|
Here is an example of a loop macro that implements list iteration.
|
|
|
|
@deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
|
|
@deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
|
|
Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each value from @var{paren-list} or
@var{quote-list}. In @code{foreach}, @var{paren-list} is a
comma-separated list of elements contained in parentheses. In
@code{foreachq}, @var{quote-list} is a comma-separated list of elements
contained in a quoted string. For each assignment to @var{iterator},
appends @var{text} to the overall expansion. @var{text} may refer to
@var{iterator}. Any definition of @var{iterator} prior to this
invocation is restored.
|
|
@end deffn
|
|
|
|
As an example, this displays each word in a list inside of a sentence,
|
|
using an implementation of @code{foreach} distributed as
|
|
@file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
|
|
in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`foreach.m4')
|
|
@result{}
|
|
foreach(`x', (foo, bar, foobar), `Word was: x
|
|
')dnl
|
|
@result{}Word was: foo
|
|
@result{}Word was: bar
|
|
@result{}Word was: foobar
|
|
include(`foreachq.m4')
|
|
@result{}
|
|
foreachq(`x', `foo, bar, foobar', `Word was: x
|
|
')dnl
|
|
@result{}Word was: foo
|
|
@result{}Word was: bar
|
|
@result{}Word was: foobar
|
|
@end example
|
|
|
|
It is possible to be more complex; each element of the @var{paren-list}
|
|
or @var{quote-list} can itself be a list, to pass as further arguments
|
|
to a helper macro. This example generates a shell case statement:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`foreach.m4')
|
|
@result{}
|
|
define(`_case', ` $1)
|
|
$2=" $1";;
|
|
')dnl
|
|
define(`_cat', `$1$2')dnl
|
|
case $`'1 in
|
|
@result{}case $1 in
|
|
foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
|
|
`_cat(`_case', x)')dnl
|
|
@result{} a)
|
|
@result{} vara=" a";;
|
|
@result{} b)
|
|
@result{} varb=" b";;
|
|
@result{} c)
|
|
@result{} varc=" c";;
|
|
esac
|
|
@result{}esac
|
|
@end example
|
|
|
|
The implementation of the @code{foreach} macro is a bit more involved;
|
|
it is a wrapper around two helper macros. First, @code{@w{_arg1}} is
|
|
needed to grab the first element of a list. Second,
|
|
@code{@w{_foreach}} implements the recursion, successively walking
|
|
through the original list. Here is a simple implementation of
|
|
@code{foreach}:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
undivert(`foreach.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
|
|
@result{}# parenthesized list, simple version
|
|
@result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
|
|
@result{}define(`_arg1', `$1')
|
|
@result{}define(`_foreach', `ifelse(`$2', `()', `',
|
|
@result{} `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
|
|
@result{}divert`'dnl
|
|
@end example
|
|
|
|
Unfortunately, that implementation is not robust to macro names as list
|
|
elements. Each iteration of @code{@w{_foreach}} strips another
layer of quotes, leading to erratic results if list elements are not
|
|
already fully expanded. The first cut at implementing @code{foreachq}
|
|
takes this into account. Also, when using quoted elements in a
|
|
@var{paren-list}, the overall list must be quoted. A @var{quote-list}
|
|
has the nice property of requiring fewer characters to create a list
|
|
containing the same quoted elements. To see the difference between the
|
|
two macros, we attempt to pass double-quoted macro names in a list,
|
|
expecting the macro name on output after one layer of quotes is removed
|
|
during list iteration and the final layer removed during the final
|
|
rescan:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
define(`a', `1')define(`b', `2')define(`c', `3')
|
|
@result{}
|
|
include(`foreach.m4')
|
|
@result{}
|
|
include(`foreachq.m4')
|
|
@result{}
|
|
foreach(`x', `(``a'', ``(b'', ``c)'')', `x
|
|
')
|
|
@result{}1
|
|
@result{}(2)1
|
|
@result{}
|
|
@result{}, x
|
|
@result{})
|
|
foreachq(`x', ```a'', ``(b'', ``c)''', `x
|
|
')dnl
|
|
@result{}a
|
|
@result{}(b
|
|
@result{}c)
|
|
@end example
|
|
|
|
Obviously, @code{foreachq} did a better job; here is its implementation:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
undivert(`foreachq.m4')dnl
|
|
@result{}include(`quote.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
|
|
@result{}# quoted list, simple version
|
|
@result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
|
|
@result{}define(`_arg1', `$1')
|
|
@result{}define(`_foreachq', `ifelse(quote($2), `', `',
|
|
@result{} `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
|
|
@result{}divert`'dnl
|
|
@end example
|
|
|
|
Notice that @code{@w{_foreachq}} had to use the helper macro
|
|
@code{quote} defined earlier (@pxref{Shift}), to ensure that the
|
|
embedded @code{ifelse} call does not go haywire if a list element
|
|
contains a comma. Unfortunately, this implementation of @code{foreachq}
|
|
has its own severe flaw. Whereas the @code{foreach} implementation was
|
|
linear, this macro is quadratic in the number of list elements, and is
|
|
much more likely to trip up the limit set by the command line option
|
|
@option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
|
|
Invoking m4}). Additionally, this implementation does not expand
|
|
@samp{defn(`@var{iterator}')} very well, when compared with
|
|
@code{foreach}.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`foreach.m4')include(`foreachq.m4')
|
|
@result{}
|
|
foreach(`name', `(`a', `b')', ` defn(`name')')
|
|
@result{} a b
|
|
foreachq(`name', ``a', `b'', ` defn(`name')')
|
|
@result{} _arg1(`a', `b') _arg1(shift(`a', `b'))
|
|
@end example
|
|
|
|
It is possible to have robust iteration with linear behavior and sane
|
|
@var{iterator} contents for either list style. See if you can learn
|
|
from the best elements of both of these implementations to create robust
|
|
macros (or @pxref{Improved foreach, , Answers}).
|
|
|
|
@node Stacks
|
|
@section Working with definition stacks
|
|
|
|
@cindex definition stack
|
|
@cindex pushdef stack
|
|
@cindex stack, macro definition
|
|
Thanks to @code{pushdef}, manipulation of a stack is an intrinsic
|
|
operation in @code{m4}. Normally, only the topmost definition in a
|
|
stack is important, but sometimes, it is desirable to manipulate the
|
|
entire definition stack.
|
|
|
|
@deffn Composite stack_foreach (@var{macro}, @var{action})
|
|
@deffnx Composite stack_foreach_lifo (@var{macro}, @var{action})
|
|
For each of the @code{pushdef} definitions associated with @var{macro},
|
|
invoke the macro @var{action} with a single argument of that definition.
|
|
@code{stack_foreach} visits the oldest definition first, while
|
|
@code{stack_foreach_lifo} visits the current definition first.
|
|
@var{action} should not modify or dereference @var{macro}. There are a
|
|
few special macros, such as @code{defn}, which cannot be used as the
|
|
@var{macro} parameter.
|
|
@end deffn
|
|
|
|
A sample implementation of these macros is distributed in the file
|
|
@file{m4-@value{VERSION}/@/examples/@/stack.m4}.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`stack.m4')
|
|
@result{}
|
|
pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
|
|
@result{}
|
|
define(`show', ``$1'
|
|
')
|
|
@result{}
|
|
stack_foreach(`a', `show')dnl
|
|
@result{}1
|
|
@result{}2
|
|
@result{}3
|
|
stack_foreach_lifo(`a', `show')dnl
|
|
@result{}3
|
|
@result{}2
|
|
@result{}1
|
|
@end example
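
Because @var{action} is invoked once per definition, it can also
accumulate state through side effects, provided it leaves @var{macro}
alone. The following sketch is not part of the distributed examples;
the helper names @code{count} and @code{bump} are hypothetical and
chosen only for illustration. It counts the definitions on the stack of
@code{a}, then shows that the current definition is intact afterwards:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`stack.m4')
@result{}
pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
@result{}
define(`count', `0')define(`bump', `define(`count', incr(count))')
@result{}
stack_foreach(`a', `bump')count
@result{}3
a
@result{}3
@end example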
|
|
|
|
Now for the implementation. Note the definition of a helper macro,
|
|
@code{_stack_reverse}, which destructively swaps the contents of one
|
|
stack of definitions into the reverse order in the temporary macro
|
|
@samp{tmp-$1}. By calling the helper twice, the original order is
|
|
restored back into the macro @samp{$1}; since the operation is
|
|
destructive, this explains why @samp{$1} must not be modified or
|
|
dereferenced during the traversal. The caller can then inject
|
|
additional code to pass the definition currently being visited to
|
|
@samp{$2}. The choice of helper names is intentional; since @samp{-} is
|
|
not valid as part of a macro name, there is no risk of conflict with a
|
|
valid macro name, and the code is guaranteed to use @code{defn} where
|
|
necessary. Finally, note that any macro used in the traversal of a
|
|
@code{pushdef} stack, such as @code{pushdef} or @code{defn}, cannot be
|
|
handled by @code{stack_foreach}, since the macro would temporarily be
|
|
undefined during the algorithm.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
undivert(`stack.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# stack_foreach(macro, action)
|
|
@result{}# Invoke ACTION with a single argument of each definition
|
|
@result{}# from the definition stack of MACRO, starting with the oldest.
|
|
@result{}define(`stack_foreach',
|
|
@result{}`_stack_reverse(`$1', `tmp-$1')'dnl
|
|
@result{}`_stack_reverse(`tmp-$1', `$1', `$2(defn(`$1'))')')
|
|
@result{}# stack_foreach_lifo(macro, action)
|
|
@result{}# Invoke ACTION with a single argument of each definition
|
|
@result{}# from the definition stack of MACRO, starting with the newest.
|
|
@result{}define(`stack_foreach_lifo',
|
|
@result{}`_stack_reverse(`$1', `tmp-$1', `$2(defn(`$1'))')'dnl
|
|
@result{}`_stack_reverse(`tmp-$1', `$1')')
|
|
@result{}define(`_stack_reverse',
|
|
@result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0($@@)')')
|
|
@result{}divert`'dnl
|
|
@end example
|
|
|
|
@node Composition
|
|
@section Building macros with macros
|
|
|
|
@cindex macro composition
|
|
@cindex composing macros
|
|
Since m4 is a macro language, it is possible to write macros that
|
|
can build other macros. First on the list is a way to automate the
|
|
creation of blind macros.
|
|
|
|
@cindex macro, blind
|
|
@cindex blind macro
|
|
@deffn Composite define_blind (@var{name}, @ovar{value})
|
|
Defines @var{name} as a blind macro, such that @var{name} will expand to
|
|
@var{value} only when given explicit arguments. @var{value} should not
|
|
be the result of @code{defn} (@pxref{Defn}). This macro is only
|
|
recognized with parameters, and results in an empty string.
|
|
@end deffn
|
|
|
|
Defining a macro to define another macro can be a bit tricky. We want
|
|
to use a literal @samp{$#} in the argument to the nested @code{define}.
|
|
However, if @samp{$} and @samp{#} are adjacent in the definition of
|
|
@code{define_blind}, then it would be expanded as the number of
|
|
arguments to @code{define_blind} rather than the intended number of
|
|
arguments to @var{name}. The solution is to pass the difficult
|
|
characters through extra arguments to a helper macro
|
|
@code{_define_blind}. When composing macros, it is a common idiom to
|
|
need a helper macro to concatenate text that forms parameters in the
|
|
composed macro, rather than interpreting the text as a parameter of the
|
|
composing macro.
|
|
|
|
As for the limitation against using @code{defn}, there are two reasons.
|
|
If a macro was previously defined with @code{define_blind}, then it can
|
|
safely be renamed to a new blind macro using plain @code{define}; using
|
|
@code{define_blind} to rename it just adds another layer of
|
|
@code{ifelse}, occupying memory and slowing down execution. And if a
|
|
macro is a builtin, then it would result in an attempt to define a macro
|
|
consisting of both text and a builtin token; this is not supported, and
|
|
the builtin token is flattened to an empty string.
|
|
|
|
With that explanation, here's the definition, and some sample usage.
|
|
Notice that @code{define_blind} is itself a blind macro.
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
define(`define_blind', `ifelse(`$#', `0', ``$0'',
|
|
`_$0(`$1', `$2', `$'`#', `$'`0')')')
|
|
@result{}
|
|
define(`_define_blind', `define(`$1',
|
|
`ifelse(`$3', `0', ``$4'', `$2')')')
|
|
@result{}
|
|
define_blind
|
|
@result{}define_blind
|
|
define_blind(`foo', `arguments were $*')
|
|
@result{}
|
|
foo
|
|
@result{}foo
|
|
foo(`bar')
|
|
@result{}arguments were bar
|
|
define(`blah', defn(`foo'))
|
|
@result{}
|
|
blah
|
|
@result{}blah
|
|
blah(`a', `b')
|
|
@result{}arguments were a,b
|
|
defn(`blah')
|
|
@result{}ifelse(`$#', `0', ``$0'', `arguments were $*')
|
|
@end example
|
|
|
|
@cindex currying arguments
|
|
@cindex argument currying
|
|
Another interesting composition tactic is argument @dfn{currying}, or
|
|
wrapping a macro that normally takes multiple arguments in a way that
|
|
the initial arguments can be supplied early, and the remaining arguments
|
|
are applied later in a context that normally expects a macro name that
|
|
accepts fewer arguments (often just one).
|
|
|
|
@deffn Composite curry (@var{macro}, @dots{})
|
|
Accept a list of early arguments, then expand to an unspecified macro
|
|
name that takes one or more late arguments, appends those extra
|
|
arguments to the earlier, and finally invokes @var{macro} with the
|
|
resulting list of arguments.
|
|
@end deffn
|
|
|
|
A demonstration of currying makes the intent of this macro a little more
|
|
obvious. The macro @code{stack_foreach} mentioned earlier is an example
|
|
of a context that provides exactly one argument to a macro name. But
|
|
coupled with currying, we can invoke @code{reverse} with two or more
|
|
arguments for each definition of a macro stack (that is, we factor out
|
|
the early argument @code{`4'} and couple it with a different late
|
|
argument for each member of the stack @code{a}). This example uses the
|
|
file @file{m4-@value{VERSION}/@/examples/@/curry.m4} included in the
|
|
distribution.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`curry.m4')include(`stack.m4')
|
|
@result{}
|
|
define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
|
|
`reverse(shift($@@)), `$1'')')
|
|
@result{}
|
|
pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
|
|
@result{}
|
|
stack_foreach(`a', `:curry(`reverse', `4')')
|
|
@result{}:1, 4:2, 4:3, 4
|
|
stack_foreach(`a', `:curry(`reverse', `4', `5')')
|
|
@result{}:1, 5, 4:2, 5, 4:3, 5, 4
|
|
curry(`curry', `reverse', `1')(`2')(`3')
|
|
@result{}3, 2, 1
|
|
curry(`reverse', `1')(`2', `3')
|
|
@result{}3, 2, 1
|
|
@end example
|
|
|
|
Now for the implementation. Notice how @code{curry} leaves off with a
|
|
macro name but no open parenthesis, while still in the middle of
|
|
collecting arguments for @samp{$1}. The macro @code{_curry} is the
|
|
helper macro that takes one or more late arguments, then adds to the list and
|
|
finally supplies the closing parenthesis. The use of a comma inside the
|
|
@code{shift} call allows currying to also work for a macro that takes
|
|
one argument, although it often makes more sense to invoke that macro
|
|
directly rather than going through @code{curry}.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
undivert(`curry.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# curry(macro, args...)
|
|
@result{}# Perform partial argument application on the given macro. This
|
|
@result{}# expands to an unspecified macro name that accepts one or more extra
|
|
@result{}# arguments, and appends those to the args supplied to the original
|
|
@result{}# curry call as the overall set of arguments to pass to macro. That
|
|
@result{}# is, curry(`macro', args1...)(args2...) is the same as invoking
|
|
@result{}# macro(args1..., args2...).
|
|
@result{}#
|
|
@result{}# Most often, argument currying comes in handy when given a context
|
|
@result{}# that normally takes a macro name to call with one argument, but
|
|
@result{}# where you want to combine that variable argument with other fixed
|
|
@result{}# arguments to forward to a macro that takes multiple arguments. For
|
|
@result{}# example, given a "foreach" macro that calls its first argument once
|
|
@result{}# for each successive argument, "foreach(`curry(`mult', 3)', 1, 2, 3)"
|
|
@result{}# would behave the same as "mult(3, 1), mult(3, 2), mult(3, 3)".
|
|
@result{}#
|
|
@result{}# It is also possible to create a named function curry. For example:
|
|
@result{}# define(`mult3', `curry(`mult', 3)($1)')
|
|
@result{}# Later use of mult3(value) will compute the same as mult(3, value).
|
|
@result{}define(`curry', `$1(shift($@@,)_curry')
|
|
@result{}define(`_curry', `$@@)')
|
|
@result{}divert`'dnl
|
|
@end example
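
The comments in @file{curry.m4} mention building a named macro from a
curried call. A minimal sketch of that idea follows; the macro
@code{mult} is not provided by the distribution and is defined here
purely for illustration, as is the name @code{mult3}.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`curry.m4')
@result{}
define(`mult', `eval(`($1) * ($2)')')
@result{}
define(`mult3', `curry(`mult', `3')($1)')
@result{}
mult3(`4')
@result{}12
curry(`mult', `3')(`5')
@result{}15
@end example

In both calls, the early argument @samp{3} and the late argument end up
in a single invocation of @code{mult}.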
|
|
|
|
Unfortunately, with M4 1.4.x, @code{curry} is unable to handle builtin
|
|
tokens, which are silently flattened to the empty string when passed
|
|
through another text macro. This limitation will be lifted in a future
|
|
release of M4.
|
|
|
|
@cindex renaming macros
|
|
@cindex copying macros
|
|
@cindex macros, copying
|
|
Putting the last few concepts together, it is possible to copy or rename
|
|
an entire stack of macro definitions.
|
|
|
|
@deffn Composite copy (@var{source}, @var{dest})
|
|
@deffnx Composite rename (@var{source}, @var{dest})
|
|
Ensure that @var{dest} is undefined, then define it to the same stack of
|
|
definitions currently in @var{source}. @code{copy} leaves @var{source}
|
|
unchanged, while @code{rename} undefines @var{source}. There are only a
|
|
few macros, such as @code{copy} or @code{defn}, which cannot be copied
|
|
via this macro.
|
|
@end deffn
|
|
|
|
The implementation is relatively straightforward, although since it uses
@code{curry}, it is unable to copy builtin macros, such as the second
definition of @code{a} as a synonym for @code{divnum}. See if you can
design a version that works around this limitation (or @pxref{Improved
copy, , Answers}).
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`curry.m4')include(`stack.m4')
|
|
@result{}
|
|
define(`rename', `copy($@@)undefine(`$1')')dnl
|
|
define(`copy', `ifdef(`$2', `errprint(`$2 already defined
|
|
')m4exit(`1')',
|
|
`stack_foreach(`$1', `curry(`pushdef', `$2')')')')dnl
|
|
pushdef(`a', `1')pushdef(`a', defn(`divnum'))pushdef(`a', `2')
|
|
@result{}
|
|
copy(`a', `b')
|
|
@result{}
|
|
rename(`b', `c')
|
|
@result{}
|
|
a b c
|
|
@result{}2 b 2
|
|
popdef(`a', `c')c a
|
|
@result{} 0
|
|
popdef(`a', `c')a c
|
|
@result{}1 1
|
|
@end example
|
|
|
|
@node Debugging
|
|
@chapter How to debug macros and input
|
|
|
|
@cindex debugging macros
|
|
@cindex macros, debugging
|
|
Macros written for @code{m4} often do not work as intended on the first
try (as is the case with most programming languages). Fortunately,
@code{m4} has support for macro debugging.
|
|
|
|
@menu
|
|
* Dumpdef:: Displaying macro definitions
|
|
* Trace:: Tracing macro calls
|
|
* Debug Levels:: Controlling debugging output
|
|
* Debug Output:: Saving debugging output
|
|
@end menu
|
|
|
|
@node Dumpdef
|
|
@section Displaying macro definitions
|
|
|
|
@cindex displaying macro definitions
|
|
@cindex macros, displaying definitions
|
|
@cindex definitions, displaying macro
|
|
@cindex standard error, output to
|
|
If you want to see what a name expands into, you can use the builtin
|
|
@code{dumpdef}:
|
|
|
|
@deffn Builtin dumpdef (@ovar{names@dots{}})
|
|
Accepts any number of arguments. If called without any arguments,
|
|
it displays the definitions of all known names, otherwise it displays
|
|
the definitions of the @var{names} given. The output is printed to the
|
|
current debug file (usually standard error), and is sorted by name. If
|
|
an unknown name is encountered, a warning is printed.
|
|
|
|
The expansion of @code{dumpdef} is void.
|
|
@end deffn
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
define(`foo', `Hello world.')
|
|
@result{}
|
|
dumpdef(`foo')
|
|
@error{}foo:@tabchar{}`Hello world.'
|
|
@result{}
|
|
dumpdef(`define')
|
|
@error{}define:@tabchar{}<define>
|
|
@result{}
|
|
@end example
|
|
|
|
The last example shows how builtin macro definitions are displayed.
|
|
The definition that is dumped corresponds to what would occur if the
|
|
macro were to be called at that point, even if other definitions are
|
|
still live due to redefining a macro during argument collection.
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
|
|
@result{}
|
|
f(popdef(`f')dumpdef(`f'))
|
|
@error{}f:@tabchar{}``$0'1'
|
|
@result{}f2
|
|
f(popdef(`f')dumpdef(`f'))
|
|
@error{}m4:stdin:3: undefined macro `f'
|
|
@result{}f1
|
|
@end example
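
When several names are given, the definitions are sorted by name rather
than listed in argument order. The following sketch, which uses the
throwaway macro names @code{a} and @code{b}, illustrates this:

@example
$ @kbd{m4 -d}
define(`b', `2')define(`a', `1')
@result{}
dumpdef(`b', `a')
@error{}a:@tabchar{}`1'
@error{}b:@tabchar{}`2'
@result{}
@end example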
|
|
|
|
@xref{Debug Levels}, for information on controlling the details of the
|
|
display.
|
|
|
|
@node Trace
|
|
@section Tracing macro calls
|
|
|
|
@cindex tracing macro expansion
|
|
@cindex macro expansion, tracing
|
|
@cindex expansion, tracing macro
|
|
@cindex standard error, output to
|
|
It is possible to trace macro calls and expansions through the builtins
|
|
@code{traceon} and @code{traceoff}:
|
|
|
|
@deffn Builtin traceon (@ovar{names@dots{}})
|
|
@deffnx Builtin traceoff (@ovar{names@dots{}})
|
|
When called without any arguments, @code{traceon} and @code{traceoff}
|
|
will turn tracing on and off, respectively, for all currently defined
|
|
macros.
|
|
|
|
When called with arguments, only the macros listed in @var{names} are
|
|
affected, whether or not they are currently defined.
|
|
|
|
The expansion of @code{traceon} and @code{traceoff} is void.
|
|
@end deffn
|
|
|
|
Whenever a traced macro is called and the arguments have been collected,
|
|
the call is displayed. If the expansion of the macro call is not void,
the expansion is displayed after the call when the @samp{e} debug flag
is in effect (@pxref{Debug Levels}). The output is printed to the
current debug file (defaulting to standard error, @pxref{Debug Output}).
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
define(`foo', `Hello World.')
|
|
@result{}
|
|
define(`echo', `$@@')
|
|
@result{}
|
|
traceon(`foo', `echo')
|
|
@result{}
|
|
foo
|
|
@error{}m4trace: -1- foo -> `Hello World.'
|
|
@result{}Hello World.
|
|
echo(`gnus', `and gnats')
|
|
@error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
|
|
@result{}gnus,and gnats
|
|
@end example
|
|
|
|
The number between dashes is the depth of the expansion. It is one most
|
|
of the time, signifying an expansion at the outermost level, but it
|
|
increases when macro arguments contain unquoted macro calls. The
|
|
maximum number that will appear between dashes is controlled by the
|
|
option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
|
|
, Invoking m4}). Additionally, the option @option{--trace} (or
|
|
@option{-t}) can be used to invoke @code{traceon(@var{name})} before
|
|
parsing input.
|
|
|
|
@comment The explicit -dp neutralizes the testsuite default of -d.
|
|
@comment options: -dp -L3 -tifelse
|
|
@comment status: 1
|
|
@example
|
|
$ @kbd{m4 -L 3 -t ifelse}
|
|
ifelse(`one level')
|
|
@error{}m4trace: -1- ifelse
|
|
@result{}
|
|
ifelse(ifelse(ifelse(`three levels')))
|
|
@error{}m4trace: -3- ifelse
|
|
@error{}m4trace: -2- ifelse
|
|
@error{}m4trace: -1- ifelse
|
|
@result{}
|
|
ifelse(ifelse(ifelse(ifelse(`four levels'))))
|
|
@error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
|
|
@end example
|
|
|
|
Tracing by name is an attribute that is preserved whether the macro is
|
|
defined or not. This allows the selection of macros to trace before
|
|
those macros are defined.
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
traceoff(`foo')
|
|
@result{}
|
|
traceon(`foo')
|
|
@result{}
|
|
foo
|
|
@result{}foo
|
|
defn(`foo')
|
|
@result{}
|
|
define(`foo', `bar')
|
|
@result{}
|
|
foo
|
|
@error{}m4trace: -1- foo -> `bar'
|
|
@result{}bar
|
|
undefine(`foo')
|
|
@result{}
|
|
ifdef(`foo', `yes', `no')
|
|
@result{}no
|
|
indir(`foo')
|
|
@error{}m4:stdin:9: undefined macro `foo'
|
|
@result{}
|
|
define(`foo', `blah')
|
|
@result{}
|
|
foo
|
|
@error{}m4trace: -1- foo -> `blah'
|
|
@result{}blah
|
|
traceoff
|
|
@result{}
|
|
foo
|
|
@result{}blah
|
|
@end example
|
|
|
|
Tracing even works on builtins. However, @code{defn} (@pxref{Defn})
|
|
does not transfer tracing status.
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
traceon(`traceon')
|
|
@result{}
|
|
traceon(`traceoff')
|
|
@error{}m4trace: -1- traceon(`traceoff')
|
|
@result{}
|
|
traceoff(`traceoff')
|
|
@error{}m4trace: -1- traceoff(`traceoff')
|
|
@result{}
|
|
traceoff(`traceon')
|
|
@result{}
|
|
traceon(`eval', `m4_divnum')
|
|
@result{}
|
|
define(`m4_eval', defn(`eval'))
|
|
@result{}
|
|
define(`m4_divnum', defn(`divnum'))
|
|
@result{}
|
|
eval(divnum)
|
|
@error{}m4trace: -1- eval(`0') -> `0'
|
|
@result{}0
|
|
m4_eval(m4_divnum)
|
|
@error{}m4trace: -2- m4_divnum -> `0'
|
|
@result{}0
|
|
@end example
|
|
|
|
@xref{Debug Levels}, for information on controlling the details of the
|
|
display. The format of the trace output is not specified by
|
|
POSIX, and varies between implementations of @code{m4}.
|
|
|
|
@ignore
|
|
@comment not worth including in the manual, but this tests a trace code
|
|
@comment path that was temporarily broken
|
|
@comment options: -de --trace ifelse
|
|
@example
|
|
$ @kbd{m4 -de --trace ifelse}
|
|
define(`e', `ifelse(`$1', `$2', `ifelse(`$1', `$2', `e(shift($@@))')')')
|
|
@result{}
|
|
e(`1', `1')
|
|
@error{}m4trace: -1- ifelse -> ifelse(`1', `1', `e(shift(`1',`1'))')
|
|
@error{}m4trace: -1- ifelse -> e(shift(`1',`1'))
|
|
@error{}m4trace: -1- ifelse
|
|
@result{}
|
|
@end example
|
|
@end ignore
|
|
|
|
@node Debug Levels
|
|
@section Controlling debugging output
|
|
|
|
@cindex controlling debugging output
|
|
@cindex debugging output, controlling
|
|
The @option{-d} option to @code{m4} (or @option{--debug},
|
|
@pxref{Debugging options, , Invoking m4}) controls the amount of detail
presented in three categories of output. Trace output is requested by
@code{traceon}
|
|
(@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in
|
|
relation to a macro invocation. Debug output tracks useful events not
|
|
associated with a macro invocation, and each line is prefixed by
|
|
@samp{m4debug:}. Finally, @code{dumpdef} (@pxref{Dumpdef}) output is
|
|
affected, with no prefix added to the output lines.
|
|
|
|
The @var{flags} following the option can be one or more of the
|
|
following:
|
|
|
|
@table @code
|
|
@item a
|
|
In trace output, show the actual arguments that were collected before
|
|
invoking the macro. This applies to all macro calls if the @samp{t}
|
|
flag is used, otherwise only the macros covered by calls of
|
|
@code{traceon}. Arguments are subject to length truncation specified by
|
|
the command line option @option{--arglength} (or @option{-l}).
|
|
|
|
@item c
|
|
In trace output, show several trace lines for each macro call. A line
|
|
is shown when the macro is seen, but before the arguments are collected;
|
|
a second line when the arguments have been collected and a third line
|
|
after the call has completed.
|
|
|
|
@item e
|
|
In trace output, show the expansion of each macro call, if it is not
|
|
void. This applies to all macro calls if the @samp{t} flag is used,
|
|
otherwise only the macros covered by calls of @code{traceon}. The
|
|
expansion is subject to length truncation specified by the command line
|
|
option @option{--arglength} (or @option{-l}).
|
|
|
|
@item f
|
|
In debug and trace output, include the name of the current input file in
|
|
the output line.
|
|
|
|
@item i
|
|
In debug output, print a message each time the current input file is
|
|
changed.
|
|
|
|
@item l
|
|
In debug and trace output, include the current input line number in the
|
|
output line.
|
|
|
|
@item p
|
|
In debug output, print a message when a named file is found through the
|
|
path search mechanism (@pxref{Search Path}), giving the actual file name
|
|
used.
|
|
|
|
@item q
|
|
In trace and dumpdef output, quote actual arguments and macro expansions
|
|
in the display with the current quotes. This is useful in connection
|
|
with the @samp{a} and @samp{e} flags above.
|
|
|
|
@item t
|
|
In trace output, trace all macro calls made in this invocation of
|
|
@code{m4}, regardless of the settings of @code{traceon}.
|
|
|
|
@item x
|
|
In trace output, add a unique ``macro call id'' to each line of the trace
|
|
output. This is useful in connection with the @samp{c} flag above.
|
|
|
|
@item V
|
|
A shorthand for all of the above flags.
|
|
@end table
|
|
|
|
If no flags are specified with the @option{-d} option, the default is
|
|
@samp{aeq}. The examples throughout this manual assume the default
|
|
flags.
|
|
|
|
@cindex GNU extensions
|
|
There is a builtin macro @code{debugmode}, which allows on-the-fly control of
|
|
the debugging output format:
|
|
|
|
@deffn Builtin debugmode (@ovar{flags})
|
|
The argument @var{flags} should be a subset of the letters listed above.
|
|
As special cases, if the argument starts with a @samp{+}, the flags are
|
|
added to the current debug flags, and if it starts with a @samp{-}, they
|
|
are removed. If no argument is present, all debugging flags are cleared
|
|
(as if no @option{-d} was given), and with an empty argument the flags
|
|
are reset to the default of @samp{aeq}.
|
|
|
|
The expansion of @code{debugmode} is void.
|
|
@end deffn
|
|
|
|
@comment The explicit -dp neutralizes the testsuite default of -d.
|
|
@comment options: -dp
|
|
@example
|
|
$ @kbd{m4}
|
|
define(`foo', `FOO')
|
|
@result{}
|
|
traceon(`foo')
|
|
@result{}
|
|
debugmode()
|
|
@result{}
|
|
foo
|
|
@error{}m4trace: -1- foo -> `FOO'
|
|
@result{}FOO
|
|
debugmode
|
|
@result{}
|
|
foo
|
|
@error{}m4trace: -1- foo
|
|
@result{}FOO
|
|
debugmode(`+l')
|
|
@result{}
|
|
foo
|
|
@error{}m4trace:8: -1- foo
|
|
@result{}FOO
|
|
@end example
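
A leading @samp{-} removes individual flags while leaving the others in
place. In this sketch, which starts from the default flags @samp{aeq}
and reuses the illustrative macro @code{foo}, removing @samp{e}
suppresses the expansion while the quoted argument display remains:

@example
$ @kbd{m4 -d}
define(`foo', `FOO')
@result{}
traceon(`foo')
@result{}
debugmode(`-e')
@result{}
foo(`bar')
@error{}m4trace: -1- foo(`bar')
@result{}FOO
@end example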
|
|
|
|
The following example demonstrates the behavior of length truncation,
|
|
when specified on the command line. Note that each argument and the
|
|
final result are individually truncated. Also, the special tokens for
|
|
builtin functions are not truncated.
|
|
|
|
@comment options: -l6
|
|
@example
|
|
$ @kbd{m4 -d -l 6}
|
|
define(`echo', `$@@')debugmode(`+t')
|
|
@result{}
|
|
echo(`1', `long string')
|
|
@error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
|
|
@result{}1,long string
|
|
indir(`echo', defn(`changequote'))
|
|
@error{}m4trace: -2- defn(`change...')
|
|
@error{}m4trace: -1- indir(`echo', <changequote>) -> ``''
|
|
@result{}
|
|
@end example
|
|
|
|
This example shows the effects of the debug flags that are not related
|
|
to macro tracing.
|
|
|
|
@comment examples
|
|
@comment options: -dip
|
|
@example
|
|
$ @kbd{m4 -dip -I examples}
|
|
@error{}m4debug: input read from stdin
|
|
include(`foo')dnl
|
|
@error{}m4debug: path search for `foo' found `examples/foo'
|
|
@error{}m4debug: input read from examples/foo
|
|
@result{}bar
|
|
@error{}m4debug: input reverted to stdin, line 1
|
|
^D
|
|
@error{}m4debug: input exhausted
|
|
@end example
|
|
|
|
@node Debug Output
|
|
@section Saving debugging output
|
|
|
|
@cindex saving debugging output
|
|
@cindex debugging output, saving
|
|
@cindex output, saving debugging
|
|
@cindex GNU extensions
|
|
Debug and tracing output can be redirected to files using either the
|
|
@option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
|
|
Invoking m4}), or with the builtin macro @code{debugfile}:
|
|
|
|
@deffn Builtin debugfile (@ovar{file})
|
|
Sends all further debug and trace output to @var{file}, opened in append
|
|
mode. If @var{file} is the empty string, debug and trace output are
|
|
discarded. If @code{debugfile} is called without any arguments, debug
|
|
and trace output are sent to standard error. This does not affect
|
|
warnings, error messages, or @code{errprint} output, which are
|
|
always sent to standard error. If @var{file} cannot be opened, the
|
|
current debug file is unchanged, and an error is issued.
|
|
|
|
The expansion of @code{debugfile} is void.
|
|
@end deffn
|
|
|
|
@example
|
|
$ @kbd{m4 -d}
|
|
traceon(`divnum')
|
|
@result{}
|
|
divnum(`extra')
|
|
@error{}m4:stdin:2: Warning: excess arguments to builtin `divnum' ignored
|
|
@error{}m4trace: -1- divnum(`extra') -> `0'
|
|
@result{}0
|
|
debugfile()
|
|
@result{}
|
|
divnum(`extra')
|
|
@error{}m4:stdin:4: Warning: excess arguments to builtin `divnum' ignored
|
|
@result{}0
|
|
debugfile
|
|
@result{}
|
|
divnum
|
|
@error{}m4trace: -1- divnum -> `0'
|
|
@result{}0
|
|
@end example
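
Trace output can also be collected in a file for later inspection. In
the following sketch, the file name @file{trace.log} is only an
assumption made for illustration; any writable file name would do. The
first traced call prints nothing on standard error, because its trace
line is appended to the file, while the call made after
@code{debugfile} is invoked without arguments is reported on standard
error again:

@example
$ @kbd{m4 -d}
traceon(`divnum')
@result{}
debugfile(`trace.log')
@result{}
divnum
@result{}0
debugfile
@result{}
divnum
@error{}m4trace: -1- divnum -> `0'
@result{}0
@end example

Afterwards, @file{trace.log} has gained one line, matching the trace
line shown above for the second call.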
|
|
|
|
@node Input Control
|
|
@chapter Input control
|
|
|
|
This chapter describes various builtin macros for controlling the input
|
|
to @code{m4}.
|
|
|
|
@menu
|
|
* Dnl:: Deleting whitespace in input
|
|
* Changequote:: Changing the quote characters
|
|
* Changecom:: Changing the comment delimiters
|
|
* Changeword:: Changing the lexical structure of words
|
|
* M4wrap:: Saving text until end of input
|
|
@end menu
|
|
|
|
@node Dnl
|
|
@section Deleting whitespace in input
|
|
|
|
@cindex deleting whitespace in input
|
|
@cindex discarding input
|
|
@cindex input, discarding
|
|
The builtin @code{dnl} stands for ``Discard to Next Line'':
|
|
|
|
@deffn Builtin dnl
|
|
All characters, up to and including the next newline, are discarded
|
|
without performing any macro expansion. A warning is issued if the end
|
|
of the file is encountered without a newline.
|
|
|
|
The expansion of @code{dnl} is void.
|
|
@end deffn
|
|
|
|
It is often used in connection with @code{define}, to remove the
|
|
newline that follows the call to @code{define}. Thus
|
|
|
|
@example
|
|
define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
|
|
foo
|
|
@result{}Macro foo.
|
|
@end example
|
|
|
|
The input up to and including the next newline is discarded, as opposed
|
|
to the way comments are treated (@pxref{Comments}).
|
|
|
|
Usually, @code{dnl} is immediately followed by an end of line or some
|
|
other whitespace. GNU @code{m4} will produce a warning diagnostic if
|
|
@code{dnl} is followed by an open parenthesis. In this case, @code{dnl}
|
|
will collect and process all arguments, looking for a matching close
|
|
parenthesis. All predictable side effects resulting from this
|
|
collection will take place. @code{dnl} will return no output. The
|
|
input following the matching close parenthesis up to and including the
next newline, on whichever line contains it, will still be discarded.
|
|
|
|
@example
|
|
dnl(`args are ignored, but side effects occur',
|
|
define(`foo', `like this')) while this text is ignored: undefine(`foo')
|
|
@error{}m4:stdin:1: Warning: excess arguments to builtin `dnl' ignored
|
|
See how `foo' was defined, foo?
|
|
@result{}See how foo was defined, like this?
|
|
@end example
|
|
|
|
If the end of file is encountered without a newline character, a
warning is issued and @code{dnl} stops consuming input.
|
|
|
|
@example
|
|
m4wrap(`m4wrap(`2 hi
|
|
')0 hi dnl 1 hi')
|
|
@result{}
|
|
define(`hi', `HI')
|
|
@result{}
|
|
^D
|
|
@error{}m4:stdin:1: Warning: end of file treated as newline
|
|
@result{}0 HI 2 HI
|
|
@end example
|
|
|
|
@node Changequote
|
|
@section Changing the quote characters
|
|
|
|
@cindex changing quote delimiters
|
|
@cindex quote delimiters, changing
|
|
@cindex delimiters, changing
|
|
The default quote delimiters can be changed with the builtin
|
|
@code{changequote}:
|
|
|
|
@deffn Builtin changequote (@dvar{start, `}, @dvar{end, '})
|
|
This sets @var{start} as the new begin-quote delimiter and @var{end} as
|
|
the new end-quote delimiter. If both arguments are missing, the default
|
|
quotes (@code{`} and @code{'}) are used. If @var{start} is void, then
|
|
quoting is disabled. Otherwise, if @var{end} is missing or void, the
|
|
default end-quote delimiter (@code{'}) is used. The quote delimiters
|
|
can be of any length.
|
|
|
|
The expansion of @code{changequote} is void.
|
|
@end deffn
|
|
|
|
@example
|
|
changequote(`[', `]')
|
|
@result{}
|
|
define([foo], [Macro [foo].])
|
|
@result{}
|
|
foo
|
|
@result{}Macro foo.
|
|
@end example
|
|
|
|
The quotation strings can safely contain non-@sc{ascii} characters.
|
|
|
|
@example
|
|
define(`a', `b')
|
|
@result{}
|
|
«a»
|
|
@result{}«b»
|
|
changequote(`«', `»')
|
|
@result{}
|
|
«a»
|
|
@result{}a
|
|
@end example
|
|
|
|
If no single character is appropriate, @var{start} and @var{end} can be
|
|
of any length. Other implementations cap the delimiter length at five
|
|
characters, but GNU has no inherent limit.
|
|
|
|
@example
|
|
changequote(`[[[', `]]]')
|
|
@result{}
|
|
define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
|
|
@result{}
|
|
foo
|
|
@result{}Macro [[foo]].
|
|
@end example
|
|
|
|
Calling @code{changequote} with @var{start} as the empty string will
|
|
effectively disable the quoting mechanism, leaving no way to quote text.
|
|
However, using an empty string is not portable, as some other
|
|
implementations of @code{m4} revert to the default quoting, while others
|
|
preserve the prior non-empty delimiter. If @var{start} is not empty,
|
|
then an empty @var{end} will use the default end-quote delimiter of
|
|
@samp{'}, as otherwise, it would be impossible to end a quoted string.
|
|
Again, this is not portable, as some other @code{m4} implementations
|
|
reuse @var{start} as the end-quote delimiter, while others preserve the
|
|
previous non-empty value. Omitting both arguments restores the default
|
|
begin-quote and end-quote delimiters; fortunately this behavior is
|
|
portable to all implementations of @code{m4}.
|
|
|
|
@example
|
|
define(`foo', `Macro `FOO'.')
|
|
@result{}
|
|
changequote(`', `')
|
|
@result{}
|
|
foo
|
|
@result{}Macro `FOO'.
|
|
`foo'
|
|
@result{}`Macro `FOO'.'
|
|
changequote(`,)
|
|
@result{}
|
|
foo
|
|
@result{}Macro FOO.
|
|
@end example
|
|
|
|
There is no way in @code{m4} to quote a string containing an unmatched
|
|
begin-quote, except using @code{changequote} to change the current
|
|
quotes.
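
For instance, one way to emit a lone @samp{`} is to switch temporarily
to other delimiters; the choice of @samp{[} and @samp{]} here is only an
assumption that suits this particular text:

@example
changequote(`[', `]')dnl
[An unmatched ` is fine here.]
@result{}An unmatched ` is fine here.
changequote([`], ['])dnl
@end example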
|
|
|
|
If the quotes should be changed from, say, @samp{[} to @samp{[[},
|
|
temporary quote characters have to be defined. To achieve this, two
|
|
calls of @code{changequote} must be made, one for the temporary quotes
|
|
and one for the new quotes.
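
A sketch of that two-step change, using @samp{@{} and @samp{@}} as the
temporary pair (an arbitrary choice; any characters that do not clash
with the old or the new delimiters should do equally well):

@example
changequote(`[', `]')dnl
changequote([@{], [@}])dnl
changequote(@{[[@}, @{]]@})dnl
define([[foo]], [[Macro [bar].]])dnl
foo
@result{}Macro [bar].
@end example

@noindent
Once @samp{[[} and @samp{]]} are the delimiters, a single @samp{[} or
@samp{]} is ordinary text, which is why the brackets around @samp{bar}
survive in the output.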
|
|
|
|
The following is an example of how to use @code{changequote} to output
|
|
what would normally be an unmatched quote string:
|
|
|
|
@deffn Composite lquo (@var{ignored@dots{}})
|
|
@deffnx Composite rquo (@var{ignored@dots{}})
|
|
Output the normal left or right quote string.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`lquo', `ifelse(`$#', `0', ``$0'', `changequote(`[', `]')`dnl'
|
|
changequote([`], ['])')')
|
|
@result{}
|
|
define(`rquo', `ifelse(`$#', `0', ``$0'', `changequote(`[', `]')dnl`
|
|
'changequote([`], ['])')')
|
|
@result{}
|
|
Quotes in reverse: rquo:rquo() lquo:lquo()
|
|
@result{}Quotes in reverse: rquo:' lquo:`
|
|
define(`hi', `HELLO')dnl
|
|
lquo()hi`'rquo()
|
|
@result{}`HELLO'
|
|
substr(`-hi-', 1, 2) substr(`-`hi'-', `1', `4')
|
|
@result{}HELLO hi
|
|
substr(`-'lquo()`hi'rquo()`-', `1', `4')
|
|
@result{}hi
|
|
substr(`-'lquo()hi`'rquo()`-', `1', `2')rquo()'
|
|
@result{}Hrquo()
|
|
@end example
|
|
|
|
The example chose to require an ignored parameter so that @code{lquo} or
|
|
@code{rquo} are not recognized without @code{()}; but the use of
|
|
@code{ifelse} to make the macros blind is not strictly needed. On the
|
|
other hand, the use of @code{dnl} in the macro bodies was essential for
|
|
proper quote nesting during the @code{define}. Note that the output of
|
|
these macros does not directly behave as quote strings; however, any
|
|
context where expanded text is rescanned back in the normal choice of
|
|
quote strings does not care if the quotes were supplied literally or via
|
|
these macros.
|
|
|
|
Macros are recognized in preference to the begin-quote string, so if a
|
|
prefix of @var{start} can be recognized as part of a potential macro
|
|
name, the quoting mechanism is effectively disabled. Unless you use
|
|
@code{changeword} (@pxref{Changeword}), this means that @var{start}
|
|
should not begin with a letter, digit, or @samp{_} (underscore).
|
|
However, even though quoted strings are not recognized, the quote
|
|
characters can still be discerned in macro expansion and in trace
|
|
output.
|
|
|
|
@example
|
|
define(`echo', `$@@')
|
|
@result{}
|
|
define(`hi', `HI')
|
|
@result{}
|
|
changequote(`q', `Q')
|
|
@result{}
|
|
q hi Q hi
|
|
@result{}q HI Q HI
|
|
echo(hi)
|
|
@result{}qHIQ
|
|
changequote
|
|
@result{}
|
|
changequote(`-', `EOF')
|
|
@result{}
|
|
- hi EOF hi
|
|
@result{} hi HI
|
|
changequote
|
|
@result{}
|
|
changequote(`1', `2')
|
|
@result{}
|
|
hi1hi2
|
|
@result{}hi1hi2
|
|
hi 1hi2
|
|
@result{}HI hi
|
|
@end example
|
|
|
|
Quotes are recognized in preference to argument collection. In
|
|
particular, if @var{start} is a single @samp{(}, then argument
|
|
collection is effectively disabled. For portability with other
|
|
implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
|
|
@samp{)} as the first character in @var{start}.
|
|
|
|
@example
|
|
define(`echo', `$#:$@@:')
|
|
@result{}
|
|
define(`hi', `HI')
|
|
@result{}
|
|
changequote(`(',`)')
|
|
@result{}
|
|
echo(hi)
|
|
@result{}0::hi
|
|
changequote
|
|
@result{}
|
|
changequote(`((', `))')
|
|
@result{}
|
|
echo(hi)
|
|
@result{}1:HI:
|
|
echo((hi))
|
|
@result{}0::hi
|
|
changequote
|
|
@result{}
|
|
changequote(`,', `)')
|
|
@result{}
|
|
echo(hi,hi)bye)
|
|
@result{}1:HIhibye:
|
|
@end example
|
|
|
|
However, if you are not worried about portability, using @samp{(} and
|
|
@samp{)} as quoting characters has an interesting property---you can use
|
|
it to compute a quoted string containing the expansion of any quoted
|
|
text, as long as the expansion results in both balanced quotes and
|
|
balanced parentheses. The trick is realizing @code{expand} uses
|
|
@samp{$1} unquoted, to trigger its expansion using the normal quoting
|
|
characters, but uses extra parentheses to group unquoted commas that
|
|
occur in the expansion without consuming whitespace following those
|
|
commas. Then @code{_expand} uses @code{changequote} to convert the
|
|
extra parentheses back into quoting characters. Note that it takes two
|
|
more @code{changequote} invocations to restore the original quotes.
|
|
Contrast the behavior on whitespace when using @samp{$*}, via
|
|
@code{quote}, to attempt the same task.
|
|
|
|
@example
|
|
changequote(`[', `]')dnl
|
|
define([a], [1, (b)])dnl
|
|
define([b], [2])dnl
|
|
define([quote], [[$*]])dnl
|
|
define([expand], [_$0(($1))])dnl
|
|
define([_expand],
|
|
[changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl
|
|
expand([a, a, [a, a], [[a, a]]])
|
|
@result{}1, (2), 1, (2), a, a, [a, a]
|
|
quote(a, a, [a, a], [[a, a]])
|
|
@result{}1,(2),1,(2),a, a,[a, a]
|
|
@end example
|
|
|
|
If @var{end} is a prefix of @var{start}, the end-quote will be
|
|
recognized in preference to a nested begin-quote. In particular,
|
|
changing the quotes to have the same string for @var{start} and
|
|
@var{end} disables nesting of quotes. When quote nesting is disabled,
|
|
it is impossible to double-quote strings across macro expansions, so
|
|
using the same string is not done very often.
|
|
|
|
@example
|
|
define(`hi', `HI')
|
|
@result{}
|
|
changequote(`""', `"')
|
|
@result{}
|
|
""hi"""hi"
|
|
@result{}hihi
|
|
""hi" ""hi"
|
|
@result{}hi hi
|
|
""hi"" "hi"
|
|
@result{}hi" "HI"
|
|
changequote
|
|
@result{}
|
|
`hi`hi'hi'
|
|
@result{}hi`hi'hi
|
|
changequote(`"', `"')
|
|
@result{}
|
|
"hi"hi"hi"
|
|
@result{}hiHIhi
|
|
@end example
|
|
|
|
During macro expansion, instances of @code{$@@} in the macro's
|
|
definition will use the quotation strings that are in effect at the end
|
|
of argument collection, even if this is different than the quotation
|
|
strings in effect when the macro was defined. When combined with
|
|
@code{translit} (@pxref{Translit}), this can be exploited for splitting
|
|
a string containing multiple instances of a single-byte separator into a
|
|
macro call on each token of the string with linear scaling (a naive loop
|
|
that uses @code{index} to search for the first instance of the
|
|
separator, followed by @code{substr} to process the rest of the input
|
|
string, scales quadratically, since each iteration of the loop can only
|
|
process one substring before re-processing an average of half of the
|
|
overall input to get to the next substring, instead of getting at all
|
|
substrings in a single pass). However, note that this trick cannot
|
|
prevent premature expansion of tokens within the string.
|
|
|
|
@example
|
|
define(`some', `several')
|
|
@result{}
|
|
define(`text', `long.string.with.some.separators')
|
|
@result{}
|
|
define(`display', ``<$1>'')
|
|
@result{}
|
|
display(text)
|
|
@result{}<long.string.with.several.separators>
|
|
display(defn(`text'))
|
|
@result{}<long.string.with.some.separators>
|
|
dnl quotes are still `' at the time requote is defined:
|
|
define(`requote', `"[$@@]')
|
|
@result{}
|
|
define(`tokenized', requote(translit(defn(`text'), `.',
|
|
changequote(`"[', `]')"[,])))
|
|
@result{}
|
|
dnl but quotes are "[] at the time $@@ in requote is computed
|
|
changequote
|
|
@result{}
|
|
dnl quotes are back to `', yet the content of tokenized still has "[]
|
|
dnl however, this form of tokenizing already expanded "some"
|
|
tokenized
|
|
@result{}"[long],"[string],"[with],"[several],"[separators]
|
|
define(`a', ` display(`$1')')
|
|
@result{}
|
|
dnl now it is possible to call `a' on each token
|
|
translit(defn(`tokenized'), `"[],', `a()')
|
|
@result{} <long> <string> <with> <several> <separators>
|
|
@end example
|
|
|
|
@ignore
|
|
@comment And another stress test, not worth documenting in the manual.
|
|
@example
|
|
define(`aaaaaaaaaaaaaaaaaaaa', `A')define(`q', `"$@@"')
|
|
@result{}
|
|
changequote(`"', `"')
|
|
@result{}
|
|
q(q("aaaaaaaaaaaaaaaaaaaa", "a"))
|
|
@result{}A,a
|
|
@end example
|
|
@end ignore
|
|
|
|
It is an error if the end of file occurs within a quoted string.
|
|
|
|
@comment status: 1
|
|
@example
|
|
`hello world'
|
|
@result{}hello world
|
|
`dangling quote
|
|
^D
|
|
@error{}m4:stdin:2: ERROR: end of file in string
|
|
@end example
|
|
|
|
@comment status: 1
|
|
@example
|
|
ifelse(`dangling quote
|
|
^D
|
|
@error{}m4:stdin:1: ERROR: end of file in string
|
|
@end example
|
|
|
|
@node Changecom
|
|
@section Changing the comment delimiters
|
|
|
|
@cindex changing comment delimiters
|
|
@cindex comment delimiters, changing
|
|
@cindex delimiters, changing
|
|
The default comment delimiters can be changed with the builtin
|
|
macro @code{changecom}:
|
|
|
|
@deffn Builtin changecom (@ovar{start}, @dvar{end, @key{NL}})
|
|
This sets @var{start} as the new begin-comment delimiter and @var{end}
|
|
as the new end-comment delimiter. If both arguments are missing, or
|
|
@var{start} is void, then comments are disabled. Otherwise, if
|
|
@var{end} is missing or void, the default end-comment delimiter of
|
|
newline is used. The comment delimiters can be of any length.
|
|
|
|
The expansion of @code{changecom} is void.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`comment', `COMMENT')
|
|
@result{}
|
|
# A normal comment
|
|
@result{}# A normal comment
|
|
changecom(`/*', `*/')
|
|
@result{}
|
|
# Not a comment anymore
|
|
@result{}# Not a COMMENT anymore
|
|
But: /* this is a comment now */ while this is not a comment
|
|
@result{}But: /* this is a comment now */ while this is not a COMMENT
|
|
@end example
|
|
|
|
@cindex comments, copied to output
|
|
Note how comments are copied to the output, much as if they were quoted
|
|
strings. If you want the text inside a comment expanded, quote the
|
|
begin-comment delimiter.
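
A brief sketch of that technique, reusing the @code{comment} macro from
the example above:

@example
define(`comment', `COMMENT')
@result{}
# this comment is copied verbatim
@result{}# this comment is copied verbatim
`#' this comment is expanded
@result{}# this COMMENT is expanded
@end example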
|
|
|
|
Calling @code{changecom} without any arguments, or with @var{start} as
|
|
the empty string, will effectively disable the commenting mechanism. To
|
|
restore the original comment start of @samp{#}, you must explicitly ask
|
|
for it. If @var{start} is not empty, then an empty @var{end} will use
|
|
the default end-comment delimiter of newline, as otherwise, it would be
|
|
impossible to end a comment. However, this is not portable, as some
|
|
other @code{m4} implementations preserve the previous non-empty
|
|
delimiters instead.
|
|
|
|
@example
|
|
define(`comment', `COMMENT')
|
|
@result{}
|
|
changecom
|
|
@result{}
|
|
# Not a comment anymore
|
|
@result{}# Not a COMMENT anymore
|
|
changecom(`#', `')
|
|
@result{}
|
|
# comment again
|
|
@result{}# comment again
|
|
@end example
|
|
|
|
The comment strings can safely contain non-@sc{ascii} characters.
|
|
|
|
@example
|
|
define(`a', `b')
|
|
@result{}
|
|
«a»
|
|
@result{}«b»
|
|
changecom(`«', `»')
|
|
@result{}
|
|
«a»
|
|
@result{}«a»
|
|
@end example
|
|
|
|
If no single character is appropriate, @var{start} and @var{end} can be
|
|
of any length. Other implementations cap the delimiter length at five
|
|
characters, but GNU has no inherent limit.
|
|
|
|
Comments are recognized in preference to macros. However, this is not
|
|
compatible with other implementations, where macros and even quoting
|
|
take precedence over comments, so it may change in a future release.
|
|
For portability, this means that @var{start} should not begin with a
|
|
letter, digit, or @samp{_} (underscore), and that neither the
|
|
start-quote nor the start-comment string should be a prefix of the
|
|
other.
|
|
|
|
@example
|
|
define(`hi', `HI')
|
|
@result{}
|
|
define(`hi1hi2', `hello')
|
|
@result{}
|
|
changecom(`q', `Q')
|
|
@result{}
|
|
q hi Q hi
|
|
@result{}q hi Q HI
|
|
changecom(`1', `2')
|
|
@result{}
|
|
hi1hi2
|
|
@result{}hello
|
|
hi 1hi2
|
|
@result{}HI 1hi2
|
|
@end example
|
|
|
|
Comments are recognized in preference to argument collection. In
|
|
particular, if @var{start} is a single @samp{(}, then argument
|
|
collection is effectively disabled. For portability with other
|
|
implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
|
|
@samp{)} as the first character in @var{start}.
|
|
|
|
@example
|
|
define(`echo', `$#:$*:$@@:')
|
|
@result{}
|
|
define(`hi', `HI')
|
|
@result{}
|
|
changecom(`(',`)')
|
|
@result{}
|
|
echo(hi)
|
|
@result{}0:::(hi)
|
|
changecom
|
|
@result{}
|
|
changecom(`((', `))')
|
|
@result{}
|
|
echo(hi)
|
|
@result{}1:HI:HI:
|
|
echo((hi))
|
|
@result{}0:::((hi))
|
|
changecom(`,', `)')
|
|
@result{}
|
|
echo(hi,hi)bye)
|
|
@result{}1:HI,hi)bye:HI,hi)bye:
|
|
changecom
|
|
@result{}
|
|
echo(hi,`,`'hi',hi)
|
|
@result{}3:HI,,HI,HI:HI,,`'hi,HI:
|
|
echo(hi,`,`'hi',hi`'changecom(`,,', `hi'))
|
|
@result{}3:HI,,`'hi,HI:HI,,`'hi,HI:
|
|
@end example
|
|
|
|
It is an error if the end of file occurs within a comment.
|
|
|
|
@comment status: 1
|
|
@example
|
|
changecom(`/*', `*/')
|
|
@result{}
|
|
/*dangling comment
|
|
^D
|
|
@error{}m4:stdin:2: ERROR: end of file in comment
|
|
@end example
|
|
|
|
@node Changeword
|
|
@section Changing the lexical structure of words
|
|
|
|
@cindex lexical structure of words
|
|
@cindex words, lexical structure of
|
|
@cindex syntax, changing
|
|
@cindex changing syntax
|
|
@cindex regular expressions
|
|
@quotation
|
|
The macro @code{changeword} and all associated functionality are
|
|
experimental. It is only available if the @option{--enable-changeword}
|
|
option was given to @command{configure}, at GNU @code{m4} installation
time. The functionality will go away in the future, to be replaced by
|
|
other new features that are more efficient at providing the same
|
|
capabilities. @emph{Do not rely on it}. Please direct your comments
|
|
about it the same way you would do for bugs.
|
|
@end quotation
|
|
|
|
A file being processed by @code{m4} is split into quoted strings, words
|
|
(potential macro names) and simple tokens (any other single character).
|
|
Initially a word is defined by the following regular expression:
|
|
|
|
@comment ignore
|
|
@example
|
|
[_a-zA-Z][_a-zA-Z0-9]*
|
|
@end example
|
|
|
|
Using @code{changeword}, you can change this regular expression:
|
|
|
|
@deffn {Optional builtin} changeword (@var{regex})
|
|
Changes the regular expression for recognizing macro names to be
|
|
@var{regex}. If @var{regex} is empty, use
|
|
@samp{[_a-zA-Z][_a-zA-Z0-9]*}. @var{regex} must obey the constraint
|
|
that every prefix of the desired final pattern is also accepted by the
|
|
regular expression. If @var{regex} contains grouping parentheses, the
|
|
macro invoked is the portion that matched the first group, rather than
|
|
the entire matching string.
|
|
|
|
The expansion of @code{changeword} is void.
|
|
The macro @code{changeword} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
Relaxing the lexical rules of @code{m4} might be useful (for example) if
|
|
you wanted to apply translations to a file of numbers:
|
|
|
|
@example
|
|
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
|
')m4exit(`77')')dnl
|
|
changeword(`[_a-zA-Z0-9]+')
|
|
@result{}
|
|
define(`1', `0')1
|
|
@result{}0
|
|
@end example
|
|
|
|
Tightening the lexical rules is less useful, because it will generally
|
|
make some of the builtins unavailable. You could use it to prevent
|
|
accidental call of builtins, for example:
|
|
|
|
@example
|
|
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
|
')m4exit(`77')')dnl
|
|
define(`_indir', defn(`indir'))
|
|
@result{}
|
|
changeword(`_[_a-zA-Z0-9]*')
|
|
@result{}
|
|
esyscmd(`foo')
|
|
@result{}esyscmd(foo)
|
|
_indir(`esyscmd', `echo hi')
|
|
@result{}hi
|
|
@result{}
|
|
@end example
|
|
|
|
Because @code{m4} constructs its words a character at a time, there
|
|
is a restriction on the regular expressions that may be passed to
|
|
@code{changeword}. This is that if your regular expression accepts
|
|
@samp{abc}, it must also accept @samp{a} and @samp{ab}.
|
|
|
|
@example
|
|
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
|
')m4exit(`77')')dnl
|
|
define(`abc
|
|
', `bar
|
|
')
|
|
@result{}
|
|
dnl This example wants to recognize changeword, dnl, and `abc\n'.
|
|
dnl First, we check that our regexp will match.
|
|
regexp(`changeword', `[cd][a-z]*\|abc[
|
|
]')
|
|
@result{}0
|
|
regexp(`abc
|
|
', `[cd][a-z]*\|abc[
|
|
]')
|
|
@result{}0
|
|
regexp(`a', `[cd][a-z]*\|abc[
|
|
]')
|
|
@result{}-1
|
|
abc
|
|
@result{}abc
|
|
changeword(`[cd][a-z]*\|abc[
|
|
]')
|
|
@result{}
|
|
dnl Even though `abc\n' matches, we forgot to allow `ab'.
|
|
abc
|
|
@result{}abc
|
|
changeword(`[cd][a-z]*\|ab?c?[
|
|
]?')
|
|
@result{}
|
|
dnl Now we can call `abc\n'.
|
|
abc
|
|
@result{}bar
|
|
@end example
|
|
|
|
@ignore
|
|
@comment Expose a core dump introduced in 1.4.11.
|
|
@example
|
|
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
|
')m4exit(`77')')dnl
|
|
define(`foo
|
|
', `bar
|
|
')
|
|
@result{}
|
|
changeword(`[cd][a-z]*\|foo[
|
|
]')foo
|
|
@result{}foo
|
|
@end example
|
|
|
|
@comment One more test of including newline in a macro name; but this
|
|
@comment does not need to be displayed in the manual. This ensures
|
|
@comment that line numbering is correct when dnl cuts across include
|
|
@comment file boundaries, and when __file__ or __line__ is the last
|
|
@comment token in an include file.
|
|
|
|
@example
|
|
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
|
')m4exit(`77')')dnl
|
|
define(`bar
|
|
', defn(`dnl'))dnl
|
|
define(`baz', `dnl
|
|
include(`foo') ignored
|
|
dnl')dnl
|
|
changeword(`\([_a-zA-Z][_a-zA-Z0-9]*\|bar
|
|
\)')
|
|
@result{}
|
|
__file__:__line__
|
|
@result{}stdin:10
|
|
include(`foo') ignored
|
|
__file__:__line__
|
|
@result{}stdin:12
|
|
baz ignored
|
|
__file__:__line__
|
|
@result{}stdin:14
|
|
define(`bar
|
|
', defn(`__file__'))
|
|
@result{}
|
|
include(`foo')
|
|
@result{}examples/foo
|
|
define(`bar
|
|
', defn(`__line__'))
|
|
@result{}
|
|
include(`foo')
|
|
@result{}1
|
|
__file__:__line__
|
|
@result{}stdin:21
|
|
@end example
|
|
@end ignore
|
|
|
|
@code{changeword} has another function. If the regular expression
|
|
supplied contains any grouped subexpressions, then text outside
|
|
the first of these is discarded before symbol lookup. So:
|
|
|
|
@example
|
|
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
|
')m4exit(`77')')dnl
|
|
ifdef(`__unix__', ,
|
|
`errprint(` skipping: syscmd does not have unix semantics
|
|
')m4exit(`77')')dnl
|
|
changecom(`/*', `*/')dnl
|
|
define(`foo', `bar')dnl
|
|
changeword(`#\([_a-zA-Z0-9]*\)')
|
|
@result{}
|
|
#esyscmd(`echo foo \#foo')
|
|
@result{}foo bar
|
|
@result{}
|
|
@end example
|
|
|
|
@code{m4} now requires a @samp{#} mark at the beginning of every
|
|
macro invocation, so one can use @code{m4} to preprocess plain
|
|
text without losing various words like @samp{divert}.
|
|
|
|
In @code{m4}, macro substitution is based on text, while in @TeX{}, it
|
|
is based on tokens. @code{changeword} can throw this difference into
|
|
relief. For example, here is the same idea represented in @TeX{} and
|
|
@code{m4}. First, the @TeX{} version:
|
|
|
|
@comment ignore
|
|
@example
|
|
\def\a@{\message@{Hello@}@}
|
|
\catcode`\@@=0
|
|
\catcode`\\=12
|
|
@@a
|
|
@@bye
|
|
@result{}Hello
|
|
@end example
|
|
|
|
@noindent
|
|
Then, the @code{m4} version:
|
|
|
|
@example
|
|
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
|
')m4exit(`77')')dnl
|
|
define(`a', `errprint(`Hello')')dnl
|
|
changeword(`@@\([_a-zA-Z0-9]*\)')
|
|
@result{}
|
|
@@a
|
|
@result{}errprint(Hello)
|
|
@end example
|
|
|
|
In the @TeX{} example, the first line defines a macro @code{a} to
|
|
print the message @samp{Hello}. The second line defines @key{@@} to
|
|
be usable instead of @key{\} as an escape character. The third line
|
|
defines @key{\} to be a normal printing character, not an escape.
|
|
The fourth line invokes the macro @code{a}. So, when @TeX{} is run
|
|
on this file, it displays the message @samp{Hello}.
|
|
|
|
When the @code{m4} example is passed through @code{m4}, it outputs
|
|
@samp{errprint(Hello)}. The reason for this is that @TeX{} does
|
|
lexical analysis of macro definition when the macro is @emph{defined}.
|
|
@code{m4} just stores the text, postponing the lexical analysis until
|
|
the macro is @emph{used}.
|
|
|
|
You should note that using @code{changeword} will slow @code{m4} down
|
|
by a factor of about seven, once it is changed to something other
|
|
than the default regular expression. You can invoke @code{changeword}
|
|
with the empty string to restore the default word definition, and regain
|
|
the parsing speed.
|
|
|
|
@node M4wrap
|
|
@section Saving text until end of input
|
|
|
|
@cindex saving input
|
|
@cindex input, saving
|
|
@cindex deferring expansion
|
|
@cindex expansion, deferring
|
|
It is possible to `save' some text until the end of the normal input has
been seen; @code{m4} then rereads the saved text once the normal input
has been exhausted. This feature is normally used to
|
|
initiate cleanup actions before normal exit, e.g., deleting temporary
|
|
files.
|
|
|
|
To save input text, use the builtin @code{m4wrap}:
|
|
|
|
@deffn Builtin m4wrap (@var{string}, @dots{})
|
|
Stores @var{string} in a safe place, to be reread when end of input is
|
|
reached. As a GNU extension, additional arguments are
|
|
concatenated with a space to the @var{string}.
|
|
|
|
The expansion of @code{m4wrap} is void.
|
|
The macro @code{m4wrap} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`cleanup', `This is the `cleanup' action.
|
|
')
|
|
@result{}
|
|
m4wrap(`cleanup')
|
|
@result{}
|
|
This is the first and last normal input line.
|
|
@result{}This is the first and last normal input line.
|
|
^D
|
|
@result{}This is the cleanup action.
|
|
@end example
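
The GNU extension for additional arguments can be sketched as follows;
the three words are arbitrary placeholders:

@example
m4wrap(`one', `two', `three
')
@result{}
^D
@result{}one two three
@end example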
|
|
|
|
The saved input is only reread when the end of normal input is seen, and
|
|
not if @code{m4exit} is used to exit @code{m4}.
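
A minimal sketch of the second case (the saved text is a placeholder):

@example
m4wrap(`This text is never output.
')
@result{}
m4exit
@end example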
|
|
|
|
@comment FIXME: this contradicts POSIX, which requires that "If the
|
|
@comment m4wrap macro is used multiple times, the arguments specified
|
|
@comment shall be processed in the order in which the m4wrap macros were
|
|
@comment processed."
|
|
It is safe to call @code{m4wrap} from saved text, but then the order in
|
|
which the saved text is reread is undefined. If @code{m4wrap} is not used
|
|
recursively, the saved pieces of text are reread in the reverse of the order
|
|
in which they were saved (LIFO---last in, first out). However, this
|
|
behavior is likely to change in a future release, to match
|
|
POSIX, so you should not depend on this order.
|
|
|
|
It is possible to emulate POSIX behavior even
|
|
with older versions of GNU M4 by including the file
|
|
@file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4} from the
|
|
distribution:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
undivert(`wrapfifo.m4')dnl
|
|
@result{}dnl Redefine m4wrap to have FIFO semantics.
|
|
@result{}define(`_m4wrap_level', `0')dnl
|
|
@result{}define(`m4wrap',
|
|
@result{}`ifdef(`m4wrap'_m4wrap_level,
|
|
@result{} `define(`m4wrap'_m4wrap_level,
|
|
@result{} defn(`m4wrap'_m4wrap_level)`$1')',
|
|
@result{} `builtin(`m4wrap', `define(`_m4wrap_level',
|
|
@result{} incr(_m4wrap_level))dnl
|
|
@result{}m4wrap'_m4wrap_level)dnl
|
|
@result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
|
|
include(`wrapfifo.m4')
|
|
@result{}
|
|
m4wrap(`a`'m4wrap(`c
|
|
', `d')')m4wrap(`b')
|
|
@result{}
|
|
^D
|
|
@result{}abc
|
|
@end example
|
|
|
|
It is likewise possible to emulate LIFO behavior without resorting to
|
|
the GNU M4 extension of @code{builtin}, by including the file
|
|
@file{m4-@value{VERSION}/@/examples/@/wraplifo.m4} from the
|
|
distribution. (Unfortunately, both examples shown here share some
|
|
subtle bugs. See if you can find and correct them; or @pxref{Improved
|
|
m4wrap, , Answers}).
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
undivert(`wraplifo.m4')dnl
|
|
@result{}dnl Redefine m4wrap to have LIFO semantics.
|
|
@result{}define(`_m4wrap_level', `0')dnl
|
|
@result{}define(`_m4wrap', defn(`m4wrap'))dnl
|
|
@result{}define(`m4wrap',
|
|
@result{}`ifdef(`m4wrap'_m4wrap_level,
|
|
@result{} `define(`m4wrap'_m4wrap_level,
|
|
@result{} `$1'defn(`m4wrap'_m4wrap_level))',
|
|
@result{} `_m4wrap(`define(`_m4wrap_level', incr(_m4wrap_level))dnl
|
|
@result{}m4wrap'_m4wrap_level)dnl
|
|
@result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
|
|
include(`wraplifo.m4')
|
|
@result{}
|
|
m4wrap(`a`'m4wrap(`c
|
|
', `d')')m4wrap(`b')
|
|
@result{}
|
|
^D
|
|
@result{}bac
|
|
@end example
|
|
|
|
Here is an example of implementing a factorial function using
|
|
@code{m4wrap}:
|
|
|
|
@example
|
|
define(`f', `ifelse(`$1', `0', `Answer: 0!=1
|
|
', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
|
|
', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
|
|
@result{}
|
|
f(`10')
|
|
@result{}
|
|
^D
|
|
@result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
|
|
@end example
|
|
|
|
Invocations of @code{m4wrap} at the same recursion level are
|
|
concatenated and rescanned as usual:
|
|
|
|
@example
|
|
define(`aa', `AA
|
|
')
|
|
@result{}
|
|
m4wrap(`a')m4wrap(`a')
|
|
@result{}
|
|
^D
|
|
@result{}AA
|
|
@end example
|
|
|
|
@noindent
|
|
however, the transition between recursion levels behaves like an end of
|
|
file condition between two input files.
|
|
|
|
@comment status: 1
|
|
@example
|
|
m4wrap(`m4wrap(`)')len(abc')
|
|
@result{}
|
|
^D
|
|
@error{}m4:stdin:1: ERROR: end of file in argument list
|
|
@end example
|
|
|
|
@node File Inclusion
|
|
@chapter File inclusion
|
|
|
|
@cindex file inclusion
|
|
@cindex inclusion, of files
|
|
@code{m4} allows you to include named files at any point in the input.
|
|
|
|
@menu
|
|
* Include:: Including named files
|
|
* Search Path:: Searching for include files
|
|
@end menu
|
|
|
|
@node Include
|
|
@section Including named files
|
|
|
|
There are two builtin macros in @code{m4} for including files:
|
|
|
|
@deffn Builtin include (@var{file})
|
|
@deffnx Builtin sinclude (@var{file})
|
|
Both macros cause the file named @var{file} to be read by
|
|
@code{m4}. When the end of the file is reached, input is resumed from
|
|
the previous input file.
|
|
|
|
The expansion of @code{include} and @code{sinclude} is therefore the
|
|
contents of @var{file}.
|
|
|
|
If @var{file} does not exist, is a directory, or cannot otherwise be
|
|
read, the expansion is void,
|
|
and @code{include} will fail with an error while @code{sinclude} is
|
|
silent. The empty string counts as a file that does not exist.
|
|
|
|
The macros @code{include} and @code{sinclude} are recognized only with
|
|
parameters.
|
|
@end deffn
|
|
|
|
@comment status: 1
|
|
@example
|
|
include(`none')
|
|
@error{}m4:stdin:1: cannot open `none': No such file or directory
|
|
@result{}
|
|
include()
|
|
@error{}m4:stdin:2: cannot open `': No such file or directory
|
|
@result{}
|
|
sinclude(`none')
|
|
@result{}
|
|
sinclude()
|
|
@result{}
|
|
@end example
|
|
|
|
The rest of this section assumes that @code{m4} is invoked with the
|
|
@option{-I} option (@pxref{Preprocessor features, , Invoking m4})
|
|
pointing to the @file{m4-@value{VERSION}/@/examples}
|
|
directory shipped as part of the GNU @code{m4} package. The
|
|
file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution
|
|
contains the lines:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{cat examples/incl.m4}
|
|
@result{}Include file start
|
|
@result{}foo
|
|
@result{}Include file end
|
|
@end example
|
|
|
|
Normally file inclusion is used to insert the contents of a file
|
|
into the input stream. The contents of the file will be read by
|
|
@code{m4} and macro calls in the file will be expanded:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
define(`foo', `FOO')
|
|
@result{}
|
|
include(`incl.m4')
|
|
@result{}Include file start
|
|
@result{}FOO
|
|
@result{}Include file end
|
|
@result{}
|
|
@end example
|
|
|
|
The fact that @code{include} and @code{sinclude} expand to the contents
|
|
of the file can be used to define macros that operate on entire files.
|
|
Here is an example, which defines @samp{bar} to expand to the contents
|
|
of @file{incl.m4}:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
define(`bar', include(`incl.m4'))
|
|
@result{}
|
|
This is `bar': >>bar<<
|
|
@result{}This is bar: >>Include file start
|
|
@result{}foo
|
|
@result{}Include file end
|
|
@result{}<<
|
|
@end example
|
|
|
|
This use of @code{include} is not trivial, though, as files can contain
|
|
quotes, commas, and parentheses, which can interfere with the way the
|
|
@code{m4} parser works. GNU @code{m4} seamlessly concatenates
|
|
the file contents with the next character, even if the included file
|
|
ended in the middle of a comment, string, or macro call. These
|
|
conditions are treated as end-of-file errors only when they occur in
files named as input on the command line.
|
|
|
|
In GNU @code{m4}, an alternative method of reading files is
|
|
using @code{undivert} (@pxref{Undivert}) on a named file.
|
|
|
|
@ignore
|
|
@comment Test that include(`file/') detects that file is not a
|
|
@comment directory; we can assume that the current directory contains a
|
|
@comment Makefile. mingw fails with EINVAL rather than ENOTDIR.
|
|
|
|
@comment status: 1
|
|
@comment xerr: ignore
|
|
@example
|
|
include(`Makefile/')
|
|
@error{}m4:stdin:1: cannot open `Makefile/': Not a directory
|
|
@result{}
|
|
@end example
|
|
|
|
@comment POSIX allows, but doesn't require, failure on reading
|
|
@comment directories. But since they aren't text files, it never makes
|
|
@comment sense, so we globally forbid it even if fopen doesn't. mingw
|
|
@comment fails with EACCES rather than EISDIR.
|
|
|
|
@comment status: 1
|
|
@comment xerr: ignore
|
|
@example
|
|
include(`.')
|
|
@error{}m4:stdin:1: cannot open `.': Is a directory
|
|
@result{}
|
|
@end example
|
|
|
|
@comment Meanwhile, ignore errors with sinclude.
|
|
|
|
@example
|
|
sinclude(`Makefile/')
|
|
@result{}
|
|
sinclude(`.')
|
|
@result{}
|
|
@end example
|
|
@end ignore
|
|
|
|
@node Search Path
|
|
@section Searching for include files
|
|
|
|
@cindex search path for included files
|
|
@cindex included files, search path for
|
|
@cindex GNU extensions
|
|
GNU @code{m4} allows included files to be found in other directories
|
|
than the current working directory.
|
|
|
|
@cindex @env{M4PATH}
|
|
If the @option{--prepend-include} or @option{-B} command-line option was
|
|
provided (@pxref{Preprocessor features, , Invoking m4}), those
|
|
directories are searched first, in the reverse of the order in which those options were
|
|
listed on the command line. Then @code{m4} looks in the current working
|
|
directory. Next come the directories specified with the
|
|
@option{--include} or @option{-I} option, in the order found on the
|
|
command line. Finally, if the @env{M4PATH} environment variable is set,
|
|
it is expected to contain a colon-separated list of directories, which
|
|
will be searched in order.
|
|
|
|
If the automatic search for include-files causes trouble, the @samp{p}
|
|
debug flag (@pxref{Debug Levels}) can help isolate the problem.
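
As an illustration of the search order described above, consider the
following invocation. The directory names and the file
@file{input.m4} are hypothetical, chosen only for this sketch:

@comment ignore
@example
$ @kbd{M4PATH=/opt/m4:/srv/m4 m4 -dp -B b1 -B b2 -I i1 -I i2 input.m4}
@end example

@noindent
Assuming the rules given earlier in this section, an
@samp{include(`foo.m4')} inside @file{input.m4} would look for
@file{foo.m4} in @file{b2}, then @file{b1}, then the current working
directory, then @file{i1}, @file{i2}, @file{/opt/m4}, and finally
@file{/srv/m4}, stopping at the first directory that contains the file.
The @option{-dp} option enables the @samp{p} debug flag, which may help
confirm which directory was actually used.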
|
|
|
|
@node Diversions
|
|
@chapter Diverting and undiverting output
|
|
|
|
@cindex deferring output
|
|
Diversions are a way of temporarily saving output. The output of
|
|
@code{m4} can at any time be diverted to a temporary file, and be
|
|
reinserted into the output stream (@dfn{undiverted}) at a later
time.
|
|
|
|
@cindex @env{TMPDIR}
|
|
Numbered diversions are counted from 0 upwards, diversion number 0
|
|
being the normal output stream. GNU
|
|
@code{m4} tries to keep diversions in memory. However, there is a
|
|
limit to the overall memory usable by all diversions taken together
|
|
(512K, currently). When this maximum is about to be exceeded,
|
|
a temporary file is opened to receive the contents of the biggest
|
|
diversion still in memory, freeing this memory for other diversions.
|
|
When creating the temporary file, @code{m4} honors the value of the
|
|
environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
|
|
Thus, the amount of available disk space provides the only real limit on
|
|
the number and aggregate size of diversions.
|
|
|
|
@ignore
|
|
@comment We need to test spilled diversions, but don't need to expose
|
|
@comment this highly repetitive test in the manual.
|
|
|
|
@example
|
|
divert(`-1')define(`f', `.')
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
divert`'dnl
|
|
len(f)
|
|
@result{}1048576
|
|
divert(`1')
|
|
f
|
|
divert(`2')
|
|
f
|
|
divert(`-1')undivert
|
|
divert(`1')bye
|
|
^D
|
|
@result{}bye
|
|
@end example
|
|
|
|
@comment Another test of spilled diversions.
|
|
|
|
@example
|
|
divert(`-1')define(`f', `.')
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
define(`f', defn(`f')defn(`f'))
|
|
divert`'dnl
|
|
len(f)
|
|
@result{}1048576
|
|
divert(`1')
|
|
f
|
|
m4exit
|
|
@end example
|
|
|
|
@comment Catch regression in 1.4.10 with spilled diversions.
|
|
|
|
@example
|
|
ifdef(`__unix__', ,
|
|
`errprint(` skipping: syscmd does not have unix semantics
|
|
')m4exit(`77')')dnl
|
|
changequote(`[', `]')dnl
|
|
syscmd([echo 'divert(1)hi
|
|
format(%1000000d, 1)' | ']__program__[' | sed -n 1p])dnl
|
|
@result{}hi
|
|
sysval
|
|
@result{}0
|
|
@end example
|
|
|
|
@comment Avoid quadratic copying time when transferring diversions;
|
|
@comment test both in-memory and spilled to file.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`forloop2.m4')dnl
|
|
divert(`1')format(`%10000s', `')dnl
|
|
forloop(`i', `1', `10000',
|
|
`divert(incr(i))undivert(i)')dnl
|
|
divert(`9001')format(`%1000000s', `')dnl
|
|
forloop(`i', `9001', `10000',
|
|
`divert(incr(i))undivert(i)')dnl
|
|
divert(`-1')undivert
|
|
@end example
|
|
@end ignore
|
|
|
|
Diversions make it possible to generate output in a different order than
|
|
the input was read. This makes it possible to implement a topological
sort of dependencies. For example, GNU Autoconf makes use of
|
|
diversions under the hood to ensure that the expansion of a prerequisite
|
|
macro appears in the output prior to the expansion of a dependent macro,
|
|
regardless of which order the two macros were invoked in the user's
|
|
input file.
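
As a minimal sketch of this reordering, the following input writes a
body before its header, yet the header comes out first. The choice of
diversions 1 and 2 and the sample text are arbitrary; the builtin
@code{divert} is described in the next section.

@example
divert(`2')dnl
body text
divert(`1')dnl
header text
divert`'dnl
^D
@result{}header text
@result{}body text
@end example

@noindent
At the end of input, diversion 1 is flushed before diversion 2, so the
output order follows the diversion numbers rather than the input order.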
|
|
|
|
@menu
|
|
* Divert:: Diverting output
|
|
* Undivert:: Undiverting output
|
|
* Divnum:: Diversion numbers
|
|
* Cleardivert:: Discarding diverted text
|
|
@end menu
|
|
|
|
@node Divert
|
|
@section Diverting output
|
|
|
|
@cindex diverting output to files
|
|
@cindex output, diverting to files
|
|
@cindex files, diverting output to
|
|
Output is diverted using @code{divert}:
|
|
|
|
@deffn Builtin divert (@dvar{number, 0})
|
|
The current diversion is changed to @var{number}. If @var{number} is left
|
|
out or empty, it is assumed to be zero. If @var{number} cannot be
|
|
parsed, the diversion is unchanged.
|
|
|
|
The expansion of @code{divert} is void.
|
|
@end deffn
|
|
|
|
When all the @code{m4} input has been processed, all existing
|
|
diversions are automatically undiverted, in numerical order.
|
|
|
|
@example
|
|
divert(`1')
|
|
This text is diverted.
|
|
divert
|
|
@result{}
|
|
This text is not diverted.
|
|
@result{}This text is not diverted.
|
|
^D
|
|
@result{}
|
|
@result{}This text is diverted.
|
|
@end example
|
|
|
|
Several calls of @code{divert} with the same argument do not overwrite
|
|
the previous diverted text, but append to it. Diversions are printed
|
|
after any wrapped text is expanded.
|
|
|
|
@example
|
|
define(`text', `TEXT')
|
|
@result{}
|
|
divert(`1')`diverted text.'
|
|
divert
|
|
@result{}
|
|
m4wrap(`Wrapped text precedes ')
|
|
@result{}
|
|
^D
|
|
@result{}Wrapped TEXT precedes diverted text.
|
|
@end example
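
The appending behavior can also be seen in isolation. In this small
sketch (the sample words are arbitrary), diversion 1 is selected twice,
and the second piece of text is added after the first instead of
replacing it:

@example
divert(`1')first
divert(`2')second
divert(`1')third
divert`'dnl
^D
@result{}first
@result{}third
@result{}second
@end example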
|
|
|
|
@cindex discarding input
|
|
@cindex input, discarding
|
|
If output is diverted to a negative diversion, it is simply discarded.
|
|
This can be used to suppress unwanted output. A common example of
|
|
unwanted output is the trailing newlines after macro definitions. Here
|
|
is a common programming idiom in @code{m4} for avoiding them.
|
|
|
|
@example
|
|
divert(`-1')
|
|
define(`foo', `Macro `foo'.')
|
|
define(`bar', `Macro `bar'.')
|
|
divert
|
|
@result{}
|
|
@end example
|
|
|
|
@cindex GNU extensions
|
|
Traditional implementations only supported ten diversions. But as a
|
|
GNU extension, diversion numbers can be as large as positive
|
|
integers will allow, rather than treating a multi-digit diversion number
|
|
as a request to discard text.
|
|
|
|
@example
|
|
divert(eval(`1<<28'))world
|
|
divert(`2')hello
|
|
^D
|
|
@result{}hello
|
|
@result{}world
|
|
@end example
|
|
|
|
Note that @code{divert} is an English word, but also an active macro
|
|
without arguments. When processing plain text, the word might appear in
|
|
normal text and be unintentionally swallowed as a macro invocation. One
|
|
way to avoid this is to use the @option{-P} option to rename all
|
|
builtins (@pxref{Operation modes, , Invoking m4}). Another is to write
|
|
a wrapper that requires a parameter to be recognized.
|
|
|
|
@example
|
|
We decided to divert the stream for irrigation.
|
|
@result{}We decided to the stream for irrigation.
|
|
define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
|
|
@result{}
|
|
divert(`-1')
|
|
Ignored text.
|
|
divert(`0')
|
|
@result{}
|
|
We decided to divert the stream for irrigation.
|
|
@result{}We decided to divert the stream for irrigation.
|
|
@end example
|
|
|
|
@node Undivert
|
|
@section Undiverting output
|
|
|
|
Diverted text can be undiverted explicitly using the builtin
|
|
@code{undivert}:
|
|
|
|
@deffn Builtin undivert (@ovar{diversions@dots{}})
|
|
Undiverts the numeric @var{diversions} given by the arguments, in the
|
|
order given. If no arguments are supplied, all diversions are
|
|
undiverted, in numerical order.
|
|
|
|
@cindex file inclusion
|
|
@cindex inclusion, of files
|
|
@cindex GNU extensions
|
|
As a GNU extension, @var{diversions} may contain non-numeric
|
|
strings, which are treated as the names of files to copy into the output
|
|
without expansion. A warning is issued if a file could not be opened.
|
|
|
|
The expansion of @code{undivert} is void.
|
|
@end deffn
|
|
|
|
@example
|
|
divert(`1')
|
|
This text is diverted.
|
|
divert
|
|
@result{}
|
|
This text is not diverted.
|
|
@result{}This text is not diverted.
|
|
undivert(`1')
|
|
@result{}
|
|
@result{}This text is diverted.
|
|
@result{}
|
|
@end example
|
|
|
|
Notice the last two blank lines. One of them comes from the newline
|
|
following @code{undivert}, the other from the newline that followed the
|
|
@code{divert}! A diversion often starts with a blank line like this.
|
|
|
|
When diverted text is undiverted, it is @emph{not} reread by @code{m4},
|
|
but rather copied directly to the current output, and it is therefore
|
|
not an error to undivert into a diversion. Undiverting the empty string
|
|
is the same as specifying diversion 0; in either case nothing happens
|
|
since the output has already been flushed.
|
|
|
|
@example
|
|
divert(`1')diverted text
|
|
divert
|
|
@result{}
|
|
undivert()
|
|
@result{}
|
|
undivert(`0')
|
|
@result{}
|
|
undivert
|
|
@result{}diverted text
|
|
@result{}
|
|
divert(`1')more
|
|
divert(`2')undivert(`1')diverted text`'divert
|
|
@result{}
|
|
undivert(`1')
|
|
@result{}
|
|
undivert(`2')
|
|
@result{}more
|
|
@result{}diverted text
|
|
@end example
|
|
|
|
When a diversion has been undiverted, the diverted text is discarded,
|
|
and it is not possible to bring back diverted text more than once.
|
|
|
|
@example
|
|
divert(`1')
|
|
This text is diverted first.
|
|
divert(`0')undivert(`1')dnl
|
|
@result{}
|
|
@result{}This text is diverted first.
|
|
undivert(`1')
|
|
@result{}
|
|
divert(`1')
|
|
This text is also diverted but not appended.
|
|
divert(`0')undivert(`1')dnl
|
|
@result{}
|
|
@result{}This text is also diverted but not appended.
|
|
@end example
|
|
|
|
Attempts to undivert the current diversion are silently ignored. Thus,
|
|
when the current diversion is not 0, the current diversion does not get
|
|
rearranged among the other diversions.
|
|
|
|
@example
|
|
divert(`1')one
|
|
divert(`2')two
|
|
divert(`3')three
|
|
divert(`2')undivert`'dnl
|
|
divert`'undivert`'dnl
|
|
@result{}two
|
|
@result{}one
|
|
@result{}three
|
|
@end example
|
|
|
|
@cindex GNU extensions
|
|
@cindex file inclusion
|
|
@cindex inclusion, of files
|
|
GNU @code{m4} allows named files to be undiverted. Given a
|
|
non-numeric argument, the contents of the file named will be copied,
|
|
uninterpreted, to the current output. This complements the builtin
|
|
@code{include} (@pxref{Include}). To illustrate the difference, assume
|
|
the file @file{foo} contains:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{cat foo}
|
|
bar
|
|
@end example
|
|
|
|
@noindent
|
|
then
|
|
|
|
@example
|
|
define(`bar', `BAR')
|
|
@result{}
|
|
undivert(`foo')
|
|
@result{}bar
|
|
@result{}
|
|
include(`foo')
|
|
@result{}BAR
|
|
@result{}
|
|
@end example
|
|
|
|
If the file is not found (or cannot be read), an error message is
|
|
issued, and the expansion is void. It is possible to intermix files
|
|
and diversion numbers.
|
|
|
|
@example
|
|
divert(`1')diversion one
|
|
divert(`2')undivert(`foo')dnl
|
|
divert(`3')diversion three
|
|
divert`'dnl
|
|
undivert(`1', `2', `foo', `3')dnl
|
|
@result{}diversion one
|
|
@result{}bar
|
|
@result{}bar
|
|
@result{}diversion three
|
|
@end example
|
|
|
|
@node Divnum
|
|
@section Diversion numbers
|
|
|
|
@cindex diversion numbers
|
|
The current diversion is tracked by the builtin @code{divnum}:
|
|
|
|
@deffn Builtin divnum
|
|
Expands to the number of the current diversion.
|
|
@end deffn
|
|
|
|
@example
|
|
Initial divnum
|
|
@result{}Initial 0
|
|
divert(`1')
|
|
Diversion one: divnum
|
|
divert(`2')
|
|
Diversion two: divnum
|
|
^D
|
|
@result{}
|
|
@result{}Diversion one: 1
|
|
@result{}
|
|
@result{}Diversion two: 2
|
|
@end example
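
Because @code{divnum} expands to the number of the current diversion, a
macro can record it, switch to another diversion for a while, and then
switch back. The following is only a sketch: the macro name
@code{note}, the helper name @code{_n}, and the use of diversion 9 are
assumptions made for this illustration, not part of @code{m4}.

@example
define(`note', `pushdef(`_n', divnum)divert(`9')$1
divert(_n)popdef(`_n')')
@result{}
note(`remember this')Main text
@result{}Main text
^D
@result{}remember this
@end example

@noindent
The argument of @code{note} is collected in diversion 9 and appears only
at the end of input, while the text following the call continues in the
diversion that was current before the call.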
|
|
|
|
@node Cleardivert
|
|
@section Discarding diverted text
|
|
|
|
@cindex discarding diverted text
|
|
@cindex diverted text, discarding
|
|
Often it is not known, when output is diverted, whether the diverted
|
|
text is actually needed. Since all non-empty diversions are brought back
|
|
on the main output stream when the end of input is seen, a method of
|
|
discarding a diversion is needed. If all diversions should be
|
|
discarded, the easiest way is to end the input to @code{m4} with
|
|
@samp{divert(`-1')} followed by an explicit @samp{undivert}:
|
|
|
|
@example
|
|
divert(`1')
|
|
Diversion one: divnum
|
|
divert(`2')
|
|
Diversion two: divnum
|
|
divert(`-1')
|
|
undivert
|
|
^D
|
|
@end example
|
|
|
|
@noindent
|
|
No output is produced at all.
|
|
|
|
Clearing selected diversions can be done with the following macro:
|
|
|
|
@deffn Composite cleardivert (@ovar{diversions@dots{}})
|
|
Discard the contents of each of the listed numeric @var{diversions}.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`cleardivert',
|
|
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
|
|
@result{}
|
|
@end example
|
|
|
|
It is called just like @code{undivert}, but the effect is to clear the
|
|
diversions given by the arguments. (This macro has a nasty bug! You
|
|
should try to see if you can find it and correct it; or @pxref{Improved
|
|
cleardivert, , Answers}).
|
|
|
|
@node Text handling
|
|
@chapter Macros for text handling
|
|
|
|
There are a number of builtins in @code{m4} for manipulating text in
|
|
various ways, extracting substrings, searching, substituting, and so on.
|
|
|
|
@menu
|
|
* Len:: Calculating length of strings
|
|
* Index macro:: Searching for substrings
|
|
* Regexp:: Searching for regular expressions
|
|
* Substr:: Extracting substrings
|
|
* Translit:: Translating characters
|
|
* Patsubst:: Substituting text by regular expression
|
|
* Format:: Formatting strings (printf-like)
|
|
@end menu
|
|
|
|
@node Len
|
|
@section Calculating length of strings
|
|
|
|
@cindex length of strings
|
|
@cindex strings, length of
|
|
The length of a string can be calculated by @code{len}:
|
|
|
|
@deffn Builtin len (@var{string})
|
|
Expands to the length of @var{string}, as a decimal number.
|
|
|
|
The macro @code{len} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
len()
|
|
@result{}0
|
|
len(`abcdef')
|
|
@result{}6
|
|
@end example
|
|
|
|
@node Index macro
|
|
@section Searching for substrings
|
|
|
|
@cindex substrings, locating
|
|
Searching for substrings is done with @code{index}:
|
|
|
|
@deffn Builtin index (@var{string}, @var{substring})
|
|
Expands to the index of the first occurrence of @var{substring} in
|
|
@var{string}. The first character in @var{string} has index 0. If
|
|
@var{substring} does not occur in @var{string}, @code{index} expands to
|
|
@samp{-1}.
|
|
|
|
The macro @code{index} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
index(`gnus, gnats, and armadillos', `nat')
|
|
@result{}7
|
|
index(`gnus, gnats, and armadillos', `dag')
|
|
@result{}-1
|
|
@end example
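
Since a failed search yields @samp{-1}, @code{index} combines naturally
with @code{ifelse} to test whether one string contains another. The
macro name @code{contains} below is hypothetical and serves only as a
sketch of the idea:

@example
define(`contains', `ifelse(index(`$1', `$2'), `-1', `no', `yes')')
@result{}
contains(`gnus, gnats, and armadillos', `gnat')
@result{}yes
contains(`gnus, gnats, and armadillos', `emu')
@result{}no
@end example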
|
|
|
|
Omitting @var{substring} evokes a warning, but still produces output;
|
|
contrast this with an empty @var{substring}.
|
|
|
|
@example
|
|
index(`abc')
|
|
@error{}m4:stdin:1: Warning: too few arguments to builtin `index'
|
|
@result{}0
|
|
index(`abc', `')
|
|
@result{}0
|
|
index(`abc', `b')
|
|
@result{}1
|
|
@end example
|
|
|
|
@ignore
|
|
@comment Expose a bug in the strstr() algorithm present in glibc
|
|
@comment 2.9 through 2.12 and in gnulib up to Sep 2010.
|
|
|
|
@example
|
|
index(`;:11-:12-:12-:12-:12-:12-:12-:12-:12.:12.:12.:12.:12.:12.:12.:12.:12-',
|
|
`:12-:12-:12-:12-:12-:12-:12-:12-')
|
|
@result{}-1
|
|
@end example
|
|
|
|
@comment Expose a bug in the gnulib replacement strstr() algorithm
|
|
@comment present from Jun 2010 to Feb 2011, including m4 1.4.15.
|
|
|
|
@example
|
|
index(`..wi.d.', `.d.')
|
|
@result{}4
|
|
@end example
|
|
@end ignore
|
|
|
|
@node Regexp
|
|
@section Searching for regular expressions
|
|
|
|
@cindex basic regular expressions
|
|
@cindex regular expressions
|
|
@cindex expressions, regular
|
|
@cindex GNU extensions
|
|
Searching for regular expressions is done with the builtin
|
|
@code{regexp}:
|
|
|
|
@deffn Builtin regexp (@var{string}, @var{regexp}, @ovar{replacement})
|
|
Searches for @var{regexp} in @var{string}. The syntax for regular
|
|
expressions is closest to older GNU Emacs (grouping via @samp{\(} and
|
|
@samp{\)}, alternation via @samp{\|}, one-or-more matching via @samp{+},
|
|
zero-or-one matching via @samp{?}), although for backwards compatibility
|
|
reasons, M4 1.4.x does not enable intervals (that is, both @samp{@{} and
|
|
@samp{\@{} match a literal @samp{@{}), nor character classes (that is,
|
|
@samp{[[:alpha:]]} matches a sequence of one of six bytes followed by a
|
|
literal close bracket, rather than all letters, the same as
|
|
@samp{[[ahlp:]} followed by @samp{]}).
|
|
@ifnothtml
|
|
@xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs
|
|
Manual}.
|
|
@end ifnothtml
|
|
@ifhtml
|
|
See
|
|
@uref{https://www.gnu.org/@/software/@/emacs/@/manual/@/emacs.html#Regexps,
|
|
Syntax of Regular Expressions} in the GNU Emacs Manual.
|
|
@end ifhtml
|
|
Support for intervals, character classes, non-greedy matches, or even
|
|
alternative regular expression syntax (such as POSIX Basic or Extended
|
|
Regular Expressions) is likely to be added in a future version of GNU
|
|
M4.
|
|
|
|
If @var{replacement} is omitted, @code{regexp} expands to the index of
|
|
the first match of @var{regexp} in @var{string}. If @var{regexp} does
|
|
not match anywhere in @var{string}, it expands to -1.
|
|
|
|
If @var{replacement} is supplied, and there was a match, @code{regexp}
|
|
changes the expansion to this argument, with @samp{\@var{n}} substituted
|
|
by the text matched by the @var{n}th parenthesized sub-expression of
|
|
@var{regexp}, up to nine sub-expressions. The escape @samp{\&} is
|
|
replaced by the text of the entire regular expression matched. For
|
|
all other characters, @samp{\} treats the next character literally. A
|
|
warning is issued if there were fewer sub-expressions than the
|
|
@samp{\@var{n}} requested, or if there is a trailing @samp{\}. If there
|
|
was no match, @code{regexp} expands to the empty string.
|
|
|
|
The macro @code{regexp} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
regexp(`GNUs not Unix', `\<[a-z]\w+')
|
|
@result{}5
|
|
regexp(`GNUs not Unix', `\<Q\w*')
|
|
@result{}-1
|
|
regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
|
|
@result{}*** Unix *** nix ***
|
|
regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
|
|
@result{}
|
|
@end example
|
|
|
|
Here are some more examples on the handling of backslash:
|
|
|
|
@example
|
|
regexp(`abc', `\(b\)', `\\\10\a')
|
|
@result{}\b0a
|
|
regexp(`abc', `b', `\1\')
|
|
@error{}m4:stdin:2: Warning: sub-expression 1 not present
|
|
@error{}m4:stdin:2: Warning: trailing \ ignored in replacement
|
|
@result{}
|
|
regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
|
|
@error{}m4:stdin:3: Warning: sub-expression 4 not present
|
|
@error{}m4:stdin:3: Warning: sub-expression 5 not present
|
|
@error{}m4:stdin:3: Warning: sub-expression 6 not present
|
|
@result{}c
|
|
@end example
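
The @var{replacement} argument is also a convenient way to pull fields
out of a string. In this sketch the input string and the labels
@samp{major} and @samp{minor} are arbitrary; the two parenthesized
sub-expressions capture the first two runs of digits separated by a
period:

@example
regexp(`version 1.4.19', `\([0-9]+\)\.\([0-9]+\)', `major=\1 minor=\2')
@result{}major=1 minor=4
@end example

@noindent
Note that the text before and after the match is not part of the
expansion, since @code{regexp} expands only to the replacement.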
|
|
|
|
Omitting @var{regexp} evokes a warning, but still produces output;
|
|
contrast this with an empty @var{regexp} argument.
|
|
|
|
@example
|
|
regexp(`abc')
|
|
@error{}m4:stdin:1: Warning: too few arguments to builtin `regexp'
|
|
@result{}0
|
|
regexp(`abc', `')
|
|
@result{}0
|
|
regexp(`abc', `', `\\def')
|
|
@result{}\def
|
|
@end example
|
|
|
|
@node Substr
|
|
@section Extracting substrings
|
|
|
|
@cindex extracting substrings
|
|
@cindex substrings, extracting
|
|
Substrings are extracted with @code{substr}:
|
|
|
|
@deffn Builtin substr (@var{string}, @var{from}, @ovar{length})
|
|
Expands to the substring of @var{string}, which starts at index
|
|
@var{from}, and extends for @var{length} characters, or to the end of
|
|
@var{string}, if @var{length} is omitted. The starting index of a string
|
|
is always 0. The expansion is empty if there is an error parsing
|
|
@var{from} or @var{length}, if @var{from} is beyond the end of
|
|
@var{string}, or if @var{length} is negative.
|
|
|
|
The macro @code{substr} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
substr(`gnus, gnats, and armadillos', `6')
|
|
@result{}gnats, and armadillos
|
|
substr(`gnus, gnats, and armadillos', `6', `5')
|
|
@result{}gnats
|
|
@end example
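
Combining @code{substr} with @code{index} (@pxref{Index macro}) gives a
simple way to split a string at a delimiter. The macro name
@code{before} is hypothetical, and the sketch assumes the delimiter
contains no characters that need extra quoting:

@example
define(`before', `substr(`$1', `0', index(`$1', `$2'))')
@result{}
before(`key=value', `=')
@result{}key
before(`no delimiter', `=')
@result{}
@end example

@noindent
When the delimiter is absent, @code{index} expands to @samp{-1}, and the
negative @var{length} makes the expansion of @code{substr} empty, as
described above.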
|
|
|
|
Omitting @var{from} evokes a warning, but still produces output.
|
|
|
|
@example
|
|
substr(`abc')
|
|
@error{}m4:stdin:1: Warning: too few arguments to builtin `substr'
|
|
@result{}abc
|
|
substr(`abc',)
|
|
@error{}m4:stdin:2: empty string treated as 0 in builtin `substr'
|
|
@result{}abc
|
|
@end example
|
|
|
|
@node Translit
|
|
@section Translating characters
|
|
|
|
@cindex translating characters
|
|
@cindex characters, translating
|
|
Character translation is done with @code{translit}:
|
|
|
|
@deffn Builtin translit (@var{string}, @var{chars}, @ovar{replacement})
|
|
Expands to @var{string}, with each character that occurs in
|
|
@var{chars} translated into the character from @var{replacement} with
|
|
the same index.
|
|
|
|
If @var{replacement} is shorter than @var{chars}, the excess characters
|
|
of @var{chars} are deleted from the expansion; if @var{chars} is
|
|
shorter, the excess characters in @var{replacement} are silently
|
|
ignored. If @var{replacement} is omitted, all characters in
|
|
@var{string} that are present in @var{chars} are deleted from the
|
|
expansion. If a character appears more than once in @var{chars}, only
|
|
the first instance is used in making the translation. Only a single
|
|
translation pass is made, even if characters in @var{replacement} also
|
|
appear in @var{chars}.
|
|
|
|
As a GNU extension, both @var{chars} and @var{replacement} can
|
|
contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
|
|
letters) or @samp{0-9} (meaning all digits). To include a dash @samp{-}
|
|
in @var{chars} or @var{replacement}, place it first or last in the
|
|
entire string, or as the last character of a range. Back-to-back ranges
|
|
can share a common endpoint. It is not an error for the last character
|
|
in the range to be `smaller' than the first. In that case, the range
|
|
runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
|
|
The expansion of a range is dependent on the underlying encoding of
|
|
characters, so using ranges is not always portable between machines.
|
|
|
|
The macro @code{translit} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
translit(`GNUs not Unix', `A-Z')
|
|
@result{}s not nix
|
|
translit(`GNUs not Unix', `a-z', `A-Z')
|
|
@result{}GNUS NOT UNIX
|
|
translit(`GNUs not Unix', `A-Z', `z-a')
|
|
@result{}tmfs not fnix
|
|
translit(`+,-12345', `+--1-5', `<;>a-c-a')
|
|
@result{}<;>abcba
|
|
translit(`abcdef', `aabdef', `bcged')
|
|
@result{}bgced
|
|
@end example
|
|
|
|
In the @sc{ascii} encoding, the first example deletes all uppercase
|
|
letters, the second converts lowercase to uppercase, and the third
|
|
`mirrors' all uppercase letters, while converting them to lowercase.
|
|
The first two cases are by far the most common, even though they are not
|
|
portable to @sc{ebcdic} or other encodings. The fourth example shows a
|
|
range ending in @samp{-}, as well as back-to-back ranges. The final
|
|
example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
|
|
resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
|
|
@samp{e} are swapped, and the @samp{f} is discarded.
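
Ranges in both @var{chars} and @var{replacement} also make short work of
simple substitution ciphers. The following sketch defines a
hypothetical macro @code{rot13}, which rotates each letter by thirteen
places; like the examples above, it assumes the @sc{ascii} encoding:

@example
define(`rot13', `translit(`$1', `a-zA-Z', `n-za-mN-ZA-M')')
@result{}
rot13(`Hello')
@result{}Uryyb
@end example

@noindent
Because only a single translation pass is made, a letter that has been
rotated is not rotated a second time.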
|
|
|
|
@ignore
|
|
@comment No need to fight 8-bit characters, as it is difficult to get
|
|
@comment rendering right in both info and dvi, and examples like this
|
|
@comment do not work correctly with UTF-8 anyway since m4 is byte-oriented.
|
|
|
|
@example
|
|
translit(`«abc~', `~-»')
|
|
@result{}abc
|
|
@end example
|
|
|
|
@comment Stress test short arguments, since they use a different code
|
|
@comment path.
|
|
@example
|
|
translit(`abcdeabcde', `a')
|
|
@result{}bcdebcde
|
|
translit(`abcdeabcde', `ab')
|
|
@result{}cdecde
|
|
translit(`abcdeabcde', `a', `f')
|
|
@result{}fbcdefbcde
|
|
translit(`abcdeabcde', `a', `f')
|
|
@result{}fbcdefbcde
|
|
translit(`abcdeabcde', `a', `fg')
|
|
@result{}fbcdefbcde
|
|
translit(`abcdeabcde', `ab', `f')
|
|
@result{}fcdefcde
|
|
translit(`abcdeabcde', `ab', `fg')
|
|
@result{}fgcdefgcde
|
|
translit(`abcdeabcde', `ab', `ba')
|
|
@result{}bacdebacde
|
|
translit(`abcdeabcde', `e', `f')
|
|
@result{}abcdfabcdf
|
|
translit(`abc', `', `cde')
|
|
@result{}abc
|
|
translit(`', `a', `bc')
|
|
@result{}
|
|
@end example
|
|
@end ignore
|
|
|
|
Omitting @var{chars} evokes a warning, but still produces output.
|
|
|
|
@example
|
|
translit(`abc')
|
|
@error{}m4:stdin:1: Warning: too few arguments to builtin `translit'
|
|
@result{}abc
|
|
@end example
|
|
|
|
@node Patsubst
|
|
@section Substituting text by regular expression
|
|
|
|
@cindex basic regular expressions
|
|
@cindex regular expressions
|
|
@cindex expressions, regular
|
|
@cindex pattern substitution
|
|
@cindex substitution by regular expression
|
|
@cindex GNU extensions
|
|
Global substitution in a string is done by @code{patsubst}:
|
|
|
|
@deffn Builtin patsubst (@var{string}, @var{regexp}, @ovar{replacement})
|
|
Searches @var{string} for matches of @var{regexp}, and substitutes
|
|
@var{replacement} for each match. The syntax for regular expressions
|
|
is the same as in the @code{regexp} builtin (@pxref{Regexp}).
|
|
|
|
The parts of @var{string} that are not covered by any match of
|
|
@var{regexp} are copied to the expansion. Whenever a match is found, the
|
|
search proceeds from the end of the match, so a character from
|
|
@var{string} will never be substituted twice. If @var{regexp} matches a
|
|
string of zero length, the start position for the search is incremented,
|
|
to avoid infinite loops.
|
|
|
|
When a replacement is to be made, @var{replacement} is inserted into
|
|
the expansion, with @samp{\@var{n}} substituted by the text matched by
|
|
the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
|
|
nine sub-expressions. The escape @samp{\&} is replaced by the text of
|
|
the entire regular expression matched. For all other characters,
|
|
@samp{\} treats the next character literally. A warning is issued if
|
|
there were fewer sub-expressions than the @samp{\@var{n}} requested, or
|
|
if there is a trailing @samp{\}.
|
|
|
|
The @var{replacement} argument can be omitted, in which case the text
|
|
matched by @var{regexp} is deleted.
|
|
|
|
The macro @code{patsubst} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
patsubst(`GNUs not Unix', `^', `OBS: ')
|
|
@result{}OBS: GNUs not Unix
|
|
patsubst(`GNUs not Unix', `\<', `OBS: ')
|
|
@result{}OBS: GNUs OBS: not OBS: Unix
|
|
patsubst(`GNUs not Unix', `\w*', `(\&)')
|
|
@result{}(GNUs)() (not)() (Unix)()
|
|
patsubst(`GNUs not Unix', `\w+', `(\&)')
|
|
@result{}(GNUs) (not) (Unix)
|
|
patsubst(`GNUs not Unix', `[A-Z][a-z]+')
|
|
@result{}GN not@w{ }
|
|
patsubst(`GNUs not Unix', `not', `NOT\')
|
|
@error{}m4:stdin:6: Warning: trailing \ ignored in replacement
|
|
@result{}GNUs NOT Unix
|
|
@end example
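
Two further sketches show typical clean-up tasks; the sample strings are
arbitrary. The first collapses each run of spaces into a single space,
and the second replaces every character that is not an @sc{ascii} letter
or digit with an underscore:

@example
patsubst(`too    many   spaces', ` +', ` ')
@result{}too many spaces
patsubst(`foo-bar.baz', `[^a-zA-Z0-9]', `_')
@result{}foo_bar_baz
@end example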
|
|
|
|
Here is a slightly more realistic example, which capitalizes individual
|
|
words or whole sentences, by substituting calls of the macros
|
|
@code{upcase} and @code{downcase} into the strings.
|
|
|
|
@deffn Composite upcase (@var{text})
|
|
@deffnx Composite downcase (@var{text})
|
|
@deffnx Composite capitalize (@var{text})
|
|
Expand to @var{text}, but with capitalization changed: @code{upcase}
|
|
changes all letters to upper case, @code{downcase} changes all letters
|
|
to lower case, and @code{capitalize} changes the first character of each
|
|
word to upper case and the remaining characters to lower case.
|
|
@end deffn
|
|
|
|
First, an example of their usage, using implementations distributed in
|
|
@file{m4-@value{VERSION}/@/examples/@/capitalize.m4}.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`capitalize.m4')
|
|
@result{}
|
|
upcase(`GNUs not Unix')
|
|
@result{}GNUS NOT UNIX
|
|
downcase(`GNUs not Unix')
|
|
@result{}gnus not unix
|
|
capitalize(`GNUs not Unix')
|
|
@result{}Gnus Not Unix
|
|
@end example
|
|
|
|
Now for the implementation. There is a helper macro @code{_capitalize}
|
|
which puts only its first word in mixed case. Then @code{capitalize}
|
|
merely parses out the words, and replaces them with an invocation of
|
|
@code{_capitalize}. (As presented here, the @code{capitalize} macro has
|
|
some subtle flaws. You should try to see if you can find and correct
|
|
them; or @pxref{Improved capitalize, , Answers}).
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
undivert(`capitalize.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# upcase(text)
|
|
@result{}# downcase(text)
|
|
@result{}# capitalize(text)
|
|
@result{}# change case of text, simple version
|
|
@result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
|
|
@result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
|
|
@result{}define(`_capitalize',
|
|
@result{} `regexp(`$1', `^\(\w\)\(\w*\)',
|
|
@result{} `upcase(`\1')`'downcase(`\2')')')
|
|
@result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
|
|
@result{}divert`'dnl
|
|
@end example
|
|
|
|
While @code{regexp} replaces the whole input with the replacement as
|
|
soon as there is a match, @code{patsubst} replaces each
|
|
@emph{occurrence} of a match and preserves non-matching pieces:
|
|
|
|
@example
|
|
define(`patreg',
|
|
`patsubst($@@)
|
|
regexp($@@)')dnl
|
|
patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
|
|
@result{}bar FOO baz FOO
|
|
@result{}FOO
|
|
patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
|
|
@result{}bab abb 212
|
|
@result{}bab
|
|
@end example
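
As a further minimal sketch (the input strings are arbitrary
illustrations, not taken from the distributed examples), every match is
rewritten while the text between matches passes through unchanged;
@samp{\&} in the replacement stands for the matched text:

@example
patsubst(`hello   world', ` +', ` ')
@result{}hello world
patsubst(`a1b22c333', `[0-9]+', `<\&>')
@result{}a<1>b<22>c<333>
regexp(`a1b22c333', `[0-9]+', `<\&>')
@result{}<1>
@end example

The final line shows the contrast once more: @code{regexp} yields only
the replacement built from the first match.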
|
|
|
|
Omitting @var{regexp} evokes a warning, but still produces output;
|
|
contrast this with an empty @var{regexp} argument.
|
|
|
|
@example
|
|
patsubst(`abc')
|
|
@error{}m4:stdin:1: Warning: too few arguments to builtin `patsubst'
|
|
@result{}abc
|
|
patsubst(`abc', `')
|
|
@result{}abc
|
|
patsubst(`abc', `', `\\-')
|
|
@result{}\-a\-b\-c\-
|
|
@end example
|
|
|
|
@node Format
|
|
@section Formatting strings (printf-like)
|
|
|
|
@cindex formatted output
|
|
@cindex output, formatted
|
|
@cindex GNU extensions
|
|
Formatted output can be made with @code{format}:
|
|
|
|
@deffn Builtin format (@var{format-string}, @dots{})
|
|
Works much like the C function @code{printf}. The first argument
|
|
@var{format-string} can contain @samp{%} specifications which are
|
|
satisfied by additional arguments, and the expansion of @code{format} is
|
|
the formatted string.
|
|
|
|
The macro @code{format} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
Its use is best described by a few examples:
|
|
|
|
@comment This test is a bit fragile, if someone tries to port to a
|
|
@comment platform without infinity.
|
|
@example
|
|
define(`foo', `The brown fox jumped over the lazy dog')
|
|
@result{}
|
|
format(`The string "%s" uses %d characters', foo, len(foo))
|
|
@result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
|
|
format(`%*.*d', `-1', `-1', `1')
|
|
@result{}1
|
|
format(`%.0f', `56789.9876')
|
|
@result{}56790
|
|
len(format(`%-*X', `5000', `1'))
|
|
@result{}5000
|
|
ifelse(format(`%010F', `infinity'), `       INF', `success',
       format(`%010F', `infinity'), `  INFINITY', `success',
       format(`%010F', `infinity'))
|
|
@result{}success
|
|
ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success',
|
|
format(`%.1A', `1.999'), `0X2.0P+0', `success',
|
|
format(`%.1A', `1.999'))
|
|
@result{}success
|
|
format(`%g', `0xa.P+1')
|
|
@result{}20
|
|
@end example
|
|
|
|
Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
|
|
example shows how @code{format} can be used to produce tabular output.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`forloop.m4')
|
|
@result{}
|
|
forloop(`i', `1', `10', `format(`%6d squared is %10d
|
|
', i, eval(i**2))')
|
|
@result{}     1 squared is          1
@result{}     2 squared is          4
@result{}     3 squared is          9
@result{}     4 squared is         16
@result{}     5 squared is         25
@result{}     6 squared is         36
@result{}     7 squared is         49
@result{}     8 squared is         64
@result{}     9 squared is         81
@result{}    10 squared is        100
|
|
@result{}
|
|
@end example
|
|
|
|
The builtin @code{format} is modeled after the ANSI C @samp{printf}
|
|
function, and supports these @samp{%} specifiers: @samp{c}, @samp{s},
|
|
@samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A},
|
|
@samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and
|
|
@samp{%}; it supports field widths and precisions, and the flags
|
|
@samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}. For
|
|
integer specifiers, the length modifiers @samp{hh}, @samp{h}, and
@samp{l} are recognized, and for floating point specifiers, the length
modifier @samp{l} is recognized.  Items not yet supported include
|
|
positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C}
|
|
specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L} and @samp{ll}
|
|
modifiers, and any platform extensions available in the native
|
|
@code{printf}. For more details on the functioning of @code{printf},
|
|
see the C Library Manual, or the POSIX specification (for
|
|
example, @samp{%a} is supported even on platforms that haven't yet
|
|
implemented C99 hexadecimal floating point output natively).
|
|
|
|
Unrecognized specifiers result in a warning. It is anticipated that a
|
|
future release of GNU @code{m4} will support more specifiers,
|
|
and give better warnings when various problems such as overflow are
|
|
encountered. Likewise, escape sequences are not yet recognized.
|
|
|
|
@example
|
|
format(`%p', `0')
|
|
@error{}m4:stdin:1: Warning: unrecognized specifier in `%p'
|
|
@result{}
|
|
@end example
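
The following sketch exercises several of the supported flags, widths,
and precisions listed above.  The brackets are there only to make the
padding visible, and the argument values are arbitrary:

@example
format(`[%5d] [%-5d] [%05d]', `42', `42', `42')
@result{}[   42] [42   ] [00042]
format(`[%8.3f] [%+d] [%#x]', `3.14159', `7', `255')
@result{}[   3.142] [+7] [0xff]
format(`%x %X %o %c%c', `255', `255', `8', `72', `105')
@result{}ff FF 10 Hi
format(`[%-6s] [%.2s] 100%%', `ab', `abcdef')
@result{}[ab    ] [ab] 100%
@end example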
|
|
|
|
@ignore
|
|
@comment Expose a crash with a bad format string fixed in 1.4.15.
|
|
@comment Unfortunately, 8-bit bytes are hard to check for; but the
|
|
@comment exit status is enough to sniff the crash in broken versions.
|
|
|
|
@comment xerr: ignore
|
|
@example
|
|
format(`%'format(`%c', `128'))
|
|
@result{}
|
|
@end example
|
|
@end ignore
|
|
|
|
@node Arithmetic
|
|
@chapter Macros for doing arithmetic
|
|
|
|
@cindex arithmetic
|
|
@cindex integer arithmetic
|
|
Integer arithmetic is included in @code{m4}, with a C-like syntax. As
|
|
convenient shorthands, there are builtins for simple increment and
|
|
decrement operations.
|
|
|
|
@menu
|
|
* Incr:: Decrement and increment operators
|
|
* Eval:: Evaluating integer expressions
|
|
@end menu
|
|
|
|
@node Incr
|
|
@section Decrement and increment operators
|
|
|
|
@cindex decrement operator
|
|
@cindex increment operator
|
|
Increment and decrement of integers are supported using the builtins
|
|
@code{incr} and @code{decr}:
|
|
|
|
@deffn Builtin incr (@var{number})
|
|
@deffnx Builtin decr (@var{number})
|
|
Expand to the numerical value of @var{number}, incremented
or decremented, respectively, by one.  An empty @var{number} is treated
as 0, with a diagnostic; otherwise, the expansion is empty if
@var{number} could not be parsed.
|
|
|
|
The macros @code{incr} and @code{decr} are recognized only with
|
|
parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
incr(`4')
|
|
@result{}5
|
|
decr(`7')
|
|
@result{}6
|
|
incr()
|
|
@error{}m4:stdin:3: empty string treated as 0 in builtin `incr'
|
|
@result{}1
|
|
decr()
|
|
@error{}m4:stdin:4: empty string treated as 0 in builtin `decr'
|
|
@result{}-1
|
|
@end example
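
Because @code{incr} and @code{decr} expand to plain decimal numbers,
they combine naturally with @code{ifelse} to drive simple recursion.  A
minimal sketch, in which @code{countdown} is a hypothetical macro
introduced only for illustration:

@example
define(`countdown',
  `ifelse(`$1', `0', `liftoff', `$1 countdown(decr(`$1'))')')
@result{}
countdown(`3')
@result{}3 2 1 liftoff
incr(incr(`40'))
@result{}42
@end example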
|
|
|
|
It is possible to use @code{incr} to write a macro that produces a
|
|
different value each time it is invoked:
|
|
|
|
@deffn Composite counter (@ovar{seed})
|
|
When invoked with one argument, reset the counter to @var{seed} (where
|
|
an empty seed is silently treated like 0), without output. Without
|
|
arguments, output and increment the internal counter.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`_counter', `-1')dnl
|
|
define(`counter', `ifelse(`$#', `0', `define(`_$0', incr(_$0))_$0`'',
|
|
`define(`_$0', eval(`$1-1'))')')
|
|
@result{}
|
|
counter counter counter
|
|
@result{}0 1 2
|
|
counter(`42')
|
|
@result{}
|
|
counter counter
|
|
@result{}42 43
|
|
counter()counter
|
|
@result{}0
|
|
@end example
|
|
|
|
It is worth noting how the implementation treats a default empty string
as @code{0}: it stores one less than @var{seed} into the internal
counter @code{_counter}, using @code{eval} in a way that works even
when @code{$1} is blank.
|
|
|
|
@node Eval
|
|
@section Evaluating integer expressions
|
|
|
|
@cindex integer expression evaluation
|
|
@cindex evaluation, of integer expressions
|
|
@cindex expressions, evaluation of integer
|
|
Integer expressions are evaluated with @code{eval}:
|
|
|
|
@deffn Builtin eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
|
|
Expands to the value of @var{expression}. The expansion is empty
|
|
if a problem is encountered while parsing the arguments. If specified,
|
|
@var{radix} and @var{width} control the format of the output.
|
|
|
|
Calculations are done with 32-bit signed numbers. Overflow silently
|
|
results in wraparound. A warning is issued if division by zero is
|
|
attempted, or if @var{expression} could not be parsed.
|
|
|
|
Expressions can contain the following operators, listed in order of
|
|
decreasing precedence.
|
|
|
|
@table @samp
|
|
@item ()
|
|
Parentheses
|
|
@item + - ~ !
|
|
Unary plus and minus, and bitwise and logical negation
|
|
@item **
|
|
Exponentiation
|
|
@item * / %
|
|
Multiplication, division, and modulo
|
|
@item + -
|
|
Addition and subtraction
|
|
@item << >>
|
|
Shift left or right
|
|
@item > >= < <=
|
|
Relational operators
|
|
@item == !=
|
|
Equality operators
|
|
@item &
|
|
Bitwise and
|
|
@item ^
|
|
Bitwise exclusive-or
|
|
@item |
|
|
Bitwise or
|
|
@item &&
|
|
Logical and
|
|
@item ||
|
|
Logical or
|
|
@end table
|
|
|
|
The macro @code{eval} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
All binary operators, except exponentiation, are left associative. C
|
|
operators that perform variable assignment, such as @samp{+=} or
|
|
@samp{--}, are not implemented, since @code{eval} only operates on
|
|
constants, not variables. Attempting to use them results in an error.
|
|
However, since traditional implementations treated @samp{=} as an
|
|
undocumented alias for @samp{==} as opposed to an assignment operator,
|
|
this usage is supported as a special case. Be aware that a future
|
|
version of GNU M4 may support assignment semantics as an
|
|
extension when POSIX mode is not requested, and that using
|
|
@samp{=} to check equality is not portable.
|
|
|
|
@comment status: 1
|
|
@example
|
|
eval(`2 = 2')
|
|
@error{}m4:stdin:1: Warning: recommend ==, not =, for equality operator
|
|
@result{}1
|
|
eval(`++0')
|
|
@error{}m4:stdin:2: invalid operator in eval: ++0
|
|
@result{}
|
|
eval(`0 |= 1')
|
|
@error{}m4:stdin:3: invalid operator in eval: 0 |= 1
|
|
@result{}
|
|
eval(`(1 += 2)')
|
|
@error{}m4:stdin:4: invalid operator in eval: (1 += 2)
|
|
@result{}
|
|
@end example
|
|
|
|
Note that some older @code{m4} implementations use @samp{^} as an
|
|
alternate operator for the exponentiation, although POSIX
|
|
requires the C behavior of bitwise exclusive-or. The precedence of the
|
|
negation operators, @samp{~} and @samp{!}, was traditionally lower than
|
|
equality. The unary operators could not be used reliably more than once
|
|
on the same term without intervening parentheses. The traditional
|
|
precedence of the equality operators @samp{==} and @samp{!=} was
|
|
identical instead of lower than the relational operators such as
|
|
@samp{<}, even through GNU M4 1.4.8. Starting with version
|
|
1.4.9, GNU M4 correctly follows POSIX precedence
|
|
rules. M4 scripts designed to be portable between releases must be
|
|
aware that parentheses may be required to enforce C precedence rules.
|
|
Likewise, division by zero, even in the unused branch of a
|
|
short-circuiting operator, is not always well-defined in other
|
|
implementations.
|
|
|
|
Following are some examples where the current version of M4 follows C
|
|
precedence rules, but where older versions and some other
|
|
implementations of @code{m4} require explicit parentheses to get the
|
|
correct result:
|
|
|
|
@example
|
|
eval(`1 == 2 > 0')
|
|
@result{}1
|
|
eval(`(1 == 2) > 0')
|
|
@result{}0
|
|
eval(`! 0 * 2')
|
|
@result{}2
|
|
eval(`! (0 * 2)')
|
|
@result{}1
|
|
eval(`1 | 1 ^ 1')
|
|
@result{}1
|
|
eval(`(1 | 1) ^ 1')
|
|
@result{}0
|
|
eval(`+ + - ~ ! ~ 0')
|
|
@result{}1
|
|
eval(`1 / 0 + 1')
|
|
@error{}m4:stdin:8: divide by zero in eval: 1 / 0 + 1
|
|
@result{}
|
|
eval(`1 / 0 || 1 / 0')
|
|
@error{}m4:stdin:9: divide by zero in eval: 1 / 0 || 1 / 0
|
|
@result{}
|
|
eval(`2 || 1 / 0')
|
|
@result{}1
|
|
eval(`0 || 1 / 0')
|
|
@error{}m4:stdin:11: divide by zero in eval: 0 || 1 / 0
|
|
@result{}
|
|
eval(`0 && (1 % 0)')
|
|
@result{}0
|
|
eval(`2 && (1 % 0)')
|
|
@error{}m4:stdin:13: modulo by zero in eval: 2 && (1 % 0)
|
|
@result{}
|
|
@end example
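
The bitwise, shift, and logical operators from the table above behave
like their C counterparts on 32-bit signed values.  A few illustrative
cases, with operands chosen arbitrarily:

@example
eval(`0xf0 & 0x3c')
@result{}48
eval(`0xf0 | 0x0f')
@result{}255
eval(`0xff ^ 0x0f')
@result{}240
eval(`1 << 4')
@result{}16
eval(`~0')
@result{}-1
eval(`7 > 3 && 2 != 2')
@result{}0
@end example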
|
|
|
|
@cindex GNU extensions
|
|
As a GNU extension, the operator @samp{**} performs integral
|
|
exponentiation. The operator is right-associative, and if evaluated,
|
|
the exponent must be non-negative, and at least one of the arguments
|
|
must be non-zero, or a warning is issued.
|
|
|
|
@example
|
|
eval(`2 ** 3 ** 2')
|
|
@result{}512
|
|
eval(`(2 ** 3) ** 2')
|
|
@result{}64
|
|
eval(`0 ** 1')
|
|
@result{}0
|
|
eval(`2 ** 0')
|
|
@result{}1
|
|
eval(`0 ** 0')
|
|
@error{}m4:stdin:5: divide by zero in eval: 0 ** 0
@result{}
|
|
eval(`4 ** -2')
|
|
@error{}m4:stdin:6: negative exponent in eval: 4 ** -2
|
|
@result{}
|
|
@end example
|
|
|
|
@ignore
|
|
@comment This test makes sure the algorithm is fast, but does not
|
|
@comment need to be displayed as-is in the manual.
|
|
|
|
@example
|
|
eval(`3**2000000000')
|
|
@result{}632360961
|
|
define(`loop', `ifelse(`$1', `1000', `pass',
|
|
`loop(eval($1+!!(3**2000000000)))')')loop(0)
|
|
@result{}pass
|
|
define(`check', `ifelse(`$1', `100', `pass', `$2', eval(-3**$1),
|
|
`check(incr($1), eval(-3*$2))', `oops')')check(0, 1)
|
|
@result{}pass
|
|
@end example
|
|
@end ignore
|
|
|
|
Within @var{expression} (but not @var{radix} or @var{width}), numbers
|
|
without a special prefix are decimal. A simple @samp{0} prefix
|
|
introduces an octal number. @samp{0x} introduces a hexadecimal number.
|
|
As GNU extensions, @samp{0b} introduces a binary number.
|
|
@samp{0r} introduces a number expressed in any radix between 1 and 36:
|
|
the prefix should be immediately followed by the decimal expression of
|
|
the radix, a colon, then the digits making the number. For radix 1,
|
|
leading zeros are ignored, and all remaining digits must be @samp{1};
|
|
for all other radices, the digits are @samp{0}, @samp{1}, @samp{2},
|
|
@dots{}. Beyond @samp{9}, the digits are @samp{a}, @samp{b} @dots{} up
|
|
to @samp{z}. Lower and upper case letters can be used interchangeably
|
|
in number prefixes and as number digits.
|
|
|
|
Parentheses may be used to group subexpressions whenever needed. For the
|
|
relational operators, a true relation returns @code{1}, and a false
|
|
relation returns @code{0}.
|
|
|
|
Here are a few examples of use of @code{eval}.
|
|
|
|
@example
|
|
eval(`-3 * 5')
|
|
@result{}-15
|
|
eval(`-99 / 10')
|
|
@result{}-9
|
|
eval(`-99 % 10')
|
|
@result{}-9
|
|
eval(`99 % -10')
|
|
@result{}9
|
|
eval(index(`Hello world', `llo') >= 0)
|
|
@result{}1
|
|
eval(`0r1:0111 + 0b100 + 0r3:12')
|
|
@result{}12
|
|
define(`square', `eval(`($1) ** 2')')
|
|
@result{}
|
|
square(`9')
|
|
@result{}81
|
|
square(square(`5')` + 1')
|
|
@result{}676
|
|
eval(`1+')
|
|
@error{}m4:stdin:10: bad expression in eval: 1+
|
|
@result{}
|
|
eval(`0x')
|
|
@error{}m4:stdin:11: invalid number in eval: 0x
|
|
@result{}
|
|
eval(`01239')
|
|
@error{}m4:stdin:12: invalid number in eval: 01239
|
|
@result{}
|
|
define(`foo', `666')
|
|
@result{}
|
|
eval(`foo / 6')
|
|
@error{}m4:stdin:14: bad expression in eval (bad input): foo / 6
|
|
@result{}
|
|
eval(foo / 6)
|
|
@result{}111
|
|
@end example
|
|
|
|
As the last two lines show, @code{eval} does not handle macro
|
|
names, even if they expand to a valid expression (or part of a valid
|
|
expression). Therefore all macros must be expanded before they are
|
|
passed to @code{eval}.
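
In practice this means leaving macro names outside the quotes, so that
they are expanded while the arguments of @code{eval} are collected.  A
minimal sketch, where @code{cols}, @code{rows}, and @code{area} are
hypothetical macros used only for illustration:

@example
define(`cols', `8')define(`rows', `3')
@result{}
eval(cols` * 'rows)
@result{}24
define(`area', `eval(`($1) * ($2)')')
@result{}
area(cols, rows)
@result{}24
area(`1 + 1', `3')
@result{}6
@end example

Parenthesizing @code{$1} and @code{$2} inside @code{area} keeps an
argument such as @samp{1 + 1} from being regrouped by operator
precedence.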
|
|
|
|
Some calculations are not portable to other implementations, since they
|
|
have undefined semantics in C, but GNU @code{m4} has
|
|
well-defined behavior on overflow.  When shifting, an out-of-range shift
amount is implicitly brought into the range 0 through 31 (the shift
amounts valid for 32-bit integers) by a bitwise and with @code{0x1f}.
|
|
|
|
@example
|
|
define(`max_int', eval(`0x7fffffff'))
|
|
@result{}
|
|
define(`min_int', incr(max_int))
|
|
@result{}
|
|
eval(min_int` < 0')
|
|
@result{}1
|
|
eval(max_int` > 0')
|
|
@result{}1
|
|
ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
|
|
@result{}overflow occurred
|
|
min_int
|
|
@result{}-2147483648
|
|
eval(`0x80000000 % -1')
|
|
@result{}0
|
|
eval(`-4 >> 1')
|
|
@result{}-2
|
|
eval(`-4 >> 33')
|
|
@result{}-2
|
|
@end example
|
|
|
|
If specified, @var{radix} sets the radix used for the
expansion.  The default radix is 10; this is also the case if
|
|
@var{radix} is the empty string. A warning results if the radix is
|
|
outside the range of 1 through 36, inclusive. The result of @code{eval}
|
|
is always taken to be signed. No radix prefix is output, and for
|
|
radices greater than 10, the digits are lower case. The @var{width}
|
|
argument specifies the minimum output width, excluding any negative
|
|
sign. The result is zero-padded to extend the expansion to the
|
|
requested width. A warning results if the width is negative. If
|
|
@var{radix} or @var{width} is out of bounds, the expansion of
|
|
@code{eval} is empty.
|
|
|
|
@example
|
|
eval(`666', `10')
|
|
@result{}666
|
|
eval(`666', `11')
|
|
@result{}556
|
|
eval(`666', `6')
|
|
@result{}3030
|
|
eval(`666', `6', `10')
|
|
@result{}0000003030
|
|
eval(`-666', `6', `10')
|
|
@result{}-0000003030
|
|
eval(`10', `', `0')
|
|
@result{}10
|
|
`0r1:'eval(`10', `1', `11')
|
|
@result{}0r1:01111111111
|
|
eval(`10', `16')
|
|
@result{}a
|
|
eval(`1', `37')
|
|
@error{}m4:stdin:9: radix 37 in builtin `eval' out of range
|
|
@result{}
|
|
eval(`1', , `-1')
|
|
@error{}m4:stdin:10: negative width to builtin `eval'
|
|
@result{}
|
|
eval()
|
|
@error{}m4:stdin:11: empty string treated as 0 in builtin `eval'
|
|
@result{}0
|
|
@end example
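
The @var{radix} and @var{width} arguments make simple base conversions
straightforward.  In this sketch, @code{tohex} is a hypothetical helper
that prints a zero-padded, four-digit hexadecimal value with a literal
@samp{0x} prefix (recall that @code{eval} itself emits no prefix):

@example
define(`tohex', `0x`'eval(`$1', `16', `4')')
@result{}
tohex(`255')
@result{}0x00ff
tohex(`0b1010')
@result{}0x000a
eval(`0xff', `2', `8')
@result{}11111111
eval(`-5', `2', `8')
@result{}-00000101
@end example

The empty quoted string after @samp{0x} keeps the prefix from running
together with the macro name @code{eval}.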
|
|
|
|
@node Shell commands
|
|
@chapter Macros for running shell commands
|
|
|
|
@cindex UNIX commands, running
|
|
@cindex executing shell commands
|
|
@cindex running shell commands
|
|
@cindex shell commands, running
|
|
@cindex commands, running shell
|
|
There are a few builtin macros in @code{m4} that allow you to run shell
|
|
commands from within @code{m4}.
|
|
|
|
Note that the definition of a valid shell command is system dependent.
|
|
On UNIX systems, commands are typically interpreted by
@command{/bin/sh}.  But on other systems, such as native Windows, the
shell understands a different command syntax.  Some examples in this
chapter assume
|
|
@command{/bin/sh}, and also demonstrate how to quit early with a known
|
|
exit value if this is not the case.
|
|
|
|
@menu
|
|
* Platform macros:: Determining the platform
|
|
* Syscmd:: Executing simple commands
|
|
* Esyscmd:: Reading the output of commands
|
|
* Sysval:: Exit status
|
|
* Mkstemp:: Making temporary files
|
|
@end menu
|
|
|
|
@node Platform macros
|
|
@section Determining the platform
|
|
|
|
@cindex platform macros
|
|
Sometimes it is desirable for an input file to know which platform
|
|
@code{m4} is running on. GNU @code{m4} provides several
|
|
macros that are predefined to expand to the empty string; checking for
|
|
their existence will confirm platform details.
|
|
|
|
@deffn {Optional builtin} __gnu__
|
|
@deffnx {Optional builtin} __os2__
|
|
@deffnx {Optional builtin} os2
|
|
@deffnx {Optional builtin} __unix__
|
|
@deffnx {Optional builtin} unix
|
|
@deffnx {Optional builtin} __windows__
|
|
@deffnx {Optional builtin} windows
|
|
Each of these macros is conditionally defined as needed to describe the
|
|
environment of @code{m4}. If defined, each macro expands to the empty
|
|
string. For now, these macros silently ignore all arguments, but in a
|
|
future release of M4, they might warn if arguments are present.
|
|
@end deffn
|
|
|
|
When GNU extensions are in effect (that is, when you did not
|
|
use the @option{-G} option, @pxref{Limits control, , Invoking m4}),
|
|
GNU @code{m4} will define the macro @code{@w{__gnu__}} to
|
|
expand to the empty string.
|
|
|
|
@example
|
|
$ @kbd{m4}
|
|
__gnu__
|
|
@result{}
|
|
__gnu__(`ignored')
|
|
@result{}
|
|
Extensions are ifdef(`__gnu__', `active', `inactive')
|
|
@result{}Extensions are active
|
|
@end example
|
|
|
|
@comment options: -G
|
|
@example
|
|
$ @kbd{m4 -G}
|
|
__gnu__
|
|
@result{}__gnu__
|
|
__gnu__(`ignored')
|
|
@result{}__gnu__(ignored)
|
|
Extensions are ifdef(`__gnu__', `active', `inactive')
|
|
@result{}Extensions are inactive
|
|
@end example
|
|
|
|
On UNIX systems, GNU @code{m4} will define @code{@w{__unix__}}
|
|
by default, or @code{unix} when the @option{-G} option is specified.
|
|
|
|
On native Windows systems, GNU @code{m4} will define
|
|
@code{@w{__windows__}} by default, or @code{windows} when the
|
|
@option{-G} option is specified.
|
|
|
|
On OS/2 systems, GNU @code{m4} will define @code{@w{__os2__}}
|
|
by default, or @code{os2} when the @option{-G} option is specified.
|
|
|
|
If GNU @code{m4} does not provide a platform macro for your system,
|
|
please report that as a bug.
|
|
|
|
@example
|
|
define(`provided', `0')
|
|
@result{}
|
|
ifdef(`__unix__', `define(`provided', incr(provided))')
|
|
@result{}
|
|
ifdef(`__windows__', `define(`provided', incr(provided))')
|
|
@result{}
|
|
ifdef(`__os2__', `define(`provided', incr(provided))')
|
|
@result{}
|
|
provided
|
|
@result{}1
|
|
@end example
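
A typical use is to select a platform-specific value once and then refer
to it by name.  In the sketch below, @code{null_device} is a
hypothetical macro, and the device names @file{NUL} and @file{/dev/null}
are assumptions about the respective platforms rather than anything
@code{m4} provides:

@example
define(`null_device',
  ifdef(`__windows__', ``NUL'', ``/dev/null''))
@result{}
ifelse(null_device, `', `unset', `set')
@result{}set
@end example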
|
|
|
|
@node Syscmd
|
|
@section Executing simple commands
|
|
|
|
Any shell command can be executed, using @code{syscmd}:
|
|
|
|
@deffn Builtin syscmd (@var{shell-command})
|
|
Executes @var{shell-command} as a shell command.
|
|
|
|
The expansion of @code{syscmd} is void, @emph{not} the output from
|
|
@var{shell-command}! Output or error messages from @var{shell-command}
|
|
are not read by @code{m4}. @xref{Esyscmd}, if you need to process the
|
|
command output.
|
|
|
|
Prior to executing the command, @code{m4} flushes its buffers.
|
|
The default standard input, output and error of @var{shell-command} are
|
|
the same as those of @code{m4}.
|
|
|
|
By default, the @var{shell-command} will be used as the argument to the
|
|
@option{-c} option of the @command{/bin/sh} shell (or the version of
|
|
@command{sh} specified by @samp{command -p getconf PATH}, if your system
|
|
supports that). If you prefer a different shell, the
|
|
@command{configure} script can be given the option
|
|
@option{--with-syscmd-shell=@var{location}} to set the location of an
|
|
alternative shell at GNU @code{m4} installation; the
|
|
alternative shell must still support @option{-c}.
|
|
|
|
The macro @code{syscmd} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`foo', `FOO')
|
|
@result{}
|
|
syscmd(`echo foo')
|
|
@result{}foo
|
|
@result{}
|
|
@end example
|
|
|
|
Note how the output of the command, including its trailing newline, goes
directly to standard output without being rescanned, followed by the
newline that appeared after the macro.
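
Because the argument of @code{syscmd} undergoes normal macro expansion
before the command runs, a command line can be assembled from macros.  A
minimal sketch, assuming a POSIX-like shell and using a hypothetical
macro @code{greeting}:

@example
define(`greeting', `hello')
@result{}
syscmd(`echo 'greeting)
@result{}hello
@result{}
syscmd(`echo greeting')dnl
@result{}greeting
@end example

In the second call the name is quoted, so the shell receives the literal
word @samp{greeting}, and the trailing @code{dnl} discards the newline
that follows the macro call.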
|
|
|
|
The following is an example of @var{shell-command} using the same
|
|
standard input as @code{m4}:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
|
|
@result{}
|
|
@end example
|
|
|
|
@ignore
|
|
@comment If the user types the example below with stdin being an
|
|
@comment interactive terminal, then cat will hang waiting for additional
|
|
@comment input after m4 has exited. But the testsuite is using a pipe
|
|
@comment for stdin. Hence, we have two versions - the one we feed the
|
|
@comment testsuite below, and the one we display to the user above that
|
|
@comment more accurately shows what the testsuite is really doing but
|
|
@comment which the testsuite cannot parse.
|
|
|
|
@example
|
|
m4wrap(`syscmd(`cat')')
|
|
@result{}
|
|
^D
|
|
@end example
|
|
@end ignore
|
|
|
|
It tells @code{m4} to read all of its input before executing the wrapped
|
|
text, then hand a valid (albeit emptied) pipe as standard input for the
|
|
@code{cat} subcommand. Therefore, you should be careful when using
|
|
standard input (either by specifying no files, or by passing @samp{-} as
|
|
a file name on the command line, @pxref{Command line files, , Invoking
|
|
m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd}
|
|
that consume data from standard input. When standard input is a
|
|
seekable file, the subprocess will pick up with the next character not
|
|
yet processed by @code{m4}; when it is a pipe or other non-seekable
|
|
file, there is no guarantee how much data will already be buffered by
|
|
@code{m4} and thus unavailable to the child.
|
|
|
|
@node Esyscmd
|
|
@section Reading the output of commands
|
|
|
|
@cindex GNU extensions
|
|
If you want @code{m4} to read the output of a shell command, use
|
|
@code{esyscmd}:
|
|
|
|
@deffn Builtin esyscmd (@var{shell-command})
|
|
Expands to the standard output of the shell command
|
|
@var{shell-command}.
|
|
|
|
Prior to executing the command, @code{m4} flushes its buffers.
|
|
The default standard input and standard error of @var{shell-command} are
|
|
the same as those of @code{m4}. The error output of @var{shell-command}
|
|
is not a part of the expansion: it will appear along with the error
|
|
output of @code{m4}.
|
|
|
|
By default, the @var{shell-command} will be used as the argument to the
|
|
@option{-c} option of the @command{/bin/sh} shell (or the version of
|
|
@command{sh} specified by @samp{command -p getconf PATH}, if your system
|
|
supports that). If you prefer a different shell, the
|
|
@command{configure} script can be given the option
|
|
@option{--with-syscmd-shell=@var{location}} to set the location of an
|
|
alternative shell at GNU @code{m4} installation; the
|
|
alternative shell must still support @option{-c}.
|
|
|
|
The macro @code{esyscmd} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
define(`foo', `FOO')
|
|
@result{}
|
|
esyscmd(`echo foo')
|
|
@result{}FOO
|
|
@result{}
|
|
@end example
|
|
|
|
Note how the expansion of @code{esyscmd} keeps the trailing newline of
|
|
the command, as well as using the newline that appeared after the macro.
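
When the trailing newline is unwanted, it can be removed from the
expansion.  One possible approach, sketched here under the assumption of
a POSIX-like shell, is to delete newlines with @code{translit}:

@example
len(esyscmd(`echo hello'))
@result{}6
[translit(esyscmd(`echo hello'), `
')]
@result{}[hello]
@end example

Note that this deletes every newline in the command output, not just the
last one, so it is best suited to commands that print a single line.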
|
|
|
|
Just as with @code{syscmd}, care must be exercised when sharing standard
|
|
input between @code{m4} and the child process of @code{esyscmd}.
|
|
|
|
@node Sysval
|
|
@section Exit status
|
|
|
|
@cindex UNIX commands, exit status from
|
|
@cindex exit status from shell commands
|
|
@cindex shell commands, exit status from
|
|
@cindex commands, exit status from shell
|
|
@cindex status of shell commands
|
|
To see whether a shell command succeeded, use @code{sysval}:
|
|
|
|
@deffn Builtin sysval
|
|
Expands to the exit status of the last shell command run with
|
|
@code{syscmd} or @code{esyscmd}. Expands to 0 if no command has been
|
|
run yet.
|
|
@end deffn
|
|
|
|
@example
|
|
sysval
|
|
@result{}0
|
|
syscmd(`false')
|
|
@result{}
|
|
ifelse(sysval, `0', `zero', `non-zero')
|
|
@result{}non-zero
|
|
syscmd(`exit 2')
|
|
@result{}
|
|
sysval
|
|
@result{}2
|
|
syscmd(`true')
|
|
@result{}
|
|
sysval
|
|
@result{}0
|
|
esyscmd(`false')
|
|
@result{}
|
|
ifelse(sysval, `0', `zero', `non-zero')
|
|
@result{}non-zero
|
|
esyscmd(`echo dnl && exit 127')
|
|
@result{}
|
|
sysval
|
|
@result{}127
|
|
esyscmd(`true')
|
|
@result{}
|
|
sysval
|
|
@result{}0
|
|
@end example
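
A common pattern is to test @code{sysval} immediately after running a
command, before another command overwrites it.  In this sketch,
@code{run} is a hypothetical wrapper macro, and a POSIX-like shell is
assumed:

@example
define(`run', `syscmd(`$1')ifelse(sysval, `0', `ok',
  `failed with status 'sysval)')
@result{}
run(`exit 0')
@result{}ok
run(`exit 3')
@result{}failed with status 3
@end example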
|
|
|
|
@code{sysval} results in 127 if there was a problem executing the
|
|
command, for example, if the system-imposed argument length is exceeded,
|
|
or if there were not enough resources to fork. It is not possible to
|
|
distinguish between failed execution and successful execution that had
|
|
an exit status of 127, unless there was output from the child process.
|
|
|
|
On UNIX platforms, where it is possible to detect when command execution
|
|
is terminated by a signal, rather than a normal exit, the result is the
|
|
signal number shifted left by eight bits.
|
|
|
|
@comment This test has difficulties being portable, even on platforms
|
|
@comment where syscmd invokes /bin/sh. Kill is not portable with signal
|
|
@comment names. According to autoconf, the only portable signal numbers
|
|
@comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM). But
|
|
@comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
|
|
@comment exits normally rather than letting the signal terminate it).
|
|
@comment Also, TERM is flaky, as it can also kill the running m4 on
|
|
@comment systems where /bin/sh does not create its own process group.
|
|
@comment And PIPE is unreliable, since people tend to run with it
|
|
@comment ignored, with m4 inheriting that choice. That leaves KILL as
|
|
@comment the only signal we can reliably test, but even that is tricky:
|
|
@comment on Haiku, 'kill -9' actually causes a process to die with
|
|
@comment signal 15 named KILLTHR on that platform.
|
|
@example
|
|
dnl This test assumes kill is a shell builtin, and that signals are
|
|
dnl recognizable.
|
|
ifdef(`__unix__', ,
|
|
`errprint(` skipping: syscmd does not have unix semantics
|
|
')m4exit(`77')')dnl
|
|
changequote(`[', `]')
|
|
@result{}
|
|
syscmd([@{ /bin/sh -c 'kill -9 $$'; @} 2>/dev/null; st=$?;
|
|
test $st = 137 || test $st = 265])
|
|
@result{}
|
|
ifelse(sysval, [0], , [errprint([ skipping: shell does not send signal 9
|
|
])m4exit([77])])dnl
|
|
syscmd([kill -9 $$])
|
|
@result{}
|
|
sysval
|
|
@result{}2304
|
|
syscmd()
|
|
@result{}
|
|
sysval
|
|
@result{}0
|
|
esyscmd([kill -9 $$])
|
|
@result{}
|
|
sysval
|
|
@result{}2304
|
|
@end example
|
|
|
|
@node Mkstemp
|
|
@section Making temporary files
|
|
|
|
@cindex temporary file names
|
|
@cindex files, names of temporary
|
|
Commands specified to @code{syscmd} or @code{esyscmd} might need a
|
|
temporary file, for output or for some other purpose. There is a
|
|
builtin macro, @code{mkstemp}, for making a temporary file:
|
|
|
|
@deffn Builtin mkstemp (@var{template})
|
|
@deffnx Builtin maketemp (@var{template})
|
|
Expands to the quoted name of a new, empty file, made from the string
|
|
@var{template}, which should end with the string @samp{XXXXXX}. The six
|
|
@samp{X} characters are then replaced with random characters matching
|
|
the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
|
|
name unique. If fewer than six @samp{X} characters are found at the end
|
|
of @var{template}, the result will be longer than the template.  The
|
|
created file will have access permissions as if by @kbd{chmod =rw,go=},
|
|
meaning that the current umask of the @code{m4} process is taken into
|
|
account, and at most the current user can read and write the file.
|
|
|
|
The traditional behavior, standardized by POSIX, is that
|
|
@code{maketemp} merely replaces the trailing @samp{X} with the process
|
|
id, without creating a file or quoting the expansion, and without
|
|
ensuring that the resulting
|
|
string is a unique file name. In part, this means that using the same
|
|
@var{template} twice in the same input file will result in the same
|
|
expansion. This behavior is a security hole, as it is very easy for
|
|
another process to guess the name that will be generated, and thus
|
|
interfere with a subsequent use of @code{syscmd} trying to manipulate
|
|
that file name. Hence, POSIX has recommended that all new
|
|
implementations of @code{m4} provide the secure @code{mkstemp} builtin,
|
|
and that users of @code{m4} check for its existence.
|
|
|
|
The expansion is void and an error is issued if a temporary file could
|
|
not be created.
|
|
|
|
The macros @code{mkstemp} and @code{maketemp} are recognized only with
|
|
parameters.
|
|
@end deffn
|
|
|
|
If you try this next example, you will most likely get different output
|
|
for the two file names, since the replacement characters are randomly
|
|
chosen:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{m4}
|
|
define(`tmp', `oops')
|
|
@result{}
|
|
maketemp(`/tmp/fooXXXXXX')
|
|
@result{}/tmp/fooa07346
|
|
ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
|
|
`define(`mkstemp', defn(`maketemp'))dnl
|
|
errprint(`warning: potentially insecure maketemp implementation
|
|
')')
|
|
@result{}
|
|
mkstemp(`doc')
|
|
@result{}docQv83Uw
|
|
@end example
|
|
|
|
@cindex GNU extensions
|
|
Unless you use the @option{--traditional} command line option (or
|
|
@option{-G}, @pxref{Limits control, , Invoking m4}), the GNU
|
|
version of @code{maketemp} is secure. This means that passing the same
|
|
template to multiple calls will generate multiple files. However, we
|
|
recommend that you use the new @code{mkstemp} macro, introduced in
|
|
GNU M4 1.4.8, which is secure even in traditional mode. Also,
|
|
as of M4 1.4.11, the secure implementation quotes the resulting file
|
|
name, so that you are guaranteed to know what file was created even if
|
|
the random file name happens to match an existing macro. Notice that
|
|
this example is careful to use @code{defn} to avoid unintended expansion
|
|
of @samp{foo}.
|
|
|
|
@example
|
|
$ @kbd{m4}
|
|
define(`foo', `errprint(`oops')')
|
|
@result{}
|
|
syscmd(`rm -f foo-??????')sysval
|
|
@result{}0
|
|
define(`file1', maketemp(`foo-XXXXXX'))dnl
|
|
ifelse(esyscmd(`echo \` foo-?????? \''), ` foo-?????? ',
|
|
`no file', `created')
|
|
@result{}created
|
|
define(`file2', maketemp(`foo-XX'))dnl
|
|
define(`file3', mkstemp(`foo-XXXXXX'))dnl
|
|
ifelse(len(defn(`file1')), len(defn(`file2')),
|
|
`same length', `different')
|
|
@result{}same length
|
|
ifelse(defn(`file1'), defn(`file2'), `same', `different file')
|
|
@result{}different file
|
|
ifelse(defn(`file2'), defn(`file3'), `same', `different file')
|
|
@result{}different file
|
|
ifelse(defn(`file1'), defn(`file3'), `same', `different file')
|
|
@result{}different file
|
|
syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3'))
|
|
@result{}
|
|
sysval
|
|
@result{}0
|
|
@end example
|
|
|
|
@ignore
|
|
@c Not worth documenting, but make sure we don't leave trailing NUL in
|
|
@c the expansion.
|
|
|
|
@example
|
|
syscmd(`rm -rf foodir')sysval
|
|
@result{}0
|
|
syscmd(`mkdir foodir')sysval
|
|
@result{}0
|
|
len(mkstemp(`foodir/fooXXXXX'))
|
|
@result{}16
|
|
syscmd(`rm -r foodir')sysval
|
|
@result{}0
|
|
@end example
|
|
|
|
@c Likewise, and ensure that traditional mode leaves the result unquoted
|
|
@c without creating a file.
|
|
|
|
@comment options: -G
|
|
@example
|
|
syscmd(`rm -f foo-*')sysval
|
|
@result{}0
|
|
len(maketemp(`foo-XXXXX'))
|
|
@error{}m4:stdin:2: recommend using mkstemp instead
|
|
@result{}9
|
|
define(`abc', `def')
|
|
@result{}
|
|
maketemp(`foo-abc')
|
|
@result{}foo-def
|
|
@error{}m4:stdin:4: recommend using mkstemp instead
|
|
syscmd(`test -f foo-*')ifelse(sysval, `0', `0', `1')
|
|
@result{}1
|
|
@end example
|
|
@end ignore
|
|
|
|
@node Miscellaneous
|
|
@chapter Miscellaneous builtin macros
|
|
|
|
This chapter describes various builtins that do not really belong in
|
|
any of the previous chapters.
|
|
|
|
@menu
|
|
* Errprint:: Printing error messages
|
|
* Location:: Printing current location
|
|
* M4exit:: Exiting from @code{m4}
|
|
@end menu
|
|
|
|
@node Errprint
|
|
@section Printing error messages
|
|
|
|
@cindex printing error messages
|
|
@cindex error messages, printing
|
|
@cindex messages, printing error
|
|
@cindex standard error, output to
|
|
You can print error messages using @code{errprint}:
|
|
|
|
@deffn Builtin errprint (@var{message}, @dots{})
|
|
Prints @var{message} and the rest of the arguments to standard error,
|
|
separated by spaces. Standard error is used, regardless of the
|
|
@option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
|
|
|
|
The expansion of @code{errprint} is void.
|
|
The macro @code{errprint} is recognized only with parameters.
|
|
@end deffn
|
|
|
|
@example
|
|
errprint(`Invalid arguments to forloop
|
|
')
|
|
@error{}Invalid arguments to forloop
|
|
@result{}
|
|
errprint(`1')errprint(`2',`3
|
|
')
|
|
@error{}12 3
|
|
@result{}
|
|
@end example
|
|
|
|
A trailing newline is @emph{not} printed automatically, so it should be
|
|
supplied as part of the argument, as in the example. Unfortunately, the
|
|
exact output of @code{errprint} is not very portable to other @code{m4}
|
|
implementations: POSIX requires that all arguments be printed,
|
|
but some implementations of @code{m4} only print the first.
|
|
Furthermore, some BSD implementations always append a newline
|
|
for each @code{errprint} call, regardless of whether the last argument
|
|
already had one, and POSIX is silent on whether this is
|
|
acceptable.
|
|
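
To avoid depending on how an implementation treats multiple arguments,
a wrapper can pass @code{errprint} exactly one argument that already
ends in a newline.  The following is only a minimal sketch; the name
@code{errprintn} is hypothetical and is not provided by GNU @code{m4}:

@example
define(`errprintn', `errprint(`$1
')')
@result{}
errprintn(`Invalid arguments to forloop')
@error{}Invalid arguments to forloop
@result{}
@end example

@noindent
On implementations that always append a newline of their own, this
sketch still produces an extra blank line, so it addresses only the
multiple-argument difference.
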
|
|
@node Location
|
|
@section Printing current location
|
|
|
|
@cindex location, input
|
|
@cindex input location
|
|
To make it possible to specify the location of an error, three
|
|
utility builtins exist:
|
|
|
|
@deffn Builtin __file__
|
|
@deffnx Builtin __line__
|
|
@deffnx Builtin __program__
|
|
Expand to the quoted name of the current input file, the
|
|
current input line number in that file, and the quoted name of the
|
|
current invocation of @code{m4}.
|
|
@end deffn
|
|
|
|
@example
|
|
errprint(__program__:__file__:__line__: `input error
|
|
')
|
|
@error{}m4:stdin:1: input error
|
|
@result{}
|
|
@end example
|
|
|
|
Line numbers start at 1 for each file. If the file was found due to the
|
|
@option{-I} option or @env{M4PATH} environment variable, that is
|
|
reflected in the file name. The syncline option (@option{-s},
|
|
@pxref{Preprocessor features, , Invoking m4}), and the
|
|
@samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debug Levels}),
|
|
also use this notion of current file and line. Redefining the three
|
|
location macros has no effect on syncline, debug, warning, or error
|
|
message output.
|
|
|
|
This example reuses the file @file{incl.m4} mentioned earlier
|
|
(@pxref{Include}):
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
define(`foo', ``$0' called at __file__:__line__')
|
|
@result{}
|
|
foo
|
|
@result{}foo called at stdin:2
|
|
include(`incl.m4')
|
|
@result{}Include file start
|
|
@result{}foo called at examples/incl.m4:2
|
|
@result{}Include file end
|
|
@result{}
|
|
@end example
|
|
|
|
The location of macros invoked during the rescanning of macro expansion
|
|
text corresponds to the location in the file where the expansion was
|
|
triggered, regardless of how many newline characters the expansion text
|
|
contains. As of GNU M4 1.4.8, the location of text wrapped
|
|
with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
|
|
@code{m4wrap} was invoked. Previous versions, however, behaved as
|
|
though wrapped text came from line 0 of the file ``''.
|
|
|
|
@example
|
|
define(`echo', `$@@')
|
|
@result{}
|
|
define(`foo', `echo(__line__
|
|
__line__)')
|
|
@result{}
|
|
echo(__line__
|
|
__line__)
|
|
@result{}4
|
|
@result{}5
|
|
m4wrap(`foo
|
|
')
|
|
@result{}
|
|
foo(errprint(__line__
|
|
__line__
|
|
))
|
|
@error{}8
|
|
@error{}9
|
|
@result{}8
|
|
@result{}8
|
|
__line__
|
|
@result{}11
|
|
m4wrap(`__line__
|
|
')
|
|
@result{}
|
|
^D
|
|
@result{}12
|
|
@result{}6
|
|
@result{}6
|
|
@end example
|
|
|
|
The @code{@w{__program__}} macro behaves like @samp{$0} in shell
|
|
terminology. If you invoke @code{m4} through an absolute path or a link
|
|
with a different spelling, rather than by relying on a @env{PATH} search
|
|
for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
|
|
The intent is that you can use it to produce error messages with the
|
|
same formatting that @code{m4} produces internally. It can also be used
|
|
within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
|
|
@code{m4} that is currently running, rather than whatever version of
|
|
@code{m4} happens to be first in @env{PATH}. It was first introduced in
|
|
GNU M4 1.4.6.
|
|
|
|
@node M4exit
|
|
@section Exiting from @code{m4}
|
|
|
|
@cindex exiting from @code{m4}
|
|
@cindex status, setting @code{m4} exit
|
|
If you need to exit from @code{m4} before the entire input has been
|
|
read, you can use @code{m4exit}:
|
|
|
|
@deffn Builtin m4exit (@dvar{code, 0})
|
|
Causes @code{m4} to exit, with exit status @var{code}. If @var{code} is
|
|
left out, the exit status is zero. If @var{code} cannot be parsed, or
|
|
is outside the range of 0 to 255, the exit status is one. No further
|
|
input is read, and all wrapped and diverted text is discarded.
|
|
@end deffn
|
|
|
|
@example
|
|
m4wrap(`This text is lost due to `m4exit'.')
|
|
@result{}
|
|
divert(`1') So is this.
|
|
divert
|
|
@result{}
|
|
m4exit And this is never read.
|
|
@end example
|
|
|
|
A common use of this is to abort processing:
|
|
|
|
@deffn Composite fatal_error (@var{message})
|
|
Abort processing with an error message and non-zero status. Prefix
|
|
@var{message} with details about where the error occurred, and print the
|
|
resulting string to standard error.
|
|
@end deffn
|
|
|
|
@comment status: 1
|
|
@example
|
|
define(`fatal_error',
|
|
`errprint(__program__:__file__:__line__`: fatal error: $*
|
|
')m4exit(`1')')
|
|
@result{}
|
|
fatal_error(`this is a BAD one, buster')
|
|
@error{}m4:stdin:4: fatal error: this is a BAD one, buster
|
|
@end example
|
|
|
|
After this macro call, @code{m4} will exit with exit status 1. This macro
|
|
is only intended for error exits, since the normal exit procedures are
|
|
not followed, i.e., diverted text is not undiverted, and saved text
|
|
(@pxref{M4wrap}) is not reread. (This macro could be made more robust
|
|
to earlier versions of @code{m4}. You should try to see if you can find
|
|
weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
|
|
|
|
Note that it is still possible for the exit status to be different from
|
|
what was requested by @code{m4exit}. If @code{m4} detects some other
|
|
error, such as a write error on standard output, the exit status will be
|
|
non-zero even if @code{m4exit} requested zero.
|
|
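
The exit status rules described for @code{m4exit} can be observed from
a shell.  This is a minimal sketch, assuming a POSIX shell and that
GNU @code{m4} is the first @code{m4} found in @env{PATH}; the second
command discards the out-of-range warning that is written to standard
error:

@comment ignore
@example
$ @kbd{echo 'm4exit(3)' | m4; echo "status $?"}
@result{}status 3
$ @kbd{echo 'm4exit(300)' | m4 2>/dev/null; echo "status $?"}
@result{}status 1
@end example
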
|
|
If standard input is seekable, then the file will be positioned at the
|
|
next unread character. If it is a pipe or other non-seekable file,
|
|
then there are no guarantees how much data @code{m4} might have read
|
|
into buffers, and thus discarded.
|
|
|
|
@node Frozen files
|
|
@chapter Fast loading of frozen state
|
|
|
|
Some bigger @code{m4} applications may be built over a common base
|
|
containing hundreds of definitions and other costly initializations.
|
|
Usually, the common base is kept in one or more declarative files,
|
|
which are listed on each @code{m4} invocation prior to the
|
|
user's input file, or else each input file uses @code{include}.
|
|
|
|
Reading the common base of a big application, over and over again, may
|
|
be time consuming. GNU @code{m4} offers some machinery to
|
|
speed up the start of an application using lengthy common bases.
|
|
|
|
@menu
|
|
* Using frozen files:: Using frozen files
|
|
* Frozen file format:: Frozen file format
|
|
@end menu
|
|
|
|
@node Using frozen files
|
|
@section Using frozen files
|
|
|
|
@cindex fast loading of frozen files
|
|
@cindex frozen files for fast loading
|
|
@cindex initialization, frozen state
|
|
@cindex dumping into frozen file
|
|
@cindex reloading a frozen file
|
|
@cindex GNU extensions
|
|
Suppose a user has a library of @code{m4} initializations in
|
|
@file{base.m4}, which is then used with multiple input files:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{m4 base.m4 input1.m4}
|
|
$ @kbd{m4 base.m4 input2.m4}
|
|
$ @kbd{m4 base.m4 input3.m4}
|
|
@end example
|
|
|
|
Rather than spending time parsing the fixed contents of @file{base.m4}
|
|
every time, the user might instead execute:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{m4 -F base.m4f base.m4}
|
|
@end example
|
|
|
|
@noindent
|
|
once, and further execute, as often as needed:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{m4 -R base.m4f input1.m4}
|
|
$ @kbd{m4 -R base.m4f input2.m4}
|
|
$ @kbd{m4 -R base.m4f input3.m4}
|
|
@end example
|
|
|
|
@noindent
|
|
with the varying input. The first call, containing the @option{-F}
|
|
option, only reads and executes file @file{base.m4}, defining
|
|
various application macros and computing other initializations.
|
|
Once the input file @file{base.m4} has been completely processed, GNU
|
|
@code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
|
|
file which contains a kind of snapshot of the @code{m4} internal state.
|
|
|
|
Later calls, containing the @option{-R} option, are able to reload
|
|
the internal state of @code{m4}, from @file{base.m4f},
|
|
@emph{prior} to reading any other input files. This means that,
|
|
instead of starting with a fresh copy of @code{m4}, input will be
|
|
read after having effectively recovered the effect of a prior run.
|
|
In our example, the effect is the same as if file @file{base.m4} had
|
|
been read anew. However, this effect is achieved a lot faster.
|
|
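
As a concrete sketch of this workflow, suppose @file{base.m4} holds a
single definition.  The macro @code{greet} and the contents of the file
are assumptions made up for illustration:

@comment ignore
@example
$ @kbd{cat base.m4}
@result{}define(`greet', `Hello, $1!')dnl
$ @kbd{m4 -F base.m4f base.m4}
$ @kbd{echo 'greet(world)' | m4 -R base.m4f}
@result{}Hello, world!
@end example

@noindent
The last command never reads @file{base.m4}; the definition of
@code{greet} comes entirely from the frozen file.
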
|
|
Only one frozen file may be created or read in any one @code{m4}
|
|
invocation. It is not possible to recover two frozen files at once.
|
|
However, frozen files may be updated incrementally, by using
|
|
@option{-R} and @option{-F} options simultaneously. For example, if
|
|
some care is taken, the command:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
|
|
@end example
|
|
|
|
@noindent
|
|
could be broken down in the following sequence, accumulating the same
|
|
output:
|
|
|
|
@comment ignore
|
|
@example
|
|
$ @kbd{m4 -F file1.m4f file1.m4}
|
|
$ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
|
|
$ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
|
|
$ @kbd{m4 -R file3.m4f file4.m4}
|
|
@end example
|
|
|
|
Some care is necessary because not every effort has been made for
|
|
this to work in all cases. In particular, the trace attribute of
|
|
macros is not handled, nor the current setting of @code{changeword}.
|
|
Currently, @code{m4wrap} and @code{sysval} also have problems.
|
|
Also, interactions for some options of @code{m4}, being used in one call
|
|
and not in the next, have not been fully analyzed yet. On the other
|
|
hand, you may be confident that stacks of @code{pushdef} definitions
|
|
are handled correctly, as well as undefined or renamed builtins, and
|
|
changed strings for quotes or comments. And future releases of
|
|
GNU M4 will improve on the utility of frozen files.
|
|
|
|
@ignore
|
|
@c This example is not worth putting in the manual, but caused core
|
|
@c dumps in all versions prior to 1.4.11.
|
|
|
|
@comment options: -F /dev/null
|
|
@example
|
|
traceon(`undefined')dnl
|
|
@end example
|
|
|
|
@c Make sure freezing is successful.
|
|
|
|
@example
|
|
ifdef(`__unix__', ,
|
|
`errprint(` skipping: syscmd does not have unix semantics
|
|
')m4exit(`77')')dnl
|
|
changequote(`[', `]')dnl
|
|
syscmd([echo 'changequote([,])pushdef([divnum],[hi])dnl' \
|
|
| ']__program__[' -F in.m4f \
|
|
&& echo 'divnum popdef([divnum])divnum' \
|
|
| ']__program__[' -R in.m4f \
|
|
&& rm in.m4f])status sysval
|
|
@result{}hi 0
|
|
@result{}status 0
|
|
@end example
|
|
|
|
@c Detect inability to freeze.
|
|
@c Some systems harden /, and fail with EACCES rather than ENOENT.
|
|
|
|
@comment options: -F /none/such
|
|
@comment xerr: ignore
|
|
@comment status: 1
|
|
@example
|
|
$ @kbd{m4 -F /none/such}
|
|
^D
|
|
@error{}m4: cannot open `/none/such': No such file or directory
|
|
@end example
|
|
@end ignore
|
|
|
|
When an @code{m4} run is to be frozen, the automatic undiversion
|
|
which takes place at end of execution is inhibited. Instead, all
|
|
positively numbered diversions are saved into the frozen file.
|
|
The active diversion number is also transmitted.
|
|
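
The handling of diversions can be sketched with a two-step session.
The file name @file{div.m4f} is hypothetical, and the example assumes a
POSIX shell.  The first run leaves diversion 1 active and non-empty.
The second run switches back to diversion 0, and the saved text is
undiverted only when that second run ends:

@comment ignore
@example
$ @kbd{echo 'divert(1)kept for later' | m4 -F div.m4f}
$ @kbd{echo 'divert(0)now' | m4 -R div.m4f}
@result{}now
@result{}kept for later
@end example
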
|
|
A frozen file to be reloaded need not reside in the current directory.
|
|
It is looked up the same way as an @code{include} file (@pxref{Search
|
|
Path}).
|
|
|
|
If the frozen file was generated with a newer version of @code{m4}, and
|
|
contains directives that an older @code{m4} cannot parse, attempting to
|
|
load the frozen file with option @option{-R} will cause @code{m4} to
|
|
exit with status 63 to indicate version mismatch.
|
|
|
|
@node Frozen file format
|
|
@section Frozen file format
|
|
|
|
@cindex frozen file format
|
|
@cindex file format, frozen file
|
|
Frozen files are sharable across architectures. It is safe to write
|
|
a frozen file on one machine and read it on another, given that the
|
|
second machine uses the same or newer version of GNU @code{m4}.
|
|
It is conventional, but not required, to give a frozen file the suffix
|
|
of @code{.m4f}.
|
|
|
|
These are simple (editable) text files, made up of directives,
|
|
each starting with a capital letter and ending with a newline
|
|
(@key{NL}). Wherever a directive is expected, the character
|
|
@samp{#} introduces a comment line; empty lines are also ignored if they
|
|
are not part of an embedded string.
|
|
In the following descriptions, each @var{len} refers to the length of
|
|
the corresponding string @var{str} in the next line of input. Numbers
|
|
are always expressed in decimal. There are no escape characters. The
|
|
directives are:
|
|
|
|
@table @code
|
|
@item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
|
|
Uses @var{str1} and @var{str2} as the begin-comment and
|
|
end-comment strings. If omitted, then @samp{#} and @key{NL} are the
|
|
comment delimiters.
|
|
|
|
@item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
|
|
Selects diversion @var{number}, making it current, then copies
|
|
@var{str} into the current diversion. @var{number} may be a negative
|
|
number for a non-existing diversion. To merely specify an active
|
|
selection, use this command with an empty @var{str}. With 0 as the
|
|
diversion @var{number}, @var{str} will be issued on standard output
|
|
at reload time. GNU @code{m4} will not produce the @samp{D}
|
|
directive with non-zero length for diversion 0, but this can be done
|
|
with manual edits. This directive may
|
|
appear more than once for the same diversion, in which case the
|
|
diversion is the concatenation of the various uses. If omitted, then
|
|
diversion 0 is current.
|
|
|
|
@item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
|
|
Defines, through @code{pushdef}, a definition for @var{str1}
|
|
expanding to the function whose builtin name is @var{str2}. If the
|
|
builtin does not exist (for example, if the frozen file was produced by
|
|
a copy of @code{m4} compiled with changeword support, but the version
|
|
of @code{m4} reloading was compiled without it), the reload is silent,
|
|
but any subsequent use of the definition of @var{str1} will result in
|
|
a warning. This directive may appear more than once for the same name,
|
|
and its order, along with @samp{T}, is important. If omitted, you will
|
|
have no access to any builtins.
|
|
|
|
@item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
|
|
Uses @var{str1} and @var{str2} as the begin-quote and end-quote
|
|
strings. If omitted, then @samp{`} and @samp{'} are the quote
|
|
delimiters.
|
|
|
|
@item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
|
|
Defines, through @code{pushdef}, a definition for @var{str1}
|
|
expanding to the text given by @var{str2}. This directive may appear
|
|
more than once for the same name, and its order, along with @samp{F}, is
|
|
important.
|
|
|
|
@item V @var{number} @key{NL}
|
|
Confirms the format of the file. @code{m4} @value{VERSION} only creates
|
|
and understands frozen files where @var{number} is 1. This directive
|
|
must be the first non-comment in the file, and may not appear more than
|
|
once.
|
|
@end table
|
|
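
To make the directive layout concrete, here is a small frozen file
written by hand from the descriptions above.  It is an illustrative
sketch rather than output captured from @code{m4}; the macro name
@code{greet} is hypothetical.  The file confirms format version 1,
gives access to the @code{define} builtin, defines one text macro, and
selects diversion 0 with an empty string (hence the empty line after
the @samp{D} directive):

@comment ignore
@example
# Hand-written frozen file, for illustration only
V1
F6,6
definedefine
T5,13
greetHello, world!
D0,0

@end example

@noindent
Note that @samp{6,6} gives the lengths of @samp{define} and
@samp{define}, while @samp{5,13} gives the lengths of @samp{greet} and
@samp{Hello, world!}; the two strings are written back to back with no
separator.  Reloading such a file with @option{-R} should leave only
@code{define} and @code{greet} defined, since no other @samp{F}
directives are present.
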
|
|
@node Compatibility
|
|
@chapter Compatibility with other versions of @code{m4}
|
|
|
|
@cindex compatibility
|
|
This chapter describes many of the differences between this
|
|
implementation of @code{m4} and other implementations found under
|
|
UNIX, such as System V Release 4, Solaris, and BSD flavors.
|
|
In particular, it lists the known differences and extensions to
|
|
POSIX. However, the list is not necessarily comprehensive.
|
|
|
|
At the time of this writing, POSIX 2001 (also known as IEEE
|
|
Std 1003.1-2001) is the latest standard, although a new version of
|
|
POSIX is under development and includes several proposals for
|
|
modifying what @code{m4} is required to do. The requirements for
|
|
@code{m4} are shared between SUSv3 and POSIX, and
|
|
can be viewed at
|
|
@uref{https://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}.
|
|
|
|
@menu
|
|
* Extensions:: Extensions in GNU M4
|
|
* Incompatibilities:: Facilities in System V m4 not in GNU M4
|
|
* Other Incompatibilities:: Other incompatibilities
|
|
@end menu
|
|
|
|
@node Extensions
|
|
@section Extensions in GNU M4
|
|
|
|
@cindex GNU extensions
|
|
@cindex POSIX
|
|
This version of @code{m4} contains a few facilities that do not exist
|
|
in System V @code{m4}. These extra facilities are all suppressed by
|
|
using the @option{-G} command line option (@pxref{Limits control, ,
|
|
Invoking m4}), unless overridden by other command line options.
|
|
|
|
@itemize @bullet
|
|
@item
|
|
In the @code{$@var{n}} notation for macro arguments, @var{n} can contain
|
|
several digits, while the System V @code{m4} only accepts one digit.
|
|
This allows macros in GNU @code{m4} to take any number of
|
|
arguments, and not only nine (@pxref{Arguments}).
|
|
|
|
This means that @code{define(`foo', `$11')} is ambiguous between
|
|
implementations. To portably choose between grabbing the first
|
|
parameter and appending 1 to the expansion, or grabbing the eleventh
|
|
parameter, you can do the following:
|
|
|
|
@example
|
|
define(`a1', `A1')
|
|
@result{}
|
|
dnl First argument, concatenated with 1
|
|
define(`_1', `$1')define(`first1', `_1($@@)1')
|
|
@result{}
|
|
dnl Eleventh argument, portable
|
|
define(`_9', `$9')define(`eleventh', `_9(shift(shift($@@)))')
|
|
@result{}
|
|
dnl Eleventh argument, GNU style
|
|
define(`Eleventh', `$11')
|
|
@result{}
|
|
first1(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
|
|
@result{}A1
|
|
eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
|
|
@result{}k
|
|
Eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
|
|
@result{}k
|
|
@end example
|
|
|
|
@noindent
|
|
Also see the @code{argn} macro (@pxref{Shift}).
|
|
|
|
@item
|
|
The @code{divert} (@pxref{Divert}) macro can manage more than 9
|
|
diversions. GNU @code{m4} treats all positive numbers as valid
|
|
diversions, rather than discarding diversions greater than 9.
|
|
|
|
@item
|
|
Files included with @code{include} and @code{sinclude} are sought in a
|
|
user specified search path, if they are not found in the working
|
|
directory. The search path is specified by the @option{-I} option and the
|
|
@env{M4PATH} environment variable (@pxref{Search Path}).
|
|
|
|
@item
|
|
Arguments to @code{undivert} can be non-numeric, in which case the named
|
|
file will be included uninterpreted in the output (@pxref{Undivert}).
|
|
|
|
@item
|
|
Formatted output is supported through the @code{format} builtin, which
|
|
is modeled after the C library function @code{printf} (@pxref{Format}).
|
|
|
|
@item
|
|
Searches and text substitution through basic regular expressions are
|
|
supported by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
|
|
(@pxref{Patsubst}) builtins. Some BSD implementations use
|
|
extended regular expressions instead.
|
|
|
|
@item
|
|
The output of shell commands can be read into @code{m4} with
|
|
@code{esyscmd} (@pxref{Esyscmd}).
|
|
|
|
@item
|
|
There is indirect access to any builtin macro with @code{builtin}
|
|
(@pxref{Builtin}).
|
|
|
|
@item
|
|
Macros can be called indirectly through @code{indir} (@pxref{Indir}).
|
|
|
|
@item
|
|
The name of the program, the current input file, and the current input
|
|
line number are accessible through the builtins @code{@w{__program__}},
|
|
@code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
|
|
|
|
@item
|
|
The format of the output from @code{dumpdef} and macro tracing can be
|
|
controlled with @code{debugmode} (@pxref{Debug Levels}).
|
|
|
|
@item
|
|
The destination of trace and debug output can be controlled with
|
|
@code{debugfile} (@pxref{Debug Output}).
|
|
|
|
@item
|
|
The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
|
|
creating a new file with a unique name on every invocation, rather than
|
|
following the insecure behavior of replacing the trailing @samp{X}
|
|
characters with the @code{m4} process id.
|
|
|
|
@item
|
|
POSIX only requires support for the command line options
|
|
@option{-s}, @option{-D}, and @option{-U}, so all other options accepted
|
|
by GNU M4 are extensions. @xref{Invoking m4}, for a
|
|
description of these options.
|
|
|
|
The debugging and tracing facilities in GNU @code{m4} are much
|
|
more extensive than in most other versions of @code{m4}.
|
|
@end itemize
|
|
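
Input that is meant to run under more than one implementation can test
whether these extensions are available before relying on them.  This is
a minimal sketch using the @code{@w{__gnu__}} macro, which GNU
@code{m4} defines while its extensions are enabled; the message texts
are arbitrary:

@example
ifdef(`__gnu__', `GNU extensions enabled', `traditional mode')
@result{}GNU extensions enabled
@end example

@noindent
When @code{m4} is run with @option{-G}, the same input is expected to
select the second branch instead.
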
|
|
@node Incompatibilities
|
|
@section Facilities in System V @code{m4} not in GNU @code{m4}
|
|
|
|
The version of @code{m4} from System V contains a few facilities that
|
|
have not been implemented in GNU @code{m4} yet. Additionally,
|
|
POSIX requires some behaviors that GNU @code{m4} has not
|
|
implemented yet. Relying on these behaviors is non-portable, as a
|
|
future release of GNU @code{m4} may change.
|
|
|
|
@itemize @bullet
|
|
@item
|
|
POSIX requires support for multiple arguments to @code{defn},
|
|
without any clarification on how @code{defn} behaves when one of the
|
|
multiple arguments names a builtin. System V @code{m4} and some other
|
|
implementations allow mixing builtins and text macros into a single
|
|
macro. GNU @code{m4} only supports joining multiple text
|
|
arguments, although a future implementation may lift this restriction to
|
|
behave more like System V@. The only portable way to join text macros
|
|
with builtins is via helper macros and implicit concatenation of macro
|
|
results.
|
|
|
|
@item
|
|
POSIX requires an application to exit with non-zero status if
|
|
it wrote an error message to stderr. This has not yet been consistently
|
|
implemented for the various builtins that are required to issue an error
|
|
(such as @code{eval} (@pxref{Eval}) when an argument cannot be parsed).
|
|
|
|
@item
|
|
Some traditional implementations only allow reading standard input
|
|
once, but GNU @code{m4} correctly handles multiple instances
|
|
of @samp{-} on the command line.
|
|
|
|
@item
|
|
POSIX requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
|
|
(first-in, first-out) order, but GNU @code{m4} currently uses
|
|
LIFO order. Furthermore, POSIX states that only the first
|
|
argument to @code{m4wrap} is saved for later evaluation, but
|
|
GNU @code{m4} saves and processes all arguments, with output
|
|
separated by spaces.
|
|
|
|
@item
|
|
POSIX states that builtins that require arguments, but are
|
|
called without arguments, have undefined behavior. Traditional
|
|
implementations simply behave as though empty strings had been passed.
|
|
For example, @code{a`'define`'b} would expand to @code{ab}. But
|
|
GNU @code{m4} ignores certain builtins if they have missing
|
|
arguments, giving @code{adefineb} for the above example.
|
|
|
|
@item
|
|
Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
|
|
by undefining the entire stack of previous definitions, as if doing
|
|
@code{undefine(`f')} first. GNU @code{m4} replaces just the top
|
|
definition on the stack, as if doing @code{popdef(`f')} followed by
|
|
@code{pushdef(`f',`1')}. POSIX allows either behavior.
|
|
|
|
@item
|
|
POSIX 2001 requires @code{syscmd} (@pxref{Syscmd}) to evaluate
|
|
command output for macro expansion, but this was a mistake that is
|
|
anticipated to be corrected in the next version of POSIX.
|
|
GNU @code{m4} follows traditional behavior in @code{syscmd}
|
|
where output is not rescanned, and provides the extension @code{esyscmd}
|
|
that does scan the output.
|
|
|
|
@item
|
|
At one point, POSIX required @code{changequote(@var{arg})}
|
|
(@pxref{Changequote}) to use newline as the close quote, but this was a
|
|
bug, and the next version of POSIX is anticipated to state
|
|
that using empty strings or just one argument is unspecified.
|
|
Meanwhile, the GNU @code{m4} behavior of treating an empty
|
|
end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
|
|
repeating the start-quote delimiter, and BSD treats it as leaving the
|
|
previous end-quote delimiter unchanged. For predictable results, never
|
|
call @code{changequote} with just one argument, or with empty strings for
|
|
arguments.
|
|
|
|
@item
|
|
At one point, POSIX required @code{changecom(@var{arg},)}
|
|
(@pxref{Changecom}) to make it impossible to end a comment, but this is
|
|
a bug, and the next version of POSIX is anticipated to state
|
|
that using empty strings is unspecified. Meanwhile, the GNU
|
|
@code{m4} behavior of treating an empty end-comment delimiter as newline
|
|
is not portable, as BSD treats it as leaving the previous end-comment
|
|
delimiter unchanged. It is also impossible in BSD implementations to
|
|
disable comments, even though that is required by POSIX. For
|
|
predictable results, never call @code{changecom} with empty strings for
|
|
arguments.
|
|
|
|
@item
|
|
Most implementations of @code{m4} give macros a higher precedence than
|
|
comments when parsing, meaning that if the start delimiter given to
|
|
@code{changecom} (@pxref{Changecom}) starts with a macro name, comments
|
|
are effectively disabled. POSIX does not specify what the
|
|
precedence is, so the parser in this version of GNU @code{m4}
|
|
recognizes comments, then macros, then quoted strings.
|
|
|
|
@item
|
|
Traditional implementations allow argument collection, but not string
|
|
and comment processing, to span file boundaries. Thus, if @file{a.m4}
|
|
contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
|
|
@kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
|
|
gives an error message that the end of file was encountered inside a
|
|
macro with GNU @code{m4}. On the other hand, traditional
|
|
implementations do end of file processing for files included with
|
|
@code{include} or @code{sinclude} (@pxref{Include}), while GNU
|
|
@code{m4} seamlessly integrates the content of those files. Thus
|
|
@code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of
|
|
giving an error.
|
|
|
|
@item
|
|
Traditional @code{m4} treats @code{traceon} (@pxref{Trace}) without
|
|
arguments as a global variable, independent of named macro tracing.
|
|
Also, once a macro is undefined, named tracing of that macro is lost.
|
|
On the other hand, when GNU @code{m4} encounters
|
|
@code{traceon} without
|
|
arguments, it turns tracing on for all existing definitions at the time,
|
|
but does not trace future definitions; @code{traceoff} without arguments
|
|
turns tracing off for all definitions regardless of whether they were
|
|
also traced by name; and tracing by name, such as with @option{-tfoo} at
|
|
the command line or @code{traceon(`foo')} in the input, is an attribute
|
|
that is preserved even if the macro is currently undefined.
|
|
|
|
Additionally, while POSIX requires trace output, it makes no
|
|
demands on the formatting of that output. Parsing trace output is not
|
|
guaranteed to be reliable, even between different releases of
|
|
GNU M4; however, the intent is that any future changes in
|
|
trace output will only occur under the direction of additional
|
|
@code{debugmode} flags (@pxref{Debug Levels}).
|
|
|
|
@item
|
|
POSIX requires @code{eval} (@pxref{Eval}) to treat all
|
|
operators with the same precedence as C@. However, earlier versions of
|
|
GNU @code{m4} followed the traditional behavior of other
|
|
@code{m4} implementations, where bitwise and logical negation (@samp{~}
|
|
and @samp{!}) had lower precedence than equality operators; and where
|
|
equality operators (@samp{==} and @samp{!=}) had the same precedence as
|
|
relational operators (such as @samp{<}). Use explicit parentheses to
|
|
ensure proper precedence. As extensions to POSIX,
|
|
GNU @code{m4} gives well-defined semantics to operations that
|
|
C leaves undefined, such as when overflow occurs, when shifting negative
|
|
numbers, or when performing division by zero. POSIX also
|
|
requires @samp{=} to cause an error, but many traditional
|
|
implementations allowed it as an alias for @samp{==}.
|
|
|
|
@item
|
|
POSIX 2001 requires @code{translit} (@pxref{Translit}) to
|
|
treat each character of the second and third arguments literally.
|
|
However, it is anticipated that the next version of POSIX will
|
|
allow the GNU @code{m4} behavior of treating @samp{-} as a
|
|
range operator.
|
|
|
|
@item
|
|
POSIX requires @code{m4} to honor the locale environment
|
|
variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
|
|
@env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
|
|
implemented in GNU @code{m4}.
|
|
|
|
@item
|
|
POSIX states that only unquoted leading newlines and blanks
|
|
(that is, space and tab) are ignored when collecting macro arguments.
|
|
However, this appears to be a bug in POSIX, since most
|
|
traditional implementations also ignore all whitespace (formfeed,
|
|
carriage return, and vertical tab). GNU @code{m4} follows
|
|
tradition and ignores all leading unquoted whitespace.
|
|
|
|
@item
|
|
@cindex @env{POSIXLY_CORRECT}
|
|
A strictly-compliant POSIX client is not allowed to use
command-line arguments not specified by POSIX.  However, since
this version of M4 ignores @env{POSIXLY_CORRECT} and enables the option
@option{--gnu} by default (@pxref{Limits control, , Invoking m4}), a
client desiring to be strictly compliant has no way to disable
GNU extensions that conflict with POSIX when
directly invoking the compiled @code{m4}.  A future version of
GNU M4 will honor the environment variable @env{POSIXLY_CORRECT},
implicitly enabling @option{--traditional} if it is set, in order to
allow a strictly-compliant client.  In the meantime, a client needing
strict POSIX compliance can use the workaround of invoking a
shell script wrapper, where the wrapper then adds @option{--traditional}
to the arguments passed to the compiled @code{m4}.
|
|
@end itemize
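
As a brief illustration of two of the points above, the following
sketch uses explicit parentheses so that @code{eval} gives the same
answer regardless of which precedence rules an implementation follows,
and then relies on the GNU behavior of treating @samp{-} as a
range operator in @code{translit}.  The expressions are chosen only for
demonstration.

@example
eval(`!(0 == 1)')
@result{}1
eval(`(1 < 2) == (3 < 4)')
@result{}1
translit(`hello', `a-z', `A-Z')
@result{}HELLO
@end example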
|
|
|
|
@node Other Incompatibilities
|
|
@section Other incompatibilities
|
|
|
|
There are a few other incompatibilities between this implementation of
@code{m4} and the System V version.
|
|
|
|
@itemize @bullet
|
|
@item
|
|
GNU @code{m4} implements sync lines differently from System V
@code{m4} when text is being diverted.  GNU @code{m4} outputs
the sync lines at the time the text is diverted, whereas System V
@code{m4} outputs them when the diverted text is brought back.
|
|
|
|
The problem is which lines and file names should be attached to text
|
|
that is being, or has been, diverted. System V @code{m4} regards all
|
|
the diverted text as being generated by the source line containing the
|
|
@code{undivert} call, whereas GNU @code{m4} regards the
|
|
diverted text as being generated at the time it is diverted.
|
|
|
|
The sync line option is used mostly when using @code{m4} as
|
|
a front end to a compiler. If a diverted line causes a compiler error,
|
|
the error messages should most probably refer to the place where the
|
|
diversion was made, and not where it was inserted again.
|
|
|
|
@comment options: -s
|
|
@example
divert(2)2
divert(1)1
divert`'0
@result{}#line 3 "stdin"
@result{}0
^D
@result{}#line 2 "stdin"
@result{}1
@result{}#line 1 "stdin"
@result{}2
@end example
|
|
|
|
The current @code{m4} implementation has a limitation that the syncline
|
|
output at the start of each diversion occurs no matter what, even if the
|
|
previous diversion did not end with a newline. This goes contrary to
|
|
the claim that synclines appear on a line by themselves, so this
|
|
limitation may be corrected in a future version of @code{m4}. In the
|
|
meantime, when using @option{-s}, it is wisest to make sure all
|
|
diversions end with newline.
|
|
|
|
@item
|
|
GNU @code{m4} makes no attempt at prohibiting self-referential
|
|
definitions like:
|
|
|
|
@example
define(`x', `x')
@result{}
define(`x', `x ')
@result{}
@end example
|
|
|
|
@cindex rescanning
|
|
There is nothing inherently wrong with defining @samp{x} to
|
|
return @samp{x}. The wrong thing is to expand @samp{x} unquoted,
|
|
because that would cause an infinite rescan loop.
|
|
In @code{m4}, one might use macros to hold strings, as we do for
|
|
variables in other programming languages, further checking them with:
|
|
|
|
@comment ignore
|
|
@example
|
|
ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
|
|
@end example
|
|
|
|
@noindent
|
|
In cases like this one, prohibiting a macro from holding its own name
would be a useless limitation.  Of course, this leaves more rope for
GNU @code{m4} users to hang themselves!  Rescanning hangs may be
avoided through careful programming, much as endless loops are avoided
in traditional programming languages.
|
|
@end itemize
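
For instance, the following sketch (the macro name @samp{x} and the
message strings are chosen only for illustration) inspects a
self-referential definition safely, by always going through
@code{defn} or quoting rather than expanding the bare name:

@example
define(`x', `x')
@result{}
defn(`x')
@result{}x
ifelse(defn(`x'), `x', `holds its own name', `holds something else')
@result{}holds its own name
@end example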
|
|
|
|
@node Answers
|
|
@chapter Correct version of some examples
|
|
|
|
Some of the examples in this manual are buggy or not very robust, for
demonstration purposes.  Improved versions of these composite macros are
presented here.
|
|
|
|
@menu
* Improved exch::               Solution for @code{exch}
* Improved forloop::            Solution for @code{forloop}
* Improved foreach::            Solution for @code{foreach}
* Improved copy::               Solution for @code{copy}
* Improved m4wrap::             Solution for @code{m4wrap}
* Improved cleardivert::        Solution for @code{cleardivert}
* Improved capitalize::         Solution for @code{capitalize}
* Improved fatal_error::        Solution for @code{fatal_error}
@end menu
|
|
|
|
@node Improved exch
|
|
@section Solution for @code{exch}
|
|
|
|
The @code{exch} macro (@pxref{Arguments}) as presented requires clients
|
|
to double quote their arguments. A nicer definition, which lets
|
|
clients follow the rule of thumb of one level of quoting per level of
|
|
parentheses, involves adding quotes in the definition of @code{exch}, as
|
|
follows:
|
|
|
|
@example
define(`exch', ``$2', `$1'')
@result{}
define(exch(`expansion text', `macro'))
@result{}
macro
@result{}expansion text
@end example
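
The extra quoting also protects arguments that happen to name macros.
In the following sketch, which assumes a hypothetical macro
@code{active}, the swapped arguments reach @code{define} still quoted,
so @code{active} itself is redefined rather than a macro named by its
expansion:

@example
define(`exch', ``$2', `$1'')
@result{}
define(`active', `ACT')
@result{}
define(exch(`value', `active'))
@result{}
active
@result{}value
ifdef(`ACT', `yes', `no')
@result{}no
@end example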
|
|
|
|
@node Improved forloop
|
|
@section Solution for @code{forloop}
|
|
|
|
The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
|
|
into an infinite loop if given an iterator that is not parsed as a macro
|
|
name. It does not do any sanity checking on its numeric bounds, and
|
|
only permits decimal numbers for bounds. Here is an improved version,
|
|
shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
|
|
version also optimizes overhead by calling four macros instead of six
|
|
per iteration (excluding those in @var{text}), by not dereferencing the
|
|
@var{iterator} in the helper @code{@w{_forloop}}.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -d -I examples}
|
|
undivert(`forloop2.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# forloop(var, from, to, stmt) - improved version:
|
|
@result{}# works even if VAR is not a strict macro name
|
|
@result{}# performs sanity check that FROM is larger than TO
|
|
@result{}# allows complex numerical expressions in TO and FROM
|
|
@result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
|
|
@result{} `pushdef(`$1')_$0(`$1', eval(`$2'),
|
|
@result{} eval(`$3'), `$4')popdef(`$1')')')
|
|
@result{}define(`_forloop',
|
|
@result{} `define(`$1', `$2')$4`'ifelse(`$2', `$3', `',
|
|
@result{} `$0(`$1', incr(`$2'), `$3', `$4')')')
|
|
@result{}divert`'dnl
|
|
include(`forloop2.m4')
|
|
@result{}
|
|
forloop(`i', `2', `1', `no iteration occurs')
|
|
@result{}
|
|
forloop(`', `1', `2', ` odd iterator name')
|
|
@result{} odd iterator name odd iterator name
|
|
forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
|
|
@result{} 0xa 0xb 0xc
|
|
forloop(`i', `a', `b', `non-numeric bounds')
|
|
@error{}m4:stdin:6: bad expression in eval (bad input): (a) <= (b)
|
|
@result{}
|
|
@end example
|
|
|
|
One other change to notice is that the improved version used @samp{_$0}
|
|
rather than @samp{_forloop} to invoke the helper routine. In general,
|
|
this is a good practice to follow, because then the set of macros can be
|
|
uniformly transformed. The following example shows a transformation
|
|
that doubles the current quoting and appends a suffix @samp{2} to each
|
|
transformed macro. If @code{forloop} refers to the literal
|
|
@samp{_forloop}, then @code{forloop2} invokes @code{_forloop} instead of
|
|
the intended @code{_forloop2}, and the mixing of quoting paradigms leads
|
|
to an infinite recursion loop in this example.
|
|
|
|
@comment options: -L9
|
|
@comment status: 1
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -d -L 9 -I examples}
|
|
define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
|
|
@result{}
|
|
define(`double', `define(`$1'`2',
|
|
arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
|
|
@result{}
|
|
double(`forloop')double(`_forloop')defn(`forloop2')
|
|
@result{}ifelse(eval(``($2) <= ($3)''), ``1'',
|
|
@result{} ``pushdef(``$1'')_$0(``$1'', eval(``$2''),
|
|
@result{} eval(``$3''), ``$4'')popdef(``$1'')'')
|
|
forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
|
|
@result{}
|
|
changequote(`[', `]')changequote([``], [''])
|
|
@result{}
|
|
forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
|
|
@result{}
|
|
changequote`'include(`forloop.m4')
|
|
@result{}
|
|
double(`forloop')double(`_forloop')defn(`forloop2')
|
|
@result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'')
|
|
forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
|
|
@result{}
|
|
changequote(`[', `]')changequote([``], [''])
|
|
@result{}
|
|
forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
|
|
@error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
|
|
@end example
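
The benefit of @samp{_$0} can also be seen on a smaller scale.  In this
sketch, all four macro names are hypothetical; copying @code{outer}
under a new name with @code{defn} automatically makes the copy call the
helper that matches its own name:

@example
define(`outer', `_$0(`$1')')define(`_outer', `[first helper: $1]')
@result{}
define(`renamed', defn(`outer'))define(`_renamed', `[second helper: $1]')
@result{}
outer(`x') renamed(`x')
@result{}[first helper: x] [second helper: x]
@end example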
|
|
|
|
One more optimization is still possible. Instead of repeatedly
|
|
assigning a variable then invoking or dereferencing it, it is possible
|
|
to pass the current iterator value as a single argument. Coupled with
|
|
@code{curry} if other arguments are needed (@pxref{Composition}), or
|
|
with helper macros if the argument is needed in more than one place in
|
|
the expansion, the output can be generated with three, rather than four,
|
|
macros of overhead per iteration. Notice how the file
|
|
@file{m4-@value{VERSION}/@/examples/@/forloop3.m4} rearranges the
|
|
arguments of the helper @code{_forloop} to take two arguments that are
|
|
placed around the current value. By splitting a balanced set of
|
|
parentheses across multiple arguments, the helper macro can now be
|
|
shared by @code{forloop} and the new @code{forloop_arg}.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`forloop3.m4')
|
|
@result{}
|
|
undivert(`forloop3.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# forloop_arg(from, to, macro) - invoke MACRO(value) for
|
|
@result{}# each value between FROM and TO, without define overhead
|
|
@result{}define(`forloop_arg', `ifelse(eval(`($1) <= ($2)'), `1',
|
|
@result{} `_forloop(`$1', eval(`$2'), `$3(', `)')')')
|
|
@result{}# forloop(var, from, to, stmt) - refactored to share code
|
|
@result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
|
|
@result{} `pushdef(`$1')_forloop(eval(`$2'), eval(`$3'),
|
|
@result{} `define(`$1',', `)$4')popdef(`$1')')')
|
|
@result{}define(`_forloop',
|
|
@result{} `$3`$1'$4`'ifelse(`$1', `$2', `',
|
|
@result{} `$0(incr(`$1'), `$2', `$3', `$4')')')
|
|
@result{}divert`'dnl
|
|
forloop(`i', `1', `3', ` i')
|
|
@result{} 1 2 3
|
|
define(`echo', `$@@')
|
|
@result{}
|
|
forloop_arg(`1', `3', ` echo')
|
|
@result{} 1 2 3
|
|
include(`curry.m4')
|
|
@result{}
|
|
forloop_arg(`1', `3', `curry(`pushdef', `a')')
|
|
@result{}
|
|
a
|
|
@result{}3
|
|
popdef(`a')a
|
|
@result{}2
|
|
popdef(`a')a
|
|
@result{}1
|
|
popdef(`a')a
|
|
@result{}a
|
|
@end example
|
|
|
|
Of course, it is possible to make even more improvements, such as
|
|
adding an optional step argument, or allowing iteration through
|
|
descending sequences. GNU Autoconf provides some of these
|
|
additional bells and whistles in its @code{m4_for} macro.
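
As a rough sketch of the first of these ideas, the following variant
adds a step argument.  The names @code{forstep} and @code{_forstep} are
hypothetical and are not shipped in the @file{examples} directory; the
sketch only handles a positive step over an ascending range, and expands
to nothing otherwise.

@example
define(`forstep', `ifelse(eval(`($4) > 0 && ($2) <= ($3)'), `1',
  `pushdef(`$1')_$0(`$1', eval(`$2'), eval(`$3'), eval(`$4'),
    `$5')popdef(`$1')')')
@result{}
define(`_forstep',
  `define(`$1', `$2')$5`'ifelse(eval(`$2 + $4 <= $3'), `1',
    `$0(`$1', eval(`$2 + $4'), `$3', `$4', `$5')')')
@result{}
forstep(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
forstep(`i', `1', `10', `0', ` i')
@result{}
@end example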
|
|
|
|
@node Improved foreach
|
|
@section Solution for @code{foreach}
|
|
|
|
The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
|
|
presented earlier each have flaws. First, we will examine and fix the
|
|
quadratic behavior of @code{foreachq}:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`foreachq.m4')
|
|
@result{}
|
|
traceon(`shift')debugmode(`aq')
|
|
@result{}
|
|
foreachq(`x', ``1', `2', `3', `4'', `x
|
|
')dnl
|
|
@result{}1
|
|
@error{}m4trace: -3- shift(`1', `2', `3', `4')
|
|
@error{}m4trace: -2- shift(`1', `2', `3', `4')
|
|
@result{}2
|
|
@error{}m4trace: -4- shift(`1', `2', `3', `4')
|
|
@error{}m4trace: -3- shift(`2', `3', `4')
|
|
@error{}m4trace: -3- shift(`1', `2', `3', `4')
|
|
@error{}m4trace: -2- shift(`2', `3', `4')
|
|
@result{}3
|
|
@error{}m4trace: -5- shift(`1', `2', `3', `4')
|
|
@error{}m4trace: -4- shift(`2', `3', `4')
|
|
@error{}m4trace: -3- shift(`3', `4')
|
|
@error{}m4trace: -4- shift(`1', `2', `3', `4')
|
|
@error{}m4trace: -3- shift(`2', `3', `4')
|
|
@error{}m4trace: -2- shift(`3', `4')
|
|
@result{}4
|
|
@error{}m4trace: -6- shift(`1', `2', `3', `4')
|
|
@error{}m4trace: -5- shift(`2', `3', `4')
|
|
@error{}m4trace: -4- shift(`3', `4')
|
|
@error{}m4trace: -3- shift(`4')
|
|
@end example
|
|
|
|
@cindex quadratic behavior, avoiding
|
|
@cindex avoiding quadratic behavior
|
|
Each successive iteration was adding more quoted @code{shift}
|
|
invocations, and the entire list contents were passing through every
|
|
iteration. In general, when recursing, it is a good idea to make the
|
|
recursion use fewer arguments, rather than adding additional quoted
|
|
uses of @code{shift}. By doing so, @code{m4} uses less memory, invokes
|
|
fewer macros, is less likely to run into machine limits, and most
|
|
importantly, performs faster. The fixed version of @code{foreachq} can
|
|
be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`foreachq2.m4')
|
|
@result{}
|
|
undivert(`foreachq2.m4')dnl
|
|
@result{}include(`quote.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
|
|
@result{}# quoted list, improved version
|
|
@result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')')
|
|
@result{}define(`_arg1q', ``$1'')
|
|
@result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
|
|
@result{}define(`_foreachq', `ifelse(`$2', `', `',
|
|
@result{} `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
|
|
@result{}divert`'dnl
|
|
traceon(`shift')debugmode(`aq')
|
|
@result{}
|
|
foreachq(`x', ``1', `2', `3', `4'', `x
|
|
')dnl
|
|
@result{}1
|
|
@error{}m4trace: -3- shift(`1', `2', `3', `4')
|
|
@result{}2
|
|
@error{}m4trace: -3- shift(`2', `3', `4')
|
|
@result{}3
|
|
@error{}m4trace: -3- shift(`3', `4')
|
|
@result{}4
|
|
@end example
|
|
|
|
Note that the fixed version calls unquoted helper macros in
@code{@w{_foreachq}} to trim elements immediately; those helper macros
in turn must re-supply the layer of quotes lost in the macro invocation.
Contrast the use of @code{@w{_arg1q}}, which quotes the first list
element, with @code{@w{_arg1}} of the earlier implementation that
returned the first list element directly.  Additionally, by calling the
helper macros immediately, the @samp{defn(`@var{iterator}')} no longer
contains unexpanded macros.
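
The difference between the two helpers is easiest to see in isolation.
In this sketch, @code{active} is a hypothetical macro standing in for a
list element that happens to name a macro:

@example
define(`_arg1', `$1')define(`_arg1q', ``$1'')
@result{}
define(`active', `ACT')
@result{}
_arg1(`active', `b') / _arg1q(`active', `b')
@result{}ACT / active
@end example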
|
|
|
|
The astute @code{m4} programmer might notice that the solution above
still uses more memory and macro invocations, and thus more time, than
strictly necessary.  Note that @samp{$2}, which contains an arbitrarily
long quoted list, is expanded and rescanned three times per iteration of
@code{_foreachq}.  Furthermore, every iteration of the algorithm
effectively unboxes then reboxes the list, which costs a couple of macro
invocations.  It is possible to rewrite the algorithm for a bit more
speed by swapping the order of the arguments to @code{_foreachq} in
order to operate on an unboxed list in the first place, and by using the
fixed-length @samp{$#} instead of an arbitrary length list as the key to
end recursion.  The result is an overhead of six macro invocations per
loop (excluding any macros in @var{text}), instead of eight.  This
alternative approach is available as
@file{m4-@value{VERSION}/@/examples/@/foreachq3.m4}:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`foreachq3.m4')
|
|
@result{}
|
|
undivert(`foreachq3.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
|
|
@result{}# quoted list, alternate improved version
|
|
@result{}define(`foreachq', `ifelse(`$2', `', `',
|
|
@result{} `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
|
|
@result{}define(`_foreachq', `ifelse(`$#', `3', `',
|
|
@result{} `define(`$1', `$4')$2`'$0(`$1', `$2',
|
|
@result{} shift(shift(shift($@@))))')')
|
|
@result{}divert`'dnl
|
|
traceon(`shift')debugmode(`aq')
|
|
@result{}
|
|
foreachq(`x', ``1', `2', `3', `4'', `x
|
|
')dnl
|
|
@result{}1
|
|
@error{}m4trace: -4- shift(`x', `x
|
|
@error{}', `', `1', `2', `3', `4')
|
|
@error{}m4trace: -3- shift(`x
|
|
@error{}', `', `1', `2', `3', `4')
|
|
@error{}m4trace: -2- shift(`', `1', `2', `3', `4')
|
|
@result{}2
|
|
@error{}m4trace: -4- shift(`x', `x
|
|
@error{}', `1', `2', `3', `4')
|
|
@error{}m4trace: -3- shift(`x
|
|
@error{}', `1', `2', `3', `4')
|
|
@error{}m4trace: -2- shift(`1', `2', `3', `4')
|
|
@result{}3
|
|
@error{}m4trace: -4- shift(`x', `x
|
|
@error{}', `2', `3', `4')
|
|
@error{}m4trace: -3- shift(`x
|
|
@error{}', `2', `3', `4')
|
|
@error{}m4trace: -2- shift(`2', `3', `4')
|
|
@result{}4
|
|
@error{}m4trace: -4- shift(`x', `x
|
|
@error{}', `3', `4')
|
|
@error{}m4trace: -3- shift(`x
|
|
@error{}', `3', `4')
|
|
@error{}m4trace: -2- shift(`3', `4')
|
|
@end example
|
|
|
|
In the current version of M4, every instance of @samp{$@@} is rescanned
|
|
as it is encountered. Thus, the @file{foreachq3.m4} alternative uses
|
|
much less memory than @file{foreachq2.m4}, and executes as much as 10%
|
|
faster, since each iteration encounters fewer @samp{$@@}. However, the
|
|
implementation of rescanning every byte in @samp{$@@} is quadratic in
|
|
the number of bytes scanned (for example, making the broken version in
|
|
@file{foreachq.m4} cubic, rather than quadratic, in behavior). A future
|
|
release of M4 will improve the underlying implementation by reusing
|
|
results of previous scans, so that both styles of @code{foreachq} can
|
|
become linear in the number of bytes scanned. Notice how the
|
|
implementation injects an empty argument prior to expanding @samp{$2}
|
|
within @code{foreachq}; the helper macro @code{_foreachq} then ignores
|
|
the third argument altogether, and ends recursion when there are three
|
|
arguments left because there was nothing left to pass through
|
|
@code{shift}. Thus, each iteration only needs one @code{ifelse}, rather
|
|
than the two conditionals used in the version from @file{foreachq2.m4}.
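
The end-of-recursion test can be tried on its own.  The sketch below
uses a hypothetical macro @code{count} to show how many arguments the
helper sees, first with two list elements remaining and then with none.
The last line shows the triple @code{shift} dropping the iterator name,
the text, and the ignored third argument:

@example
define(`count', `$#')
@result{}
count(`x', `stmt', `', `1', `2') count(`x', `stmt', `')
@result{}5 3
shift(shift(shift(`x', `stmt', `', `1', `2')))
@result{}1,2
@end example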
|
|
|
|
@cindex nine arguments, more than
|
|
@cindex more than nine arguments
|
|
@cindex arguments, more than nine
|
|
So far, all of the implementations of @code{foreachq} presented have
been quadratic with M4 1.4.x.  But @code{forloop} is linear, because
each iteration parses a constant number of arguments.  So, it is
possible to design a variant that uses @code{forloop} to do the
iteration, then uses @samp{$@@} only once at the end, giving a linear
result even with older M4 implementations.  This implementation relies
on the GNU extension that @samp{$10} expands to the tenth
argument rather than the first argument concatenated with @samp{0}.  The
trick is to define an intermediate macro that repeats the text
@code{define(`$1', `$@var{n}')$2`'}, with @samp{n} set to successive
integers corresponding to each argument.  The helper macro
@code{_foreachq_} is needed in order to generate the literal sequences
such as @samp{$1} into the intermediate macro, rather than expanding
them as the arguments of @code{_foreachq}.  With this approach, no
@code{shift} calls are even needed!  Even though there are seven macros
of overhead per iteration instead of six in @file{foreachq3.m4}, the
linear scaling is apparent at relatively small list sizes.  However,
this approach will need adjustment when a future version of M4 follows
POSIX by no longer treating @samp{$10} as the tenth argument;
the anticipation is that @samp{$@{10@}} can be used instead, although
that alternative syntax is not yet supported.
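
The GNU extension on which this relies can be checked directly.  In
this sketch, @code{tenth} is a hypothetical macro name:

@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
@end example

@noindent
An implementation that follows POSIX would instead produce @samp{a0}.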
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`foreachq4.m4')
|
|
@result{}
|
|
undivert(`foreachq4.m4')dnl
|
|
@result{}include(`forloop2.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
|
|
@result{}# quoted list, version based on forloop
|
|
@result{}define(`foreachq',
|
|
@result{}`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
|
|
@result{}define(`_foreachq',
|
|
@result{}`pushdef(`$1', forloop(`$1', `3', `$#',
|
|
@result{} `$0_(`1', `2', indir(`$1'))')`popdef(
|
|
@result{} `$1')')indir(`$1', $@@)')
|
|
@result{}define(`_foreachq_',
|
|
@result{}``define(`$$1', `$$3')$$2`''')
|
|
@result{}divert`'dnl
|
|
traceon(`shift')debugmode(`aq')
|
|
@result{}
|
|
foreachq(`x', ``1', `2', `3', `4'', `x
|
|
')dnl
|
|
@result{}1
|
|
@result{}2
|
|
@result{}3
|
|
@result{}4
|
|
@end example
|
|
|
|
For yet another approach, the improved version of @code{foreach},
|
|
available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
|
|
overquotes the arguments to @code{@w{_foreach}} to begin with, using
|
|
@code{dquote_elt}. Then @code{@w{_foreach}} can just use
|
|
@code{@w{_arg1}} to remove the extra layer of quoting that was added up
|
|
front:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`foreach2.m4')
|
|
@result{}
|
|
undivert(`foreach2.m4')dnl
|
|
@result{}include(`quote.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
|
|
@result{}# parenthesized list, improved version
|
|
@result{}define(`foreach', `pushdef(`$1')_$0(`$1',
|
|
@result{} (dquote(dquote_elt$2)), `$3')popdef(`$1')')
|
|
@result{}define(`_arg1', `$1')
|
|
@result{}define(`_foreach', `ifelse(`$2', `(`')', `',
|
|
@result{} `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
|
|
@result{}divert`'dnl
|
|
traceon(`shift')debugmode(`aq')
|
|
@result{}
|
|
foreach(`x', `(`1', `2', `3', `4')', `x
|
|
')dnl
|
|
@error{}m4trace: -4- shift(`1', `2', `3', `4')
|
|
@error{}m4trace: -4- shift(`2', `3', `4')
|
|
@error{}m4trace: -4- shift(`3', `4')
|
|
@result{}1
|
|
@error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
|
|
@result{}2
|
|
@error{}m4trace: -3- shift(``2'', ``3'', ``4'')
|
|
@result{}3
|
|
@error{}m4trace: -3- shift(``3'', ``4'')
|
|
@result{}4
|
|
@error{}m4trace: -3- shift(``4'')
|
|
@end example
|
|
|
|
It is likewise possible to write a variant of @code{foreach} that
|
|
performs in linear time on M4 1.4.x; the easiest method is probably
|
|
writing a version of @code{foreach} that unboxes its list, then invokes
|
|
@code{_foreachq} as previously defined in @file{foreachq4.m4}.
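
A minimal sketch of that idea follows.  The helper name @code{unbox} is
hypothetical, and this definition of @code{foreach} is an illustration
rather than a file shipped in the @file{examples} directory; it reuses
@code{_foreachq} from @file{foreachq4.m4} unchanged.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`foreachq4.m4')
@result{}
define(`unbox', `$@@')
@result{}
define(`foreach',
  `ifelse(`$2', `', `', `_foreachq(`$1', `$3', unbox$2)')')
@result{}
foreach(`x', `(`1', `2', `3')', `<x>')
@result{}<1><2><3>
foreach(`x', `', `<x>')
@result{}
@end example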
|
|
|
|
In summary, recursion over list elements is trickier than it appeared at
first glance, but provides a powerful idiom within @code{m4} processing.
As a final demonstration, both list styles are now able to handle
several scenarios that would wreak havoc on one or both of the original
implementations.  The demonstration also points out one other difference
between the list styles.  @code{foreach} evaluates unquoted list
elements only once, in preparation for calling @code{@w{_foreach}}; the
same holds for @code{foreachq} as provided by @file{foreachq3.m4} or
@file{foreachq4.m4}.  But @code{foreachq}, as provided by
@file{foreachq2.m4}, evaluates unquoted list elements twice while
visiting the first list element, once in @code{@w{_arg1q}} and once in
@code{@w{_rest}}.  When deciding which list style to use, one must take
into account whether repeating the side effects of unquoted list
elements will have any detrimental effects.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`foreach2.m4')
|
|
@result{}
|
|
include(`foreachq2.m4')
|
|
@result{}
|
|
dnl 0-element list:
|
|
foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
|
|
@result{} /@w{ }
|
|
dnl 1-element list of empty element
|
|
foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
|
|
@result{}<> / <>
|
|
dnl 2-element list of empty elements
|
|
foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
|
|
@result{}<><> / <><>
|
|
dnl 1-element list of a comma
|
|
foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
|
|
@result{}<,> / <,>
|
|
dnl 2-element list of unbalanced parentheses
|
|
foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
|
|
@result{}<(><)> / <(><)>
|
|
define(`ab', `oops')dnl using defn(`iterator')
|
|
foreach(`x', `(`a', `b')', `defn(`x')') /dnl
|
|
foreachq(`x', ``a', `b'', `defn(`x')')
|
|
@result{}ab / ab
|
|
define(`active', `ACT, IVE')
|
|
@result{}
|
|
traceon(`active')
|
|
@result{}
|
|
dnl list of unquoted macros; expansion occurs before recursion
|
|
foreach(`x', `(active, active)', `<x>
|
|
')dnl
|
|
@error{}m4trace: -4- active -> `ACT, IVE'
|
|
@error{}m4trace: -4- active -> `ACT, IVE'
|
|
@result{}<ACT>
|
|
@result{}<IVE>
|
|
@result{}<ACT>
|
|
@result{}<IVE>
|
|
foreachq(`x', `active, active', `<x>
|
|
')dnl
|
|
@error{}m4trace: -3- active -> `ACT, IVE'
|
|
@error{}m4trace: -3- active -> `ACT, IVE'
|
|
@result{}<ACT>
|
|
@error{}m4trace: -3- active -> `ACT, IVE'
|
|
@error{}m4trace: -3- active -> `ACT, IVE'
|
|
@result{}<IVE>
|
|
@result{}<ACT>
|
|
@result{}<IVE>
|
|
dnl list of quoted macros; expansion occurs during recursion
|
|
foreach(`x', `(`active', `active')', `<x>
|
|
')dnl
|
|
@error{}m4trace: -1- active -> `ACT, IVE'
|
|
@result{}<ACT, IVE>
|
|
@error{}m4trace: -1- active -> `ACT, IVE'
|
|
@result{}<ACT, IVE>
|
|
foreachq(`x', ``active', `active'', `<x>
|
|
')dnl
|
|
@error{}m4trace: -1- active -> `ACT, IVE'
|
|
@result{}<ACT, IVE>
|
|
@error{}m4trace: -1- active -> `ACT, IVE'
|
|
@result{}<ACT, IVE>
|
|
dnl list of double-quoted macro names; no expansion
|
|
foreach(`x', `(``active'', ``active'')', `<x>
|
|
')dnl
|
|
@result{}<active>
|
|
@result{}<active>
|
|
foreachq(`x', ```active'', ``active''', `<x>
|
|
')dnl
|
|
@result{}<active>
|
|
@result{}<active>
|
|
@end example
|
|
|
|
@ignore
|
|
@comment Not worth putting in the manual, but make sure that foreach
|
|
@comment implementations behave, and that final implementation is
|
|
@comment linear.
|
|
|
|
@comment boxed recursion
|
|
|
|
@comment examples
|
|
@comment options: -Dlimit=10 -Dverbose
|
|
@example
|
|
$ @kbd{m4 -I examples -Dlimit=10 -Dverbose}
|
|
include(`loop.m4')dnl
|
|
@result{} 1 2 3 4 5 6 7 8 9 10
|
|
@end example
|
|
|
|
@comment unboxed recursion
|
|
|
|
@comment examples
|
|
@comment options: -Dlimit=10 -Dverbose -Dalt
|
|
@example
|
|
$ @kbd{m4 -I examples -Dlimit=10 -Dverbose -Dalt}
|
|
include(`loop.m4')dnl
|
|
@result{} 1 2 3 4 5 6 7 8 9 10
|
|
@end example
|
|
|
|
@comment foreach via forloop recursion
|
|
|
|
@comment examples
|
|
@comment options: -Dlimit=10 -Dverbose -Dalt=4
|
|
@example
|
|
$ @kbd{m4 -I examples -Dlimit=10 -Dverbose -Dalt=4}
|
|
include(`loop.m4')dnl
|
|
@result{} 1 2 3 4 5 6 7 8 9 10
|
|
@end example
|
|
|
|
@comment examples
|
|
@comment options: -Dlimit=2500 -Dalt=4
|
|
@example
|
|
$ @kbd{m4 -I examples -Dlimit=2500 -Dalt=4}
|
|
include(`loop.m4')dnl
|
|
@end example
|
|
|
|
@comment examples
|
|
@comment options: -Dlimit=10000 -Dalt=4
|
|
@example
|
|
$ @kbd{m4 -I examples -Dlimit=10000 -Dalt=4}
|
|
define(`foo', `divert`'len(popdef(`_foreachq')_foreachq($@@))')dnl
|
|
define(`debug', `pushdef(`_foreachq', defn(`foo'))')
|
|
@result{}
|
|
include(`loop.m4')dnl
|
|
@result{}48894
|
|
@end example
|
|
|
|
@end ignore
|
|
|
|
@node Improved copy
|
|
@section Solution for @code{copy}
|
|
|
|
The macro @code{copy} presented above
|
|
is unable to handle builtin tokens with M4 1.4.x, because it tries to
|
|
pass the builtin token through the macro @code{curry}, where it is
|
|
silently flattened to an empty string (@pxref{Composition}). Rather
|
|
than using the problematic @code{curry} to work around the limitation
|
|
that @code{stack_foreach} expects to invoke a macro that takes exactly
|
|
one argument, we can write a new macro that lets us form the exact
|
|
two-argument @code{pushdef} call sequence needed, so that we are no
|
|
longer passing a builtin token through a text macro.
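
The flattening can be observed without @code{copy} at all.  In this
sketch, @code{echo1}, @code{direct}, and @code{viatext} are hypothetical
names; the behavior shown is that of M4 1.4.x as described above:

@example
define(`echo1', `$1')
@result{}
define(`direct', defn(`divnum'))define(`viatext', echo1(defn(`divnum')))
@result{}
[direct] [viatext]
@result{}[0] []
@end example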
|
|
|
|
@deffn Composite stack_foreach_sep (@var{macro}, @var{pre}, @var{post}, @
  @var{sep})
@deffnx Composite stack_foreach_sep_lifo (@var{macro}, @var{pre}, @
  @var{post}, @var{sep})
For each of the @code{pushdef} definitions associated with @var{macro},
expand the sequence @samp{@var{pre}`'definition`'@var{post}}.
Additionally, expand @var{sep} between definitions.
@code{stack_foreach_sep} visits the oldest definition first, while
@code{stack_foreach_sep_lifo} visits the current definition first.  The
expansion may dereference @var{macro}, but should not modify it.  There
are a few special macros, such as @code{defn}, which cannot be used as
the @var{macro} parameter.
@end deffn
|
|
|
|
Note that @code{stack_foreach(`@var{macro}', `@var{action}')} is
equivalent to @code{stack_foreach_sep(`@var{macro}', `@var{action}(',
`)')}.  By supplying explicit parentheses, split among the @var{pre} and
@var{post} arguments to @code{stack_foreach_sep}, it is now possible to
construct macro calls with more than one argument, without passing
builtin tokens through a macro call.  It is likewise possible to
directly reference the stack definitions without a macro call, by
leaving @var{pre} and @var{post} empty.  Thus, a @code{copy} built on
@code{stack_foreach_sep} not only handles builtin tokens correctly, but
also executes with fewer macro invocations.
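
For instance, assuming a hypothetical one-argument macro @code{show}
and a stack named @code{s}, both ways of using the @var{pre} and
@var{post} arguments can be sketched as follows:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`stack_sep.m4')
@result{}
pushdef(`s', `1')pushdef(`s', `2')pushdef(`s', `3')
@result{}
define(`show', `<$1>')
@result{}
stack_foreach_sep(`s', `show(', `)')
@result{}<1><2><3>
stack_foreach_sep(`s', `', `', `, ')
@result{}1, 2, 3
@end example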
|
|
|
|
The new macro also adds a separator that is only output after the first
|
|
iteration of the helper @code{_stack_reverse_sep}, implemented by
|
|
prepending the original @var{sep} to @var{pre} and omitting a @var{sep}
|
|
argument in subsequent iterations. Note that the empty string that
|
|
separates @var{sep} from @var{pre} is provided as part of the fourth
|
|
argument when originally calling @code{_stack_reverse_sep}, and not by
|
|
writing @code{$4`'$3} as the third argument in the recursive call; while
|
|
the other approach would give the same output, it does so at the expense
|
|
of increasing the argument size on each iteration of
|
|
@code{_stack_reverse_sep}, which results in quadratic instead of linear
|
|
execution time. The improved stack walking macros are available in
|
|
@file{m4-@value{VERSION}/@/examples/@/stack_sep.m4}:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`stack_sep.m4')
|
|
@result{}
|
|
define(`copy', `ifdef(`$2', `errprint(`$2 already defined
|
|
')m4exit(`1')',
|
|
`stack_foreach_sep(`$1', `pushdef(`$2',', `)')')')dnl
|
|
pushdef(`a', `1')pushdef(`a', defn(`divnum'))
|
|
@result{}
|
|
copy(`a', `b')
|
|
@result{}
|
|
b
|
|
@result{}0
|
|
popdef(`b')
|
|
@result{}
|
|
b
|
|
@result{}1
|
|
pushdef(`c', `1')pushdef(`c', `2')
|
|
@result{}
|
|
stack_foreach_sep_lifo(`c', `', `', `, ')
|
|
@result{}2, 1
|
|
undivert(`stack_sep.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# stack_foreach_sep(macro, pre, post, sep)
|
|
@result{}# Invoke PRE`'defn`'POST with a single argument of each definition
|
|
@result{}# from the definition stack of MACRO, starting with the oldest, and
|
|
@result{}# separated by SEP between definitions.
|
|
@result{}define(`stack_foreach_sep',
|
|
@result{}`_stack_reverse_sep(`$1', `tmp-$1')'dnl
|
|
@result{}`_stack_reverse_sep(`tmp-$1', `$1', `$2`'defn(`$1')$3', `$4`'')')
|
|
@result{}# stack_foreach_sep_lifo(macro, pre, post, sep)
|
|
@result{}# Like stack_foreach_sep, but starting with the newest definition.
|
|
@result{}define(`stack_foreach_sep_lifo',
|
|
@result{}`_stack_reverse_sep(`$1', `tmp-$1', `$2`'defn(`$1')$3', `$4`'')'dnl
|
|
@result{}`_stack_reverse_sep(`tmp-$1', `$1')')
|
|
@result{}define(`_stack_reverse_sep',
|
|
@result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0(
|
|
@result{} `$1', `$2', `$4$3')')')
|
|
@result{}divert`'dnl
|
|
@end example
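
As a further illustration, the following sketch walks a three-element
stack in both directions with a non-empty @var{pre} and @var{post}. The
stack name @code{c} and the bracket delimiters are arbitrary choices
made for this illustration, and are not part of the shipped file.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`stack_sep.m4')
@result{}
pushdef(`c', `1')pushdef(`c', `2')pushdef(`c', `3')
@result{}
stack_foreach_sep(`c', `[', `]', `, ')
@result{}[1], [2], [3]
stack_foreach_sep_lifo(`c', `[', `]', `, ')
@result{}[3], [2], [1]
@end example

Both walks leave the definition stack of @code{c} in its original
order, since each one reverses the stack twice.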
|
|
|
|
@ignore
|
|
@comment Not worth putting in the manual, but make sure that
|
|
@comment stack_foreach_sep has linear performance.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`forloop3.m4')include(`stack_sep.m4')dnl
|
|
forloop(`i', `1', `10000', `pushdef(`s', i)')
|
|
@result{}
|
|
define(`colon', `:')define(`dash', `-')
|
|
@result{}
|
|
len(stack_foreach_sep(`s', `dash', `', `colon'))
|
|
@result{}58893
|
|
@end example
|
|
@end ignore
|
|
|
|
@node Improved m4wrap
|
|
@section Solution for @code{m4wrap}
|
|
|
|
The replacement @code{m4wrap} versions presented above, designed to
|
|
guarantee FIFO or LIFO order regardless of the underlying M4
|
|
implementation, share a bug when dealing with wrapped text that looks
|
|
like parameter expansion. Note how the invocation of
|
|
@code{m4wrap@var{n}} interprets these parameters, while using the
|
|
builtin preserves them for their intended use.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`wraplifo.m4')
|
|
@result{}
|
|
m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
|
|
')
|
|
@result{}
|
|
builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
|
|
')
|
|
@result{}
|
|
^D
|
|
@result{}bar:-a-a,b-2-
|
|
@result{}m4wrap0:---0-
|
|
@end example
|
|
|
|
Additionally, the computation of @code{_m4wrap_level} and creation of
|
|
multiple @code{m4wrap@var{n}} placeholders in the original examples are
|
|
more expensive in time and memory than strictly necessary. Notice how
|
|
the improved version grabs the wrapped text via @code{defn} to avoid
|
|
parameter expansion, then undefines @code{_m4wrap_text}, before
|
|
stripping a level of quotes with @code{_arg1} to expand the text. That
|
|
way, each level of wrapping reuses the single placeholder, which starts
|
|
each nesting level in an undefined state.
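
The interplay of @code{defn} and @code{_arg1} can be seen in isolation
in this short sketch. The macro name @code{text} is made up for
illustration; invoking it directly substitutes the parameter, while
fetching its definition with @code{defn} and stripping one level of
quotes leaves @samp{$1} untouched.

@example
define(`_arg1', `$1')
@result{}
define(`text', `$1 stays literal')
@result{}
text(`gone')
@result{}gone stays literal
_arg1(defn(`text'))
@result{}$1 stays literal
@end example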
|
|
|
|
Finally, it is worth emulating the GNU M4 extension of saving
|
|
all arguments to @code{m4wrap}, separated by a space, rather than saving
|
|
just the first argument. This is done with the @code{joinall} macro
|
|
documented previously (@pxref{Shift}). The improved LIFO example is
|
|
shipped as @file{m4-@value{VERSION}/@/examples/@/wraplifo2.m4}, and can
|
|
easily be converted to a FIFO solution by swapping the adjacent
|
|
invocations of @code{joinall} and @code{defn}.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`wraplifo2.m4')
|
|
@result{}
|
|
undivert(`wraplifo2.m4')dnl
|
|
@result{}dnl Redefine m4wrap to have LIFO semantics, improved example.
|
|
@result{}include(`join.m4')dnl
|
|
@result{}define(`_m4wrap', defn(`m4wrap'))dnl
|
|
@result{}define(`_arg1', `$1')dnl
|
|
@result{}define(`m4wrap',
|
|
@result{}`ifdef(`_$0_text',
|
|
@result{} `define(`_$0_text', joinall(` ', $@@)defn(`_$0_text'))',
|
|
@result{} `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
|
|
@result{}define(`_$0_text', joinall(` ', $@@))')')dnl
|
|
m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
|
|
')
|
|
@result{}
|
|
m4wrap(`lifo text
|
|
m4wrap(`nested', `', `$@@
|
|
')')
|
|
@result{}
|
|
^D
|
|
@result{}lifo text
|
|
@result{}foo:-a-a,b-2-
|
|
@result{}nested $@@
|
|
@end example
|
|
|
|
@node Improved cleardivert
|
|
@section Solution for @code{cleardivert}
|
|
|
|
The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
|
|
called without arguments to clear all pending diversions. That is
|
|
because using @code{undivert} with an empty string as an argument differs
from using it with no arguments at all. Compare the earlier definition
|
|
with one that takes the number of arguments into account:
|
|
|
|
@example
|
|
define(`cleardivert',
|
|
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
|
|
@result{}
|
|
divert(`1')one
|
|
divert
|
|
@result{}
|
|
cleardivert
|
|
@result{}
|
|
undivert
|
|
@result{}one
|
|
@result{}
|
|
define(`cleardivert',
|
|
`pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
|
|
`undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
|
|
@result{}
|
|
divert(`2')two
|
|
divert
|
|
@result{}
|
|
cleardivert
|
|
@result{}
|
|
undivert
|
|
@result{}
|
|
@end example
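
The corrected definition should still honor explicit arguments. As a
quick check, this sketch repeats the second definition from above and
discards only diversion 1, so that a later @code{undivert} still
recovers the text held in diversion 2.

@example
define(`cleardivert',
`pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
`undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
@result{}
divert(`1')one
divert(`2')two
divert
@result{}
cleardivert(`1')
@result{}
undivert
@result{}two
@result{}
@end example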
|
|
|
|
@node Improved capitalize
|
|
@section Solution for @code{capitalize}
|
|
|
|
The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
|
|
not allow clients to follow the quoting rule of thumb. Consider the
|
|
three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
|
|
difference between calling @code{capitalize} with the expansion of a
|
|
macro, expanding the result of a case change, and changing the case of a
|
|
double-quoted string:
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`capitalize.m4')dnl
|
|
define(`active', `act1, ive')dnl
|
|
define(`Active', `Act2, Ive')dnl
|
|
define(`ACTIVE', `ACT3, IVE')dnl
|
|
upcase(active)
|
|
@result{}ACT1,IVE
|
|
upcase(`active')
|
|
@result{}ACT3, IVE
|
|
upcase(``active'')
|
|
@result{}ACTIVE
|
|
downcase(ACTIVE)
|
|
@result{}act3,ive
|
|
downcase(`ACTIVE')
|
|
@result{}act1, ive
|
|
downcase(``ACTIVE'')
|
|
@result{}active
|
|
capitalize(active)
|
|
@result{}Act1
|
|
capitalize(`active')
|
|
@result{}Active
|
|
capitalize(``active'')
|
|
@result{}_capitalize(`active')
|
|
define(`A', `OOPS')
|
|
@result{}
|
|
capitalize(active)
|
|
@result{}OOPSct1
|
|
capitalize(`active')
|
|
@result{}OOPSctive
|
|
@end example
|
|
|
|
First, when @code{capitalize} is called with more than one argument, it
throws away the later arguments, whereas @code{upcase} and
@code{downcase} use @samp{$*} to collect them all. The fix is simple:
use @samp{$*} consistently.
|
|
|
|
Next, with single-quoting, @code{capitalize} outputs a single character,
|
|
a set of quotes, then the rest of the characters, making it impossible
|
|
to invoke @code{Active} after the fact, and allowing the alternate macro
|
|
@code{A} to interfere. Here, the solution is to use additional quoting
|
|
in the helper macros, then pass the final over-quoted output string
|
|
through @code{_arg1} to remove the extra quoting and finally invoke the
|
|
concatenated portions as a single string.
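
The effect of the extra pass through @code{_arg1} can be sketched on its
own. Here the pieces @samp{A} and @samp{ctive} stand in for the two
portions produced by the helper macros. Written side by side in running
text they are output as separate tokens, but when collected into a
single argument they are rescanned as one word and the macro
@code{Active} is found.

@example
define(`_arg1', `$1')dnl
define(`Active', `Act2, Ive')dnl
A`'ctive
@result{}Active
_arg1(`A'`ctive')
@result{}Act2, Ive
@end example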
|
|
|
|
Finally, when passed a double-quoted string, the nested macro
|
|
@code{_capitalize} is never invoked because it ends up nested inside
|
|
quotes. This one is the toughest to fix. In short, we have no idea how
|
|
many levels of quotes are in effect on the substring being altered by
|
|
@code{patsubst}. If the replacement string cannot be expressed entirely
|
|
in terms of literal text and backslash substitutions, then we need a
|
|
mechanism to guarantee that the helper macros are invoked outside of
|
|
quotes. In other words, this sounds like a job for @code{changequote}
|
|
(@pxref{Changequote}). By changing the active quoting characters, we
|
|
can guarantee that replacement text injected by @code{patsubst} always
|
|
occurs in the middle of a string that has exactly one level of
|
|
over-quoting using alternate quotes; so the replacement text closes the
|
|
quoted string, invokes the helper macros, then reopens the quoted
|
|
string. In turn, that means the replacement text has unbalanced quotes,
|
|
necessitating another round of @code{changequote}.
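
The first half of that idea, namely that text with unbalanced default
quotes is harmless while alternate quotes are active, can be sketched as
follows. The macro name @code{show} is hypothetical and exists only for
this illustration; the delimiters @samp{<<[} and @samp{]>>} are merely
one possible choice of alternate quotes.

@example
changequote(`<<[', `]>>')dnl
define(<<[show]>>, <<[<<[$1]>>]>>)dnl
show(<<[`unbalanced]>>)
@result{}`unbalanced
changequote(<<[`]>>, <<[']>>)dnl
@end example

Note that the body of @code{show} is written with the alternate quotes,
so it only behaves as intended while those quotes are in effect.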
|
|
|
|
In the fixed version below (also shipped as
|
|
@file{m4-@value{VERSION}/@/examples/@/capitalize2.m4}), @code{capitalize}
|
|
uses the alternate quotes of @samp{<<[} and @samp{]>>} (the longer
|
|
strings are chosen so as to be less likely to appear in the text being
|
|
converted). The helpers @code{_to_alt} and @code{_from_alt} merely
|
|
reduce the number of characters required to perform a
|
|
@code{changequote}, since the definition changes twice. The outermost
|
|
pair means that @code{patsubst} and @code{_capitalize_alt} are invoked
|
|
with alternate quoting; the innermost pair is used so that the third
|
|
argument to @code{patsubst} can contain an unbalanced
|
|
@samp{]>>}/@samp{<<[} pair. Note that @code{upcase} and @code{downcase}
|
|
must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since
|
|
they contain nested quotes but are invoked with the alternate quoting
|
|
scheme in effect.
|
|
|
|
@comment examples
|
|
@example
|
|
$ @kbd{m4 -I examples}
|
|
include(`capitalize2.m4')dnl
|
|
define(`active', `act1, ive')dnl
|
|
define(`Active', `Act2, Ive')dnl
|
|
define(`ACTIVE', `ACT3, IVE')dnl
|
|
define(`A', `OOPS')dnl
|
|
capitalize(active; `active'; ``active''; ```actIVE''')
|
|
@result{}Act1,Ive; Act2, Ive; Active; `Active'
|
|
undivert(`capitalize2.m4')dnl
|
|
@result{}divert(`-1')
|
|
@result{}# upcase(text)
|
|
@result{}# downcase(text)
|
|
@result{}# capitalize(text)
|
|
@result{}# change case of text, improved version
|
|
@result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
|
|
@result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
|
|
@result{}define(`_arg1', `$1')
|
|
@result{}define(`_to_alt', `changequote(`<<[', `]>>')')
|
|
@result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
|
|
@result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
|
|
@result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
|
|
@result{}define(`_capitalize_alt',
|
|
@result{} `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
|
|
@result{} <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
|
|
@result{}define(`capitalize',
|
|
@result{} `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
|
|
@result{} _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
|
|
@result{}divert`'dnl
|
|
@end example
|
|
|
|
@node Improved fatal_error
|
|
@section Solution for @code{fatal_error}
|
|
|
|
The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
|
|
of GNU M4 earlier than 1.4.8, where invoking
|
|
@code{@w{__file__}} (@pxref{Location}) inside @code{m4wrap} would result
|
|
in an empty string, and @code{@w{__line__}} resulted in @samp{0} even
|
|
though all files start at line 1. Furthermore, versions earlier than
|
|
1.4.6 did not support the @code{@w{__program__}} macro. If you want
|
|
@code{fatal_error} to work across the entire 1.4.x release series, a
|
|
better implementation would be:
|
|
|
|
@comment status: 1
|
|
@example
|
|
define(`fatal_error',
|
|
`errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
|
|
`:ifelse(__line__, `0', `',
|
|
`__file__:__line__:')` fatal error: $*
|
|
')m4exit(`1')')
|
|
@result{}
|
|
m4wrap(`divnum(`demo of internal message')
|
|
fatal_error(`inside wrapped text')')
|
|
@result{}
|
|
^D
|
|
@error{}m4:stdin:6: Warning: excess arguments to builtin `divnum' ignored
|
|
@result{}0
|
|
@error{}m4:stdin:6: fatal error: inside wrapped text
|
|
@end example
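
The same definition also works outside of wrapped text. In this sketch,
which assumes a version of GNU M4 that provides @code{@w{__program__}},
the call sits on the sixth line of standard input, and that location is
reported in the message.

@comment status: 1
@example
define(`fatal_error',
`errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
`:ifelse(__line__, `0', `',
`__file__:__line__:')` fatal error: $*
')m4exit(`1')')
@result{}
fatal_error(`at top level')
@error{}m4:stdin:6: fatal error: at top level
@end example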
|
|
|
|
@c ========================================================== Appendices
|
|
|
|
@node Copying This Package
|
|
@appendix How to make copies of the overall M4 package
|
|
@cindex License, code
|
|
|
|
This appendix covers the license for copying the source code of the
|
|
overall M4 package. This manual is under a different set of
|
|
restrictions, covered later (@pxref{Copying This Manual}).
|
|
|
|
@menu
|
|
* GNU General Public License:: License for copying the M4 package
|
|
@end menu
|
|
|
|
@node GNU General Public License
|
|
@appendixsec License for copying the M4 package
|
|
@cindex GPL, GNU General Public License
|
|
@cindex GNU General Public License
|
|
@cindex General Public License (GPL), GNU
|
|
@include gpl-3.0.texi
|
|
|
|
@node Copying This Manual
|
|
@appendix How to make copies of this manual
|
|
@cindex License, manual
|
|
|
|
This appendix covers the license for copying this manual. Note that
|
|
some of the longer examples in this manual are also distributed in the
|
|
directory @file{m4-@value{VERSION}/@/examples/}, where a more
|
|
permissive license is in effect when copying just the examples.
|
|
|
|
@menu
|
|
* GNU Free Documentation License:: License for copying this manual
|
|
@end menu
|
|
|
|
@node GNU Free Documentation License
|
|
@appendixsec License for copying this manual
|
|
@cindex FDL, GNU Free Documentation License
|
|
@cindex GNU Free Documentation License
|
|
@cindex Free Documentation License (FDL), GNU
|
|
@include fdl-1.3.texi
|
|
|
|
@node Indices
|
|
@appendix Indices of concepts and macros
|
|
|
|
@menu
|
|
* Macro index:: Index for all @code{m4} macros
|
|
* Concept index:: Index for many concepts
|
|
@end menu
|
|
|
|
@node Macro index
|
|
@appendixsec Index for all @code{m4} macros
|
|
|
|
This index covers all @code{m4} builtins, as well as several useful
|
|
composite macros. References are exclusively to the places where a
|
|
macro is first introduced.
|
|
|
|
@printindex fn
|
|
|
|
@node Concept index
|
|
@appendixsec Index for many concepts
|
|
|
|
@printindex cp
|
|
|
|
@bye
|
|
|
|
@c Local Variables:
|
|
@c coding: utf-8
|
|
@c fill-column: 72
|
|
@c ispell-local-dictionary: "american"
|
|
@c indent-tabs-mode: nil
|
|
@c whitespace-check-buffer-indent: nil
|
|
@c End:
|