@menu
* Introduction::         AutoGen's Purpose
* Definitions File::     AutoGen Definitions File
* Template File::        AutoGen Template
* Augmenting AutoGen::   Augmenting AutoGen Features
* autogen Invocation::   Invoking AutoGen
* Installation::         Configuring and Installing
* AutoOpts::             Automated Option Processing
* Add-Ons::              Add-on packages for AutoGen
* Future::               Some ideas for the future.
* Copying This Manual::  Copying This Manual
* Concept Index::        General index
* Function Index::       Function index
@end menu

@ignore
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
@end ignore
@page
@node Introduction
@chapter Introduction
@cindex Introduction

AutoGen is a tool designed for generating program files that contain
repetitive text with varied substitutions.  Its goal is to simplify the
maintenance of programs that contain large amounts of repetitious text.
This is especially valuable if there are several blocks of such text
that must be kept synchronized in parallel tables.

An obvious example is the problem of maintaining the code required for
processing program options and configuration settings.  Processing options
requires a minimum of four different constructs be kept in proper order in
different places in your program.  You need at least:

@enumerate
@item
The flag character in the flag string,
@item
code to process the flag when it is encountered,
@item
a global state variable or two, and
@item
a line in the usage text.
@end enumerate

@noindent
You will need more than this if you choose to implement long option
names, configuration (rc/ini) file processing and environment variable
settings, and to keep all the documentation for these up to date.  This can
be done mechanically, with the proper templates and this program.  In fact, it has
already been done and AutoGen itself uses it@: @xref{AutoOpts}.  For a simple
example of Automated Option processing, @xref{Quick Start}.  For a full list
of the Automated Option features, @xref{Features}.  Be forewarned, though, the
feature list is ridiculously extensive.

@menu
* Generalities::         The Purpose of AutoGen
* Example Usage::        A Simple Example
* csh/zsh caveat::       csh/zsh caveat
* Testimonial::          A User's Perspective
@end menu

@c === SECTION MARKER

@node Generalities
@section The Purpose of AutoGen

The idea of this program is to have a text file, a template if
you will, that contains the general text of the desired output file.
That file includes substitution expressions and sections of text that are
replicated under the control of separate definition files.

@cindex design goals

AutoGen was designed with the following features:

@enumerate
@item
The definitions are completely separate from the template.  Completely
isolating the definitions from the template greatly increases the
flexibility of the template implementation.  A secondary goal is that a
template user only needs to specify those data that are necessary to describe
their application of a template.

@item
Each datum in the definitions is named.  Thus, the definitions can be
rearranged, augmented and become obsolete without it being necessary to
go back and clean up older definition files.  Reduce incompatibilities!

@item
Every definition name defines an array of values, even when there is
only one entry.  These arrays of values are used to control the
replication of sections of the template.

@item
There are named collections of definitions.  They form a nested hierarchy.
Associated values are collected and associated with a group name.
These associated data are used collectively in sets of substitutions.

@item
The template has special markers to indicate where substitutions are
required, much like the @code{$@{VAR@}} construct in a shell @code{here doc}.
These markers are not fixed strings.  They are specified at the start of
each template.  Template designers know best what fits into their
syntax and can avoid marker conflicts.

We did this because it is burdensome and difficult to avoid conflicts
using either M4 tokenization or C preprocessor substitution rules.  It
also makes it easier to specify expressions that transform the value.
Of course, our expressions are less cryptic than the shell methods.

@item
These same markers are used, in conjunction with enclosed keywords, to
indicate sections of text that are to be skipped and sections of
text that are to be repeated.  This is a major improvement over using C
preprocessing macros.  With the C preprocessor, you have no way of
selecting output text because it is an @i{un}varying, mechanical
substitution process.

@item
Finally, we supply methods for carefully controlling the output.
Sometimes it is simply easier and clearer to compute some text or
a value in one context even though it is needed in a later one.  So,
functions are available for saving text or values for later use.
@end enumerate
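
The marker choice described in the fifth item can be illustrated with a
short sketch.  It assumes a template author who prefers @code{[=} and
@code{=]} as markers, perhaps because the generated text itself contains
@code{[+}.  The marker pair and the single @code{h} suffix are chosen here
only for illustration:

@example
[= AutoGen5 template h =]
/* "[+" and "+]" are ordinary text in this template */
extern char const* az_name_list[ [= (count "list") =] ];
@end example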

@c === SECTION MARKER

@node Example Usage
@section A Simple Example
@cindex example, simple AutoGen

This is just one simple example that shows a few basic features.
If you are interested, you may also run @code{make check} with the
@code{VERBOSE} environment variable set and see a number of other
examples in the @file{agen5/test/testdir} directory.

Assume you have an enumeration of names and you wish to associate some
string with each name.  Assume also, for the sake of this example,
that it is either too complex or too large to maintain easily by hand.
We will start by writing an abbreviated version of what the result
is supposed to be.  We will use that to construct our output templates.

@noindent
In a header file, @file{list.h}, you define the enumeration
and the global array containing the associated strings:

@example
typedef enum @{
        IDX_ALPHA,
        IDX_BETA,
        IDX_OMEGA @}  list_enum;

extern char const* az_name_list[ 3 ];
@end example

@noindent
Then you also have @file{list.c} that defines the actual strings:

@example
#include "list.h"
char const* az_name_list[] = @{
        "some alpha stuff",
        "more beta stuff",
        "final omega stuff" @};
@end example

@noindent
First, we will define the information that is unique for each enumeration
name/string pair.  This would be placed in a file named @file{list.def},
for example.

@example
autogen definitions list;
list = @{ list_element = alpha;
         list_info    = "some alpha stuff"; @};
list = @{ list_info    = "more beta stuff";
         list_element = beta; @};
list = @{ list_element = omega;
         list_info    = "final omega stuff"; @};
@end example

The @code{autogen definitions list;} entry defines the file as an AutoGen
definition file that uses a template named @code{list}.  That is followed by
three @code{list} entries that define the associations between the
enumeration names and the strings.  The order of the differently named
elements inside of list is unimportant.  They are reversed inside of the
@code{beta} entry and the output is unaffected.

Now, to actually create the output, we need a template or two that can be
expanded into the files you want.  In this program, we use a single template
that is capable of multiple output files.  The definitions above refer to a
@file{list} template, so it would normally be named @file{list.tpl}.

It looks something like this.
(For a full description, @xref{Template File}.)

@example
[+ AutoGen5 template h c +]
[+ CASE (suffix) +][+
   ==  h  +]
typedef enum @{[+
   FOR list "," +]
        IDX_[+ (string-upcase! (get "list_element")) +][+
   ENDFOR list +] @}  list_enum;

extern char const* az_name_list[ [+ (count "list") +] ];
[+

   ==  c  +]
#include "list.h"
char const* az_name_list[] = @{[+
  FOR list "," +]
        "[+list_info+]"[+
  ENDFOR list +] @};[+

ESAC +]
@end example

The @code{[+ AutoGen5 template h c +]} text tells AutoGen that this is
an AutoGen version 5 template file; that it is to be processed twice;
that the start macro marker is @code{[+}; and the end marker is
@code{+]}.  The template will be processed first with a suffix value of
@code{h} and then with @code{c}.  Normally, the suffix values are
appended to the @file{base-name} to create the output file name.

The @code{[+ == h +]} and @code{[+ == c +]} @code{CASE} selection clauses
select different text for the two different passes.  In this example,
the output is nearly disjoint and could have been put in two separate
templates.  However, sometimes there are common sections and this is
just an example.

The @code{[+FOR list "," +]} and @code{[+ ENDFOR list +]} clauses delimit
a block of text that will be repeated for every definition of @code{list}.
Inside of that block, the definition name-value pairs that
are members of each @code{list} are available for substitutions.

The remainder of the macros are expressions.  Some of these contain
special expression functions that are dependent on AutoGen named values;
others are simply Scheme expressions, the result of which will be
inserted into the output text.  Other expressions are names of AutoGen
values.  These values will be inserted into the output text.  For example,
@code{[+list_info+]} will result in the value associated with
the name @code{list_info} being inserted between the double quotes and
@code{(string-upcase! (get "list_element"))} will first "get" the value
associated with the name @code{list_element}, then change the case of
all the letters to upper case.  The result will be inserted into the
output document.

If you have compiled AutoGen, you can copy out the template and definitions
as described above and run @code{autogen list.def}.  This will produce
exactly the hypothesized desired output.

One more point.  Let's say you decided it was too much trouble to figure
out how to use AutoGen, so you created this enumeration and string list with
thousands of entries.  Now, requirements have changed and it has become
necessary to map a string containing the enumeration name into the enumeration
number.  With AutoGen, you just alter the template to emit the table of names.
It will be guaranteed to be in the correct order, missing none of the entries.
If you want to do that by hand, well, good luck.
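
As a sketch of that change, assume the new table is to be called
@code{az_enum_names} (a name invented here).  The @code{c} section of
@file{list.tpl} could then gain one more @code{FOR} block ahead of the
closing @code{ESAC}, built only from constructs already used above:

@example
char const* az_enum_names[] = @{[+
  FOR list "," +]
        "IDX_[+ (string-upcase! (get "list_element")) +]"[+
  ENDFOR list +] @};
@end example

@noindent
Because both tables are generated from the same @code{list} entries,
they should stay in step however many entries are added.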

@c === SECTION MARKER

@node csh/zsh caveat
@section csh/zsh caveat

AutoGen tries to use your normal shell so that you can supply shell code
in a manner you are accustomed to using.  If, however, you use csh or
zsh, you cannot do this.  Csh is sufficiently difficult to program that
it is unsupported.  Zsh, though largely programmable, also has some
anomalies that make it incompatible with AutoGen usage.  Therefore, when
invoking AutoGen from these environments, you must be certain to set the
SHELL environment variable to a Bourne-derived shell, e.g., sh, ksh or
bash.

Any shell you choose for your own scripts needs to meet these basic
requirements:

@enumerate
@item
It handles @code{trap ":" $sig} without writing to standard output.  This is done
when the server shell is first started.  If your shell does not handle this,
it may be possible to make it do so by loading functions from its startup files.
@item
At the beginning of each scriptlet, the command @code{\\cd $PWD}
is inserted.  This ensures that @code{cd} is not aliased to something
peculiar and each scriptlet starts life in the execution directory.
@item
At the end of each scriptlet, the command @code{echo mumble} is
appended.  The program you use as a shell must emit the single
argument @code{mumble} on a line by itself.
@end enumerate

@c === SECTION MARKER

@node Testimonial
@section A User's Perspective

@format
Alexandre wrote:
>
> I'd appreciate opinions from others about advantages/disadvantages of
> each of these macro packages.
@end format

I am using AutoGen in my pet project, and find one of its best points to
be that it separates the operational data from the implementation.

Indulge me for a few paragraphs, and all will be revealed:
In the manual, Bruce cites the example of maintaining command line flags
inside the source code; traditionally spreading usage information, flag
names, letters and processing across several functions (if not files).
Investing the time in writing a sort of boiler plate (a template in
AutoGen terminology) pays by moving all of the option details (usage,
flags names etc.) into a well structured table (a definition file if you
will),  so that adding a new command line option becomes a simple matter
of adding a set of details to the table.

So far so good!  Of course, now that there is a template, writing all of
that tedious optargs processing and usage functions is no longer an
issue.  Creating a table of the options needed for the new project and
running AutoGen generates all of the option processing code in C
automatically from just the tabular data.  AutoGen in fact already ships
with such a template... AutoOpts.

One final consequence of the good separation in the design of AutoGen is
that it is retargetable to a greater extent.  The
egcs/gcc/fixinc/inclhack.def can equally be used (with different
templates) to create a shell script (inclhack.sh) or a c program
(fixincl.c).

This is just the tip of the iceberg.  AutoGen is far more powerful than
these examples might indicate, and has many other varied uses.  I am
certain Bruce or I could supply you with many and varied examples, and I
would heartily recommend that you try it for your project and see for
yourself how it compares to m4.
@cindex m4

As an aside, I would be interested to see whether someone might be
persuaded to rationalise autoconf with AutoGen in place of m4...  Ben,
are you listening?  autoconf-3.0! `kay?  =)O|

@format
Sincerely,
        Gary V. Vaughan
@end format

@ignore
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
@end ignore
@page
@node Definitions File
@chapter Definitions File
@cindex definitions file
@cindex .def file

This chapter describes the syntax and semantics of the AutoGen
definition file.  In order to instantiate a template, you normally must
provide a definitions file that identifies itself and contains some
value definitions.  Consequently, we keep it very simple.  For
"advanced" users, there are preprocessing directives, sparse
arrays, named indexes and comments that may be used as well.

The definitions file is used to associate values with names.  Every
value is implicitly an array of values, even if there is only one value.
Values may be either simple strings or compound collections of
name-value pairs.  An array may not contain both simple and compound
members.  Fundamentally, it is as simple as:

@example
prog-name = "autogen";
flag = @{
    name      = templ_dirs;
    value     = L;
    descrip   = "Template search directory list";
@};
@end example

For purposes of commenting and controlling the processing of the
definitions, C-style comments and most C preprocessing directives are
honored.  The major exception is that the @code{#if} directive is
ignored, along with all following text through the matching
@code{#endif} directive.  The C preprocessor is not actually invoked, so
C macro substitution is @strong{not} performed.
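
A small sketch of how these rules might look in practice follows.  The
program name @code{demo}, the names @code{unfinished} and
@code{debug-level}, and the symbol @code{DEBUG} are all invented for
illustration:

@example
/* Definitions for a hypothetical "demo" program. */
prog-name = "demo";

#if 0
flag = @{ name = unfinished; @};
#endif

#ifdef DEBUG
debug-level = "1";
#endif
@end example

@noindent
Following the rule above, everything from @code{#if} through its
@code{#endif} is skipped whatever the expression says, while the
@code{#ifdef} block is processed only when @code{DEBUG} has been defined.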

@menu
* Identification::        The Identification Definition
* Definitions::           Named Definitions
* Index Assignments::     Assigning an Index to a Definition
* Dynamic Text::          Dynamic Text
* Directives::            Controlling What Gets Processed
* Predefines::            Pre-defined Names
* Comments::              Commenting Your Definitions
* Example::               What it all looks like.
* Full Syntax::           Finite State Machine Grammar
* Alternate Definition::  Alternate Definition Forms
@end menu

@c === SECTION MARKER

@node Identification
@section The Identification Definition
@cindex identification

The first definition in this file is used to identify it as an
AutoGen file.  It consists of the two keywords
@samp{autogen} and @samp{definitions}, followed by the default
template name and a terminating semicolon (@code{;}).  That is:

@example
        AutoGen Definitions @var{template-name};
@end example

@noindent
Note that the words @samp{AutoGen} and @samp{Definitions} are matched
without regard to case; this does not apply to @var{template-name}.
Most lookups in this program are case insensitive.

@noindent
Also, if the input contains more identification definitions,
they will be ignored.  This is done so that you may include
(@pxref{Directives}) other definition files without an identification
conflict.

@cindex template file

@noindent
AutoGen uses the name of the template to find the corresponding template
file.  It searches for the file in the following way, stopping when
it finds the file:

@enumerate
@item
It tries to open @file{./@var{template-name}}.  If it fails,
@item
it tries @file{./@var{template-name}.tpl}.
@item
It searches for either of these files in the directories listed in the
templ-dirs command line option.
@end enumerate

If AutoGen fails to find the template file in one of these places,
it prints an error message and exits.

@c === SECTION MARKER

@node Definitions
@section Named Definitions
@cindex definitions

A name is a sequence of characters beginning with an alphabetic character
(@code{a} through @code{z}) followed by zero or more alpha-numeric
characters and/or separator characters: hyphen (@code{-}), underscore
(@code{_}) or caret (@code{^}).  Names are case insensitive.

Any name may have multiple values associated with it.  Every name may be
considered a sparse array of one or more elements.  If there is more than
one value, the values may be accessed by indexing the value with
@code{[index]} or by iterating over them using the FOR (@pxref{FOR}) AutoGen
macro, as described in the next chapter.  Sparse arrays are created
by specifying an index when defining an entry
(@pxref{Index Assignments,, Assigning an Index to a Definition}).

There are two kinds of definitions, @samp{simple} and @samp{compound}.
They are defined thus (@pxref{Full Syntax}):

@example
compound_name '=' '@{' definition-list '@}' ';'

simple-name[2] '=' string ';'

no^text^name ';'
@end example

@noindent
@code{simple-name} has the third index (index number 2) defined here.
@code{No^text^name} is a simple definition with a shorthand empty string
value.  The string values for definitions may be formed according to any of
several rules.
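
As a concrete sketch of the three forms, with every name invented for
illustration:

@example
server = @{ host = localhost; port = "8080"; @};

color[2] = "blue";

quiet;
@end example

@noindent
Here @code{server} is a compound definition, @code{color} is a simple
definition stored at index 2, and @code{quiet} is a simple definition
with an empty string value.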

@menu
* def-list::                 Definition List
* double-quote-string::      Double Quote String
* single-quote-string::      Single Quote String
* simple-string::            An Unquoted String
* shell-generated::          Shell Output String
* scheme-generated::         Scheme Result String
* here-string::              A Here String
* concat-string::            Concatenated Strings
@end menu

@cindex simple definitions
@cindex compound definitions

@node def-list
@subsection Definition List

@code{definition-list} is a list of definitions that may or may not
contain nested compound definitions.  Any such definitions may
@strong{only} be expanded within a @code{FOR} block iterating over the
containing compound definition.  @xref{FOR}.

Here are, again, the example definitions from the previous chapter,
with three additional name-value pairs: two with an empty value
assigned (@var{first} and @var{last}), and a "global" @var{group_name}.

@example
autogen definitions list;
group_name = example;
list = @{ list_element = alpha;  first;
         list_info    = "some alpha stuff"; @};
list = @{ list_info    = "more beta stuff";
         list_element = beta; @};
list = @{ list_element = omega;  last;
         list_info    = "final omega stuff"; @};
@end example
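
A template fragment along the following lines could make use of the
added names.  It is only a sketch: the output wording is invented, and
it assumes that @code{group_name}, being defined outside of @code{list},
is still visible inside the @code{FOR} block:

@example
[+ FOR list +][+ group_name +]: [+ list_element +][+
  IF (exist? "first") +] (first entry)[+ ENDIF +]
[+ ENDFOR list +]
@end example

@noindent
Only the @code{alpha} entry defines @code{first}, so only its line should
carry the extra text.  Outside of the @code{FOR} block,
@code{list_element} and @code{first} would not be available.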

@node double-quote-string
@subsection Double Quote String

@cindex string, double quote
The string follows the C-style escaping, using the backslash to quote
(escape) the following character(s).  Certain letters are translated to
various control codes (e.g. @code{\n}, @code{\f}, @code{\t}, etc.).
@code{x} introduces a two character hex code.  @code{0} (the digit zero)
introduces a one to three character octal code (note: an octal byte followed
by a digit must be represented with three octal digits, thus: @code{"\0001"}
yielding a NUL byte followed by the ASCII digit 1).  Any other character
following the backslash escape is simply inserted, without error, into the
string being formed.

Like ANSI "C", a series of these strings, possibly intermixed with
single quote strings, will be concatenated together.
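
For illustration, a sketch with invented names that uses a few of these
escapes and the concatenation rule:

@example
/* a tab after the colon, a newline at the end */
banner = "Usage:\t"
         'demo [options]'
         "\n";

/* two character hex code for the letter A */
letter-a = "\x41";
@end example

@noindent
The three pieces of @code{banner} are joined into a single value.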

@node single-quote-string
@subsection Single Quote String

@cindex string, single quote
This is similar to the shell single-quote string.  However, the backslash
escape (@code{\}) is honored before another backslash, a single quote
(@code{'}) or a hash character (@code{#}).  The last of these is done
specifically to disambiguate lines starting with a hash character inside
of a quoted string.  In other words,

@example
fumble = '
#endif
';
@end example

could be misinterpreted by the definitions scanner, whereas
this would not:

@example
fumble = '
\#endif
';
@end example

@*
As with the double quote string, a series of these, even intermixed
with double quote strings, will be concatenated together.
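
A short sketch of these rules, using invented names:

@example
/* the quote is escaped; the backslash before n is kept as is */
note  = 'don\'t expand \n here';

/* pieces of both kinds joined into one value */
mixed = 'raw \t text, ' "then a real tab:\t.";
@end example

@noindent
Under the rules above, @code{note} should contain an apostrophe and the
two characters @code{\n} unchanged, since a backslash before @code{n} is
not one of the honored escapes.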

@node simple-string
@subsection An Unquoted String

A simple string that does not contain white space @i{may} be left
unquoted.  The string must not contain any of the characters special to
the definition text (i.e., @code{"}, @code{#}, @code{'}, @code{(},
@code{)}, @code{,}, @code{;}, @code{<}, @code{=}, @code{>}, @code{[},
@code{]}, @code{`}, @code{@{}, or @code{@}}).  This list is subject to
change, but it will never contain underscore (@code{_}), period
(@code{.}), slash (@code{/}), colon (@code{:}), hyphen (@code{-}) or
backslash (@code{\}).  Basically, if the string looks like a
normal DOS or UNIX file or variable name, and it is not one of the two
keywords (@samp{autogen} or @samp{definitions}), then it is OK to leave it
unquoted; otherwise you should quote it.
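
By way of illustration, with invented names and values, the first two
definitions below look like file or variable names and so may be left
unquoted, while the last two contain white space or a special character
and need quotes:

@example
out-dir = gen/src;
tag     = v1.2-beta;
title   = "two words";
expr    = "a=b";
@end example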

@node shell-generated
@subsection Shell Output String
@cindex shell-generated string

@cindex string, shell output
This is assembled according to the same rules as the double quote string,
except that there is no concatenation of strings and the resulting string is
written to a shell server process.  The definition takes on the value of
the output string.

NB@: The text is interpreted by a server shell.  There may be left over
state from previous server shell processing.  This scriptlet may also leave
state for subsequent processing.  However, a @code{cd} to the original
directory is always issued before the new command is issued.
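
This section does not show the delimiter.  The sketch below assumes that
the string is enclosed in back quotes (@code{`}), which appear in the
list of characters special to the definition text.  The name
@code{build-host} is invented for illustration:

@example
build-host = `uname -n`;
@end example

@noindent
The command is handed to the server shell and its output becomes the
value of @code{build-host}.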

@node scheme-generated
@subsection Scheme Result String

A scheme result string must begin with an open parenthesis @code{(}.
The scheme expression will be evaluated by Guile and the
value will be the result.  The AutoGen expression functions
are @strong{dis}abled at this stage, so do not use them.
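
A minimal sketch, using invented names and only standard Guile string
procedures rather than any AutoGen expression function:

@example
upper-name = (string-upcase "alpha");
lib-file   = (string-append "lib" "list" ".a");
@end example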

@node here-string
@subsection A Here String
@cindex here-string

A @samp{here string} is formed in much the same way as a shell here doc.  It
is denoted with two less than characters (@code{<<}) and, optionally, a hyphen.
This is followed by optional horizontal white space and an ending
marker-identifier.  This marker must follow the syntax rules for identifiers.
Unlike the shell version, however, you must not quote this marker.

The resulting string will start with the first character on the next line and
continue up to but not including the newline that precedes the line that
begins with the marker token.  The characters are copied directly into the
result string.  Mostly.

If a hyphen follows the less than characters, then leading tabs will be
stripped and the terminating marker will be recognized even if preceded by
tabs.  Also, if the first character on the line (after removing tabs) is a
backslash and the next character a tab, then the backslash will be removed as
well.  No other kind of processing is done on this string.

Here are two examples:
@example
str1 = <<-  STR_END
        $quotes = " ' `
        STR_END;

str2 = <<   STR_END
        $quotes = " ' `
        STR_END;
STR_END;
@end example
The first string contains no new line characters.
The first character is the dollar sign, the last the back quote.

The second string contains one new line character.  The first character
is the tab character preceding the dollar sign.  The last character is
the semicolon after the @code{STR_END}.  That @code{STR_END} does not
end the string because it is not at the beginning of the line.  In the
first example the leading tab was stripped, so the marker was recognized.

@node concat-string
@subsection Concatenated Strings
@cindex concat-string

If single or double quote characters are used,
then you also have the option, a la ANSI-C syntax,
of implicitly concatenating a series of them together,
with intervening white space ignored.

NB@:  You @strong{cannot} use directives to alter the string
content.  That is,

@example
str = "fumble"
#ifdef LATER
      "stumble"
#endif
      ;
@end example

@noindent
will result in a syntax error.  The preprocessing directives are not
carried out by the C preprocessor.  However,

@example
str = '"fumble\n"
#ifdef LATER
"     stumble\n"
#endif
';
@end example

@noindent
@strong{will} work.  It will enclose the @samp{#ifdef LATER}
and @samp{#endif} in the string.  But it may also wreak
havoc with the definition processing directives.  The hash
characters in the first column should be disambiguated with
an escape (@code{\}), or the lines should be joined with the previous ones:
@code{"fumble\n#ifdef LATER...}.
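
Applying the escape rule from the single quote string section, a safer
form of the same definition might be written as:

@example
str = '"fumble\n"
\#ifdef LATER
"     stumble\n"
\#endif
';
@end example

@noindent
The backslashes keep the scanner from treating the two lines as
directives.  The value itself should still contain @samp{#ifdef LATER}
and @samp{#endif}.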

@c === SECTION MARKER

@node Index Assignments
@section Assigning an Index to a Definition
@cindex Definition Index

In AutoGen, every name is implicitly an array of values.
When assigning values, they are usually implicitly
assigned to the next highest slot.  They can also be
specified explicitly:

@example
mumble[9] = stumble;
mumble[0] = grumble;
@end example

@noindent
If, subsequently, you assign a value to @code{mumble} without an
index, its index will be @code{10}, not @code{1}.
If indexes are specified, they must not cause conflicts.
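
To illustrate, with @code{bumble} as an invented value:

@example
mumble[9] = stumble;
mumble[0] = grumble;
mumble    = bumble;
@end example

@noindent
The third assignment has no index, so by the rule above it lands in
slot 10.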

@code{#define}-d names may also be used for index values.
This is equivalent to the above:

@example
#define FIRST 0
#define LAST  9
mumble[LAST]  = stumble;
mumble[FIRST] = grumble;
@end example

All values in a range do @strong{not} have to be filled in.
If you leave gaps, then you will have a sparse array.  This
is fine (@pxref{FOR}).  You have your choice of iterating
over all the defined values, or iterating over a range
of slots.  This:

@example
[+ FOR mumble +][+ ENDFOR +]
@end example

@noindent
iterates over all and only the defined entries, whereas this:

@example
[+ FOR mumble (for-by 1) +][+ ENDFOR +]
@end example

@noindent
will iterate over all 10 "slots".  Your template will
likely have to contain something like this:

@example
[+ IF (exist? (sprintf "mumble[%d]" (for-index))) +]
@end example

@noindent
or else "mumble" will have to be a compound value that,
say, always contains a "grumble" value:

@example
[+ IF (exist? "grumble") +]
@end example
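
Putting these pieces together, a loop over every slot of the simple
@code{mumble} values defined above might be sketched as follows.  The
output wording is invented, and the sketch assumes that the name
@code{mumble} yields the current entry inside the loop:

@example
[+ FOR mumble (for-by 1) +][+
  IF (exist? (sprintf "mumble[%d]" (for-index))) +]
slot [+ (for-index) +] holds [+ mumble +][+
  ENDIF +][+
ENDFOR +]
@end example

@noindent
With the two definitions of @code{mumble} given earlier, text should be
produced only for slots 0 and 9.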

@c === SECTION MARKER

@node Dynamic Text
@section Dynamic Text
@cindex Dynamic Definition Text

There are several methods for including dynamic content inside a definitions
file.  Two of them are mentioned above (@ref{shell-generated} and
@pxref{scheme-generated}) in the discussion of string formation rules.
Another method uses the @code{#shell} processing directive.
It will be discussed in the next section (@pxref{Directives}).
Guile/Scheme may also be used to create definitions.

When the Scheme expression is preceded by a backslash and single
quote, then the expression is expected to be an alist of
names and values that will be used to create AutoGen definitions.

@noindent
This method can be used as follows:

@example
\'( (name  (value-expression))
    (name2 (another-expr))  )
@end example

@noindent
This is entirely equivalent to:

@example
name  = (value-expression);
name2 = (another-expr);
@end example

@noindent
Under the covers, the expression gets handed off to a Guile function
named @code{alist->autogen-def} in an expression that looks like this:

@example
(alist->autogen-def
    ( (name (value-expression))  (name2 (another-expr)) ) )
@end example