diff options
Diffstat (limited to 'doc/autogen-intro.texi')
-rw-r--r-- | doc/autogen-intro.texi | 788 |
1 files changed, 788 insertions, 0 deletions
diff --git a/doc/autogen-intro.texi b/doc/autogen-intro.texi new file mode 100644 index 0000000..24e47d0 --- /dev/null +++ b/doc/autogen-intro.texi @@ -0,0 +1,788 @@ + +@menu +* Introduction:: AutoGen's Purpose +* Definitions File:: AutoGen Definitions File +* Template File:: AutoGen Template +* Augmenting AutoGen:: Augmenting AutoGen Features +* autogen Invocation:: Invoking AutoGen +* Installation:: Configuring and Installing +* AutoOpts:: Automated Option Processing +* Add-Ons:: Add-on packages for AutoGen +* Future:: Some ideas for the future. +* Copying This Manual:: Copying This Manual +* Concept Index:: General index +* Function Index:: Function index +@end menu + +@ignore +* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * +@end ignore +@page +@node Introduction +@chapter Introduction +@cindex Introduction + +AutoGen is a tool designed for generating program files that contain +repetitive text with varied substitutions. Its goal is to simplify the +maintenance of programs that contain large amounts of repetitious text. +This is especially valuable if there are several blocks of such text +that must be kept synchronized in parallel tables. + +An obvious example is the problem of maintaining the code required for +processing program options and configuration settings. Processing options +requires a minimum of four different constructs be kept in proper order in +different places in your program. You need at least: + +@enumerate +@item +The flag character in the flag string, +@item +code to process the flag when it is encountered, +@item +a global state variable or two, and +@item +a line in the usage text. +@end enumerate + +@noindent +You will need more things besides this if you choose to implement long option +names, configuration (rc/ini) file processing, environment variable settings +and keep all the documentation for these up to date. This can be done +mechanically; with the proper templates and this program. In fact, it has +already been done and AutoGen itself uses it@: @xref{AutoOpts}. For a simple +example of Automated Option processing, @xref{Quick Start}. For a full list +of the Automated Option features, @xref{Features}. Be forewarned, though, the +feature list is ridiculously extensive. + +@menu +* Generalities:: The Purpose of AutoGen +* Example Usage:: A Simple Example +* csh/zsh caveat:: csh/zsh caveat +* Testimonial:: A User's Perspective +@end menu + +@c === SECTION MARKER + +@node Generalities +@section The Purpose of AutoGen + +The idea of this program is to have a text file, a template if +you will, that contains the general text of the desired output file. +That file includes substitution expressions and sections of text that are +replicated under the control of separate definition files. + +@cindex design goals + +AutoGen was designed with the following features: + +@enumerate +@item +The definitions are completely separate from the template. By completely +isolating the definitions from the template it greatly increases the +flexibility of the template implementation. A secondary goal is that a +template user only needs to specify those data that are necessary to describe +his application of a template. + +@item +Each datum in the definitions is named. Thus, the definitions can be +rearranged, augmented and become obsolete without it being necessary to +go back and clean up older definition files. Reduce incompatibilities! + +@item +Every definition name defines an array of values, even when there is +only one entry. These arrays of values are used to control the +replication of sections of the template. + +@item +There are named collections of definitions. They form a nested hierarchy. +Associated values are collected and associated with a group name. +These associated data are used collectively in sets of substitutions. + +@item +The template has special markers to indicate where substitutions are +required, much like the @code{$@{VAR@}} construct in a shell @code{here doc}. +These markers are not fixed strings. They are specified at the start of +each template. Template designers know best what fits into their +syntax and can avoid marker conflicts. + +We did this because it is burdensome and difficult to avoid conflicts +using either M4 tokenization or C preprocessor substitution rules. It +also makes it easier to specify expressions that transform the value. +Of course, our expressions are less cryptic than the shell methods. + +@item +These same markers are used, in conjunction with enclosed keywords, to +indicate sections of text that are to be skipped and for sections of +text that are to be repeated. This is a major improvement over using C +preprocessing macros. With the C preprocessor, you have no way of +selecting output text because it is an @i{un}varying, mechanical +substitution process. + +@item +Finally, we supply methods for carefully controlling the output. +Sometimes, it is just simply easier and clearer to compute some text or +a value in one context when its application needs to be later. So, +functions are available for saving text or values for later use. +@end enumerate + +@c === SECTION MARKER + +@node Example Usage +@section A Simple Example +@cindex example, simple AutoGen + +This is just one simple example that shows a few basic features. +If you are interested, you also may run "make check" with the +@code{VERBOSE} environment variable set and see a number of other +examples in the @file{agen5/test/testdir} directory. + +Assume you have an enumeration of names and you wish to associate some +string with each name. Assume also, for the sake of this example, +that it is either too complex or too large to maintain easily by hand. +We will start by writing an abbreviated version of what the result +is supposed to be. We will use that to construct our output templates. + +@noindent +In a header file, @file{list.h}, you define the enumeration +and the global array containing the associated strings: + +@example +typedef enum @{ + IDX_ALPHA, + IDX_BETA, + IDX_OMEGA @} list_enum; + +extern char const* az_name_list[ 3 ]; +@end example + +@noindent +Then you also have @file{list.c} that defines the actual strings: + +@example +#include "list.h" +char const* az_name_list[] = @{ + "some alpha stuff", + "more beta stuff", + "final omega stuff" @}; +@end example + +@noindent +First, we will define the information that is unique for each enumeration +name/string pair. This would be placed in a file named, @file{list.def}, +for example. + +@example +autogen definitions list; +list = @{ list_element = alpha; + list_info = "some alpha stuff"; @}; +list = @{ list_info = "more beta stuff"; + list_element = beta; @}; +list = @{ list_element = omega; + list_info = "final omega stuff"; @}; +@end example + +The @code{autogen definitions list;} entry defines the file as an AutoGen +definition file that uses a template named @code{list}. That is followed by +three @code{list} entries that define the associations between the +enumeration names and the strings. The order of the differently named +elements inside of list is unimportant. They are reversed inside of the +@code{beta} entry and the output is unaffected. + +Now, to actually create the output, we need a template or two that can be +expanded into the files you want. In this program, we use a single template +that is capable of multiple output files. The definitions above refer to a +@file{list} template, so it would normally be named, @file{list.tpl}. + +It looks something like this. +(For a full description, @xref{Template File}.) + +@example +[+ AutoGen5 template h c +] +[+ CASE (suffix) +][+ + == h +] +typedef enum @{[+ + FOR list "," +] + IDX_[+ (string-upcase! (get "list_element")) +][+ + ENDFOR list +] @} list_enum; + +extern char const* az_name_list[ [+ (count "list") +] ]; +[+ + + == c +] +#include "list.h" +char const* az_name_list[] = @{[+ + FOR list "," +] + "[+list_info+]"[+ + ENDFOR list +] @};[+ + +ESAC +] +@end example + +The @code{[+ AutoGen5 template h c +]} text tells AutoGen that this is +an AutoGen version 5 template file; that it is to be processed twice; +that the start macro marker is @code{[+}; and the end marker is +@code{+]}. The template will be processed first with a suffix value of +@code{h} and then with @code{c}. Normally, the suffix values are +appended to the @file{base-name} to create the output file name. + +The @code{[+ == h +]} and @code{[+ == c +]} @code{CASE} selection clauses +select different text for the two different passes. In this example, +the output is nearly disjoint and could have been put in two separate +templates. However, sometimes there are common sections and this is +just an example. + +The @code{[+FOR list "," +]} and @code{[+ ENDFOR list +]} clauses delimit +a block of text that will be repeated for every definition of @code{list}. +Inside of that block, the definition name-value pairs that +are members of each @code{list} are available for substitutions. + +The remainder of the macros are expressions. Some of these contain +special expression functions that are dependent on AutoGen named values; +others are simply Scheme expressions, the result of which will be +inserted into the output text. Other expressions are names of AutoGen +values. These values will be inserted into the output text. For example, +@code{[+list_info+]} will result in the value associated with +the name @code{list_info} being inserted between the double quotes and +@code{(string-upcase! (get "list_element"))} will first "get" the value +associated with the name @code{list_element}, then change the case of +all the letters to upper case. The result will be inserted into the +output document. + +If you have compiled AutoGen, you can copy out the template and definitions +as described above and run @code{autogen list.def}. This will produce +exactly the hypothesized desired output. + +One more point, too. Lets say you decided it was too much trouble to figure +out how to use AutoGen, so you created this enumeration and string list with +thousands of entries. Now, requirements have changed and it has become +necessary to map a string containing the enumeration name into the enumeration +number. With AutoGen, you just alter the template to emit the table of names. +It will be guaranteed to be in the correct order, missing none of the entries. +If you want to do that by hand, well, good luck. + +@c === SECTION MARKER + +@node csh/zsh caveat +@section csh/zsh caveat + +AutoGen tries to use your normal shell so that you can supply shell code +in a manner you are accustomed to using. If, however, you use csh or +zsh, you cannot do this. Csh is sufficiently difficult to program that +it is unsupported. Zsh, though largely programmable, also has some +anomalies that make it incompatible with AutoGen usage. Therefore, when +invoking AutoGen from these environments, you must be certain to set the +SHELL environment variable to a Bourne-derived shell, e.g., sh, ksh or +bash. + +Any shell you choose for your own scripts need to follow these basic +requirements: + +@enumerate +@item +It handles @code{trap ":" $sig} without output to standard out. This is done +when the server shell is first started. If your shell does not handle this, +then it may be able to by loading functions from its start up files. +@item +At the beginning of each scriptlet, the command @code{\\cd $PWD} +is inserted. This ensures that @code{cd} is not aliased to something +peculiar and each scriptlet starts life in the execution directory. +@item +At the end of each scriptlet, the command @code{echo mumble} is +appended. The program you use as a shell must emit the single +argument @code{mumble} on a line by itself. +@end enumerate + +@c === SECTION MARKER + +@node Testimonial +@section A User's Perspective + +@format +Alexandre wrote: +> +> I'd appreciate opinions from others about advantages/disadvantages of +> each of these macro packages. +@end format + +I am using AutoGen in my pet project, and find one of its best points to +be that it separates the operational data from the implementation. + +Indulge me for a few paragraphs, and all will be revealed: +In the manual, Bruce cites the example of maintaining command line flags +inside the source code; traditionally spreading usage information, flag +names, letters and processing across several functions (if not files). +Investing the time in writing a sort of boiler plate (a template in +AutoGen terminology) pays by moving all of the option details (usage, +flags names etc.) into a well structured table (a definition file if you +will), so that adding a new command line option becomes a simple matter +of adding a set of details to the table. + +So far so good! Of course, now that there is a template, writing all of +that tedious optargs processing and usage functions is no longer an +issue. Creating a table of the options needed for the new project and +running AutoGen generates all of the option processing code in C +automatically from just the tabular data. AutoGen in fact already ships +with such a template... AutoOpts. + +One final consequence of the good separation in the design of AutoGen is +that it is retargetable to a greater extent. The +egcs/gcc/fixinc/inclhack.def can equally be used (with different +templates) to create a shell script (inclhack.sh) or a c program +(fixincl.c). + +This is just the tip of the iceberg. AutoGen is far more powerful than +these examples might indicate, and has many other varied uses. I am +certain Bruce or I could supply you with many and varied examples, and I +would heartily recommend that you try it for your project and see for +yourself how it compares to m4. +@cindex m4 + +As an aside, I would be interested to see whether someone might be +persuaded to rationalise autoconf with AutoGen in place of m4... Ben, +are you listening? autoconf-3.0! `kay? =)O| + +@format +Sincerely, + Gary V. Vaughan +@end format + +@ignore +* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * +@end ignore +@page +@node Definitions File +@chapter Definitions File +@cindex definitions file +@cindex .def file + +This chapter describes the syntax and semantics of the AutoGen +definition file. In order to instantiate a template, you normally must +provide a definitions file that identifies itself and contains some +value definitions. Consequently, we keep it very simple. For +"advanced" users, there are preprocessing directives, sparse +arrays, named indexes and comments that may be used as well. + +The definitions file is used to associate values with names. Every +value is implicitly an array of values, even if there is only one value. +Values may be either simple strings or compound collections of +name-value pairs. An array may not contain both simple and compound +members. Fundamentally, it is as simple as: + +@example +prog-name = "autogen"; +flag = @{ + name = templ_dirs; + value = L; + descrip = "Template search directory list"; +@}; +@end example + +For purposes of commenting and controlling the processing of the +definitions, C-style comments and most C preprocessing directives are +honored. The major exception is that the @code{#if} directive is +ignored, along with all following text through the matching +@code{#endif} directive. The C preprocessor is not actually invoked, so +C macro substitution is @strong{not} performed. + +@menu +* Identification:: The Identification Definition +* Definitions:: Named Definitions +* Index Assignments:: Assigning an Index to a Definition +* Dynamic Text:: Dynamic Text +* Directives:: Controlling What Gets Processed +* Predefines:: Pre-defined Names +* Comments:: Commenting Your Definitions +* Example:: What it all looks like. +* Full Syntax:: Finite State Machine Grammar +* Alternate Definition:: Alternate Definition Forms +@end menu + +@c === SECTION MARKER + +@node Identification +@section The Identification Definition +@cindex identification + +The first definition in this file is used to identify it as a +AutoGen file. It consists of the two keywords, +@samp{autogen} and @samp{definitions} followed by the default +template name and a terminating semi-colon (@code{;}). That is: + +@example + AutoGen Definitions @var{template-name}; +@end example + +@noindent +Note that, other than the name @var{template-name}, the words +@samp{AutoGen} and @samp{Definitions} are searched for without case +sensitivity. Most lookups in this program are case insensitive. + +@noindent +Also, if the input contains more identification definitions, +they will be ignored. This is done so that you may include +(@pxref{Directives}) other definition files without an identification +conflict. + +@cindex template file + +@noindent +AutoGen uses the name of the template to find the corresponding template +file. It searches for the file in the following way, stopping when +it finds the file: + +@enumerate +@item +It tries to open @file{./@var{template-name}}. If it fails, +@item +it tries @file{./@var{template-name}.tpl}. +@item +It searches for either of these files in the directories listed in the +templ-dirs command line option. +@end enumerate + +If AutoGen fails to find the template file in one of these places, +it prints an error message and exits. + +@c === SECTION MARKER + +@node Definitions +@section Named Definitions +@cindex definitions + +A name is a sequence of characters beginning with an alphabetic character +(@code{a} through @code{z}) followed by zero or more alpha-numeric +characters and/or separator characters: hyphen (@code{-}), underscore +(@code{_}) or carat (@code{^}). Names are case insensitive. + +Any name may have multiple values associated with it. Every name may be +considered a sparse array of one or more elements. If there is more than +one value, the values my be accessed by indexing the value with +@code{[index]} or by iterating over them using the FOR (@pxref{FOR}) AutoGen +macro on it, as described in the next chapter. Sparse arrays are specified +by specifying an index when defining an entry +(@pxref{Index Assignments,, Assigning an Index to a Definition}). + +There are two kinds of definitions, @samp{simple} and @samp{compound}. +They are defined thus (@pxref{Full Syntax}): + +@example +compound_name '=' '@{' definition-list '@}' ';' + +simple-name[2] '=' string ';' + +no^text^name ';' +@end example + +@noindent +@code{simple-name} has the third index (index number 2) defined here. +@code{No^text^name} is a simple definition with a shorthand empty string +value. The string values for definitions may be specified in any of +several formation rules. + +@menu +* def-list:: Definition List +* double-quote-string:: Double Quote String +* single-quote-string:: Single Quote String +* simple-string:: An Unquoted String +* shell-generated:: Shell Output String +* scheme-generated:: Scheme Result String +* here-string:: A Here String +* concat-string:: Concatenated Strings +@end menu + +@cindex simple definitions +@cindex compound definitions + +@node def-list +@subsection Definition List + +@code{definition-list} is a list of definitions that may or may not +contain nested compound definitions. Any such definitions may +@strong{only} be expanded within a @code{FOR} block iterating over the +containing compound definition. @xref{FOR}. + +Here is, again, the example definitions from the previous chapter, +with three additional name value pairs. Two with an empty value +assigned (@var{first} and @var{last}), and a "global" @var{group_name}. + +@example +autogen definitions list; +group_name = example; +list = @{ list_element = alpha; first; + list_info = "some alpha stuff"; @}; +list = @{ list_info = "more beta stuff"; + list_element = beta; @}; +list = @{ list_element = omega; last; + list_info = "final omega stuff"; @}; +@end example + +@node double-quote-string +@subsection Double Quote String + +@cindex string, double quote +The string follows the C-style escaping, using the backslash to quote +(escape) the following character(s). Certain letters are translated to +various control codes (e.g. @code{\n}, @code{\f}, @code{\t}, etc.). +@code{x} introduces a two character hex code. @code{0} (the digit zero) +introduces a one to three character octal code (note: an octal byte followed +by a digit must be represented with three octal digits, thus: @code{"\0001"} +yielding a NUL byte followed by the ASCII digit 1). Any other character +following the backslash escape is simply inserted, without error, into the +string being formed. + +Like ANSI "C", a series of these strings, possibly intermixed with +single quote strings, will be concatenated together. + +@node single-quote-string +@subsection Single Quote String + +@cindex string, single quote +This is similar to the shell single-quote string. However, escapes +@code{\} are honored before another escape, single quotes @code{'} +and hash characters @code{#}. This latter is done specifically +to disambiguate lines starting with a hash character inside +of a quoted string. In other words, + +@example +fumble = ' +#endif +'; +@end example + +could be misinterpreted by the definitions scanner, whereas +this would not: + +@example +fumble = ' +\#endif +'; +@end example + +@* +As with the double quote string, a series of these, even intermixed +with double quote strings, will be concatenated together. + +@node simple-string +@subsection An Unquoted String + +A simple string that does not contain white space @i{may} be left +unquoted. The string must not contain any of the characters special to +the definition text (i.e., @code{"}, @code{#}, @code{'}, @code{(}, +@code{)}, @code{,}, @code{;}, @code{<}, @code{=}, @code{>}, @code{[}, +@code{]}, @code{`}, @code{@{}, or @code{@}}). This list is subject to +change, but it will never contain underscore (@code{_}), period +(@code{.}), slash (@code{/}), colon (@code{:}), hyphen (@code{-}) or +backslash (@code{\\}). Basically, if the string looks like it is a +normal DOS or UNIX file or variable name, and it is not one of two +keywords (@samp{autogen} or @samp{definitions}) then it is OK to not +quote it, otherwise you should. + +@node shell-generated +@subsection Shell Output String +@cindex shell-generated string + +@cindex string, shell output +This is assembled according to the same rules as the double quote string, +except that there is no concatenation of strings and the resulting string is +written to a shell server process. The definition takes on the value of +the output string. + +NB@: The text is interpreted by a server shell. There may be left over +state from previous server shell processing. This scriptlet may also leave +state for subsequent processing. However, a @code{cd} to the original +directory is always issued before the new command is issued. + +@node scheme-generated +@subsection Scheme Result String + +A scheme result string must begin with an open parenthesis @code{(}. +The scheme expression will be evaluated by Guile and the +value will be the result. The AutoGen expression functions +are @strong{dis}abled at this stage, so do not use them. + +@node here-string +@subsection A Here String +@cindex here-string + +A @samp{here string} is formed in much the same way as a shell here doc. It +is denoted with two less than characters(@code{<<}) and, optionally, a hyphen. +This is followed by optional horizontal white space and an ending +marker-identifier. This marker must follow the syntax rules for identifiers. +Unlike the shell version, however, you must not quote this marker. + +The resulting string will start with the first character on the next line and +continue up to but not including the newline that precedes the line that +begins with the marker token. The characters are copied directly into the +result string. Mostly. + +If a hyphen follows the less than characters, then leading tabs will be +stripped and the terminating marker will be recognized even if preceded by +tabs. Also, if the first character on the line (after removing tabs) is a +backslash and the next character a tab, then the backslash will be removed as +well. No other kind of processing is done on this string. + +Here are two examples: +@example +str1 = <<- STR_END + $quotes = " ' ` + STR_END; + +str2 = << STR_END + $quotes = " ' ` + STR_END; +STR_END; +@end example +The first string contains no new line characters. +The first character is the dollar sign, the last the back quote. + +The second string contains one new line character. The first character +is the tab character preceding the dollar sign. The last character is +the semicolon after the @code{STR_END}. That @code{STR_END} does not +end the string because it is not at the beginning of the line. In the +preceding case, the leading tab was stripped. + +@node concat-string +@subsection Concatenated Strings +@cindex concat-string + +If single or double quote characters are used, +then you also have the option, a la ANSI-C syntax, +of implicitly concatenating a series of them together, +with intervening white space ignored. + +NB@: You @strong{cannot} use directives to alter the string +content. That is, + +@example +str = "fumble" +#ifdef LATER + "stumble" +#endif + ; +@end example + +@noindent +will result in a syntax error. The preprocessing directives are not +carried out by the C preprocessor. However, + +@example +str = '"fumble\n" +#ifdef LATER +" stumble\n" +#endif +'; +@end example + +@noindent +@strong{Will} work. It will enclose the @samp{#ifdef LATER} +and @samp{#endif} in the string. But it may also wreak +havoc with the definition processing directives. The hash +characters in the first column should be disambiguated with +an escape @code{\} or join them with previous lines: +@code{"fumble\n#ifdef LATER...}. + +@c === SECTION MARKER + +@node Index Assignments +@section Assigning an Index to a Definition +@cindex Definition Index + +In AutoGen, every name is implicitly an array of values. +When assigning values, they are usually implicitly +assigned to the next highest slot. They can also be +specified explicitly: + +@example +mumble[9] = stumble; +mumble[0] = grumble; +@end example + +@noindent +If, subsequently, you assign a value to @code{mumble} without an +index, its index will be @code{10}, not @code{1}. +If indexes are specified, they must not cause conflicts. + +@code{#define}-d names may also be used for index values. +This is equivalent to the above: + +@example +#define FIRST 0 +#define LAST 9 +mumble[LAST] = stumble; +mumble[FIRST] = grumble; +@end example + +All values in a range do @strong{not} have to be filled in. +If you leave gaps, then you will have a sparse array. This +is fine (@pxref{FOR}). You have your choice of iterating +over all the defined values, or iterating over a range +of slots. This: + +@example +[+ FOR mumble +][+ ENDFOR +] +@end example + +@noindent +iterates over all and only the defined entries, whereas this: + +@example +[+ FOR mumble (for-by 1) +][+ ENDFOR +] +@end example + +@noindent +will iterate over all 10 "slots". Your template will +likely have to contain something like this: + +@example +[+ IF (exist? (sprintf "mumble[%d]" (for-index))) +] +@end example + +@noindent +or else "mumble" will have to be a compound value that, +say, always contains a "grumble" value: + +@example +[+ IF (exist? "grumble") +] +@end example + +@c === SECTION MARKER + +@node Dynamic Text +@section Dynamic Text +@cindex Dynamic Definition Text + +There are several methods for including dynamic content inside a definitions +file. Three of them are mentioned above (@ref{shell-generated} and +@pxref{scheme-generated}) in the discussion of string formation rules. +Another method uses the @code{#shell} processing directive. +It will be discussed in the next section (@pxref{Directives}). +Guile/Scheme may also be used to yield to create definitions. + +When the Scheme expression is preceded by a backslash and single +quote, then the expression is expected to be an alist of +names and values that will be used to create AutoGen definitions. + +@noindent +This method can be be used as follows: + +@example +\'( (name (value-expression)) + (name2 (another-expr)) ) +@end example + +@noindent +This is entirely equivalent to: + +@example +name = (value-expression); +name2 = (another-expr); +@end example + +@noindent +Under the covers, the expression gets handed off to a Guile function +named @code{alist->autogen-def} in an expression that looks like this: + +@example +(alist->autogen-def + ( (name (value-expression)) (name2 (another-expr)) ) ) +@end example |